Mid Sweden University

Big Data analytics for the forest industry: A proof-of-concept built on cloud technologies
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
2016 (English) Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

Large amounts of data in various forms are generated at a fast pace in today's society. This is commonly referred to as “Big Data”. Making use of Big Data has become increasingly important in both business and research. The forest industry generates large amounts of data during the different processes of forest harvesting. In Sweden, forest information is sent to SDC, the information hub for the Swedish forest industry. In 2014, SDC received reports on 75.5 million m³fub from harvester and forwarder machines. These machines use a global standard called StanForD 2010 for communication and to create reports about harvested stems. The arrival of scalable cloud technologies that combine Big Data with machine learning makes it interesting to develop an application to analyze the large amounts of data produced by the forest industry. In this study, a proof-of-concept has been implemented to analyze harvest production reports (HPR) from the StanForD 2010 standard. The system consists of a back-end and a front-end application and is built using cloud technologies such as Apache Spark and Hadoop. System tests have shown that the concept is able to successfully handle storage, processing and machine learning on gigabytes of HPR files. It is capable of extracting information from raw HPR data into datasets and supports a machine learning pipeline with pre-processing and K-Means clustering. The proof-of-concept has provided a code base for further development of a system that could be used to find valuable knowledge for the forest industry.

Place, publisher, year, edition, pages
2016. p. 70
Keywords [en]
Big Data analytics, Apache Spark, StanForD 2010, forest industry, harvest production report
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:miun:diva-28541
Local ID: DT-V16-A2-005
OAI: oai:DiVA.org:miun-28541
DiVA, id: diva2:952906
Educational program
International Master's Programme in Computer Engineering TDAAA 120 higher education credits
Available from: 2016-08-16 Created: 2016-08-16 Last updated: 2018-01-10 Bibliographically approved

Open Access in DiVA

fulltext (2459 kB) 836 downloads
File information
File name FULLTEXT01.pdf File size 2459 kB Checksum SHA-512
09495c7e391c9f511ea2eba162f6bb24c349d2b77480f9eb6e2579085862977e82b85d3ecf1afca7afedaf74aae8fe1b39397c9b7137b9a5d136860a65dcee84
Type fulltext Mimetype application/pdf

Search in DiVA

By author/editor
Sellén, David