Mid Sweden University

Optimizing ESRGAN for Mobile Deployment: Enhancing Image Super-Resolution on Android Devices
Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
2024 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [sv]

Rapporten presenterar det arbete som utfördes för en kandidatuppsats i ämnesområdet datavetenskap. Den ursprungliga uppgiften var att undersöka hur djupinlärningsarkitekturen ESRGAN, som används för superupplösning, kan komprimeras så att minimal precision förloras. Projektet resulterade i utvärderingen av tre optimeringsmetoder; dynamic range, full integer och float16-kvantisering. Mätningarna utfördes med hjälp av två mobila enheter; en Samsung Galaxy S9+ surfplatta och en S10+ Android-telefon. Mätningarna genomfördes med hjälp av mätvärdena inferenstid, PSNR, SSIM och kompressionsförhållande. Resultaten visade att Dynamic Range hade en avsevärt långsammare inferenstid jämfört med Full Integer och Float16-kvantisering. Dynamic Range hade ett validerings-PSNR på 27.0 och ett test-PSNR på 22.3. De resulterande SSIM-värdena var 0.81 för valideringsdatasetet och 0.67 för testdatasetet. Full Integer slutade med PSNR-värdena 26.3 och 21.9 för validering respektive test. När det gäller SSIM fick Full Integer poängen 0.77 (validering) och 0.64 (test). Slutligen genererade Float16 PSNR-värdena 27.1 och 22.3, samt SSIM-värdena 0.81 och 0.67. PSNR- och SSIM-utvärderingarna visade att de komprimerade modellerna behövde mer kalibrering för att uppnå högre poäng i dessa metoder, och således högre noggrannhet.

Abstract [en]

This report presents the work carried out for a bachelor's thesis in computer science. The original task was to investigate how ESRGAN, a deep learning architecture used for super-resolution, can be compressed with minimal loss of accuracy. The project resulted in the evaluation of three optimization methods: dynamic range, full integer, and float16 quantization. Dynamic range quantizes the weights of the neural network to 8 bits of precision, full integer quantizes all floating-point parameters, and float16 halves the floating-point precision. The benchmarks were performed on two mobile devices: a Samsung Galaxy S9+ tablet and an S10+ Android phone. The metrics were inference time, PSNR, SSIM, and compression ratio. The results showed that dynamic range had a significantly longer inference time than full integer and float16 quantization. Dynamic range achieved a validation PSNR of 27.0 and a testing PSNR of 22.3, with SSIM values of 0.81 on the validation dataset and 0.67 on the testing dataset. Full integer achieved PSNR scores of 26.3 (validation) and 21.9 (testing), and SSIM scores of 0.77 (validation) and 0.64 (testing). Finally, float16 achieved PSNR scores of 27.1 and 22.3 and SSIM scores of 0.81 and 0.67, for validation and testing respectively. The PSNR and SSIM evaluations showed that the compressed models needed more calibration to score higher on these metrics and, consequently, to reach a higher level of accuracy.
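To make the quantities named in the abstract concrete, the following is a minimal pure-Python sketch of 8-bit weight quantization, float16 rounding, PSNR, and compression ratio. It is an illustration only and not the thesis's toolchain or code: the symmetric per-tensor scaling, the definition of compression ratio as original size divided by compressed size, the 8-bit peak value of 255, and all function names are assumptions. SSIM is omitted for brevity.

```python
import math
import struct


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel sequences.

    peak=255.0 assumes 8-bit pixel values (an assumption of this sketch).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def quantize_int8(weights):
    """Symmetric per-tensor 8-bit quantization (assumed scheme).

    Returns integer codes in [-127, 127] and the scale needed to map them back.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def to_float16(weights):
    """Round each value to IEEE 754 half precision and back to a Python float."""
    return [struct.unpack("<e", struct.pack("<e", w))[0] for w in weights]


def compression_ratio(original_size, compressed_size):
    """Original size divided by compressed size (assumed definition)."""
    return original_size / compressed_size


if __name__ == "__main__":
    weights = [0.1, -0.73, 0.42, 0.005]  # hypothetical weights
    codes, scale = quantize_int8(weights)
    print("int8 codes:", codes)
    print("int8 round trip:", dequantize_int8(codes, scale))
    print("float16 round trip:", to_float16(weights))
    print("float32 -> int8 ratio:", compression_ratio(32, 8))
    print("float32 -> float16 ratio:", compression_ratio(32, 16))
```

Under these assumptions, storing a weight in 8 bits instead of 32 gives a per-weight ratio of 4, and float16 gives 2; a real model file also carries metadata and unquantized tensors, so measured ratios can differ.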

Place, publisher, year, edition, pages
2024. p. 53
Keywords [en]
ESRGAN
Keywords [sv]
ESRGAN
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:miun:diva-51626
Local ID: DT-V24-G3-040
OAI: oai:DiVA.org:miun-51626
DiVA, id: diva2:1875766
Subject / course
Computer Engineering DT1
Educational program
Computer Science TDATG 180 higher education credits
Supervisors
Examiners
Available from: 2024-06-24. Created: 2024-06-24. Last updated: 2024-06-24. Bibliographically approved.

Open Access in DiVA

fulltext (5823 kB), 211 downloads
File information
File name: FULLTEXT01.pdf
File size: 5823 kB
Checksum (SHA-512): 6e79f54a953e0421890c78f23de1d79175cf9eba40699fd7b3e9dcf631a861c131a6f69569027b4a8a1f4f447fca34f96a661feb640bf7d777bf454fcf615406
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Fredin, Arvid