Mid Sweden University

ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models
2021 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 9, pp. 3545-3556, article id 9306831. Article in journal (refereed). Published.
Abstract [en]

With new accelerator hardware for Deep Neural Networks (DNNs), the computing power for Artificial Intelligence (AI) applications has increased rapidly. However, as DNN algorithms become more complex and optimized for specific applications, meeting latency requirements remains challenging, and it is critical to find the optimal points in the design space. To decouple the architectural search from the target hardware, we propose a time estimation framework that models the inference latency of DNNs on hardware accelerators using mapping and layer-wise estimation models. The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. For evaluation, we compare the estimation accuracy and fidelity of the generated mixed models against statistical models, the roofline model, and a refined roofline model. We test the mixed models on the ZCU102 SoC board with the Xilinx Deep Neural Network Development Kit (DNNDK) and on the Intel Neural Compute Stick 2 (NCS2), using a set of 12 state-of-the-art neural networks. The mixed model shows an average estimation error of 3.47% for the DNNDK and 7.44% for the NCS2, outperforming the statistical and analytical layer models for almost all selected networks. For a randomly selected subset of 34 networks from the NASBench dataset, the mixed model reaches a fidelity of 0.988 in Spearman's ρ rank correlation coefficient.
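For context, here is a minimal sketch of two ideas the abstract leans on: a roofline-style per-layer latency bound, which serves as the analytical baseline the paper compares against, and Spearman's ρ as the fidelity metric between estimated and measured latencies. The hardware peaks, layer workloads, and latency values below are made-up illustrations, not numbers from the paper.

```python
from scipy.stats import spearmanr

def roofline_latency(flops, bytes_moved, peak_flops, peak_bw):
    """Roofline-style lower bound on layer latency (seconds):
    a layer is limited either by compute or by memory bandwidth."""
    return max(flops / peak_flops, bytes_moved / peak_bw)

# Hypothetical per-layer workloads (FLOPs, bytes moved) for a small network.
layers = [(2e9, 8e6), (5e8, 2e7), (1e9, 4e6)]
peak_flops, peak_bw = 1.2e12, 1.9e10  # assumed accelerator peak compute and bandwidth

# Layer-wise estimation: sum the per-layer bounds for a network estimate.
est = sum(roofline_latency(f, b, peak_flops, peak_bw) for f, b in layers)
print(f"estimated network latency: {est * 1e3:.2f} ms")

# Fidelity: Spearman's rho between estimated and measured latencies over a
# set of candidate networks (values here are invented for illustration).
estimated = [4.1, 7.8, 2.3, 5.5]
measured = [4.5, 8.1, 2.0, 5.9]
rho, _ = spearmanr(estimated, measured)
print(f"fidelity (Spearman's rho): {rho:.3f}")
```

A high ρ means the estimator ranks candidate networks in the same order as real measurements, which is what matters for design-space search even when absolute estimates are off.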

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2021. Vol. 9, pp. 3545-3556, article id 9306831.
Keywords [en]
Analytical models, estimation, neural network hardware, Deep neural networks, Mapping, System-on-chip, Estimation errors, Estimation models, Hardware accelerators, Rank correlation coefficient, Roofline models, State of the art, Target hardware, Time estimation, Neural networks
Identifiers
URN: urn:nbn:se:miun:diva-43424
DOI: 10.1109/ACCESS.2020.3047259
ISI: 000606552900001
Scopus ID: 2-s2.0-85098756988
OAI: oai:DiVA.org:miun-43424
DiVA id: diva2:1604109
Available from: 2021-10-18. Created: 2021-10-18. Last updated: 2021-10-18. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Search in DiVA

By author/editor
Jantsch, A.
In the same journal
IEEE Access
