2023 (English) In: 2023 IEEE Sensors Applications Symposium (SAS), 2023. Conference paper, Published paper (Refereed)
Abstract [en]
Implementing Convolutional Neural Network (CNN) based computer vision algorithms in Internet of Things (IoT) sensor nodes can be difficult due to strict computational, memory, and latency constraints. To address these challenges, researchers have used techniques such as quantization, pruning, and model partitioning. Partitioning the CNN reduces the computational burden on an individual node, but the overall system computational load remains constant, and communication energy is incurred in addition. To understand the effect of partitioning and pruning on energy and latency, we conducted a case study using a feet detection application realized with Tiny YOLO-v3 on a 12th Gen Intel CPU with an NVIDIA GeForce RTX 3090 GPU. After partitioning the CNN between its sequential layers, we applied quantization, pruning, and compression and studied their effects on energy and latency. We analyzed the extent to which computation, transmitted data, and latency could be reduced while maintaining a high level of accuracy. After achieving this reduction, we offloaded the remaining partitioned model to the edge node. We found that over 90% computation reduction and over 99% data transmission reduction are possible while keeping mean average precision above 95%, yielding up to 17x energy savings and a speed-up of up to 5.2x.
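The sketch below illustrates the kind of pipeline the abstract describes: splitting a sequential CNN at a layer boundary, pruning the sensor-side layers, and quantizing and compressing the intermediate activation before sending it to the edge node. It is a minimal illustration only, not the authors' code; the toy backbone, the split index, the 90% pruning amount, and the float16-plus-zlib transmission step are assumptions standing in for the paper's Tiny YOLO-v3 setup.

```python
# Minimal sketch (not the paper's implementation): partition a stand-in CNN,
# prune the sensor-side layers, then quantize and compress the intermediate
# activation that would be transmitted to the edge node.
import zlib

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Stand-in sequential backbone (placeholder for Tiny YOLO-v3's layer stack).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
)

SPLIT = 4  # assumed partition point between sensor-node and edge-node layers
head, tail = model[:SPLIT], model[SPLIT:]

# Magnitude-based pruning of the sensor-side convolutions (90% of weights zeroed).
for m in head.modules():
    if isinstance(m, nn.Conv2d):
        prune.l1_unstructured(m, name="weight", amount=0.9)
        prune.remove(m, "weight")  # make the sparsity permanent

x = torch.randn(1, 3, 416, 416)      # Tiny YOLO-v3 input resolution
with torch.no_grad():
    activation = head(x)             # computed on the IoT sensor node
    output = tail(activation)        # remainder offloaded to the edge node

# Quantize the intermediate activation to float16 and compress it before transmission.
raw = activation.numpy().tobytes()
packed = zlib.compress(activation.half().numpy().tobytes())
print(f"intermediate tensor: {len(raw)} B raw -> {len(packed)} B sent to edge")
```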
Keywords
CNN, IoT, Partitioning, Pruning, Quantization, Tiny YOLO-v3
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:miun:diva-49648 (URN) 10.1109/SAS58821.2023.10254054 (DOI) 2-s2.0-85174060733 (Scopus ID) 9798350323078 (ISBN)
Conference
2023 IEEE Sensors Applications Symposium, SAS 2023
Available from: 2023-10-24 Created: 2023-10-24 Last updated: 2023-10-24 Bibliographically approved