Mid Sweden University

Processing chain for 3D histogram of gradients based real-time object recognition
Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
2021 (English). In: International Journal of Advanced Robotic Systems, ISSN 1729-8806, E-ISSN 1729-8814, Vol. 18, no 1, article id 1729881420978363. Article in journal (Refereed), Published.
Abstract [en]

3D object recognition has been a cutting-edge research topic since the popularization of depth cameras. These cameras enhance the perception of the environment and so are particularly suitable for autonomous robot navigation applications. Advanced deep learning approaches for 3D object recognition are based on complex algorithms and demand powerful hardware resources. However, autonomous robots and powered wheelchairs have limited resources, which affects the implementation of these algorithms for real-time performance. We propose to use instead a 3D voxel-based extension of the 2D histogram of oriented gradients (3DVHOG) as a handcrafted object descriptor for 3D object recognition in combination with a pose normalization method for rotational invariance and a supervised object classifier. The experimental goal is to reduce the overall complexity and the system hardware requirements, and thus enable a feasible real-time hardware implementation. This article compares the 3DVHOG object recognition rates with those of other 3D recognition approaches, using the ModelNet10 object data set as a reference. We analyze the recognition accuracy for 3DVHOG using a variety of voxel grid selections, different numbers of neurons (N_h) in the single hidden layer feedforward neural network, and feature dimensionality reduction using principal component analysis. The experimental results show that the 3DVHOG descriptor achieves a recognition accuracy of 84.91% with a total processing time of 21.4 ms. Although this accuracy is lower than that of current state-of-the-art deep learning approaches, it is close to them while enabling real-time performance.
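The descriptor idea in the abstract (a histogram of oriented gradients computed on a voxel grid) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `voxel_hog_3d`, the bin counts, the central-difference gradient and the single-cell layout are all assumptions made for illustration, and the pose normalization, principal component analysis and neural network stages are omitted.

```python
import math


def voxel_hog_3d(grid, n_azimuth=8, n_elevation=4):
    """Orientation histogram of 3D gradients over one cubic voxel cell.

    grid[x][y][z] holds occupancy values (for example 0.0 or 1.0).
    The bin counts are illustrative assumptions, not the paper's settings.
    """
    n = len(grid)
    hist = [0.0] * (n_azimuth * n_elevation)
    # Central differences need both neighbours, so border voxels are skipped.
    for x in range(1, n - 1):
        for y in range(1, n - 1):
            for z in range(1, n - 1):
                gx = grid[x + 1][y][z] - grid[x - 1][y][z]
                gy = grid[x][y + 1][z] - grid[x][y - 1][z]
                gz = grid[x][y][z + 1] - grid[x][y][z - 1]
                mag = math.sqrt(gx * gx + gy * gy + gz * gz)
                if mag == 0.0:
                    continue
                # Azimuth in [0, 2*pi), elevation (polar angle) in [0, pi].
                azimuth = math.atan2(gy, gx) % (2 * math.pi)
                elevation = math.acos(max(-1.0, min(1.0, gz / mag)))
                a = min(int(azimuth / (2 * math.pi) * n_azimuth), n_azimuth - 1)
                e = min(int(elevation / math.pi * n_elevation), n_elevation - 1)
                # Each voxel votes for its orientation bin, weighted by magnitude.
                hist[a * n_elevation + e] += mag
    # L2-normalise so the descriptor does not depend on the object's size.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# Usage: a 6x6x6 grid whose lower half along x is occupied.
half_block = [[[1.0 if x < 3 else 0.0 for _ in range(6)]
               for _ in range(6)] for x in range(6)]
descriptor = voxel_hog_3d(half_block)
print(len(descriptor), descriptor.index(max(descriptor)))
```

In this toy input every non-zero gradient points along the negative x axis, so all of the histogram weight falls into a single orientation bin. A full descriptor would typically concatenate such histograms over many cells of the voxel grid before dimensionality reduction and classification.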

Place, publisher, year, edition, pages
2021. Vol. 18, no 1, article id 1729881420978363
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:miun:diva-41626; DOI: 10.1177/1729881420978363; ISI: 000619537100001; Scopus ID: 2-s2.0-85099946553; OAI: oai:DiVA.org:miun-41626; DiVA, id: diva2:1537169
Available from: 2021-03-15 Created: 2021-03-15 Last updated: 2025-02-07
In thesis
1. Semi-Autonomous Navigation of Powered Wheelchairs: 2D/3D Sensing and Positioning Methods
2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Autonomous driving and assistance systems have become a reality in the automotive industry, where they improve driving safety. To that end, cars use a variety of sensors, cameras and image processing techniques to measure their surroundings and control their direction, braking and speed for obstacle avoidance or autonomous driving applications. Like the automotive industry, powered wheelchairs also require safety systems to ensure safe operation, especially when the user has limited ability to control the wheelchair, but also to enable new applications that improve usability. One such application is a new contactless control of a powered wheelchair that uses the position of a caregiver beside it as a control reference. Contactless control can prevent control errors, but it can also provide better and more equal communication between the wheelchair user and the caregiver.

This thesis evaluates the camera requirements for contactless powered wheelchair control and the 2D/3D image processing techniques for caregiver recognition and position measurement beside the powered wheelchair. The research evaluates the strengths and limitations of different depth camera technologies for detecting the caregiver's feet above the ground plane in order to select the proper camera for the application. Then, a hand-crafted 3D object descriptor is evaluated for caregiver feet recognition and compared with a state-of-the-art deep learning object detector. Results for both methods are good; however, the hand-crafted descriptor suffers from segmentation errors and, consequently, its accuracy is lower. After the evaluation of the depth cameras and image processing techniques, results show that it is possible to use only an RGB camera to recognize the caregiver and measure his or her relative position.

Place, publisher, year, edition, pages
Sundsvall: Mid Sweden University, 2021. p. 64
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 354
Keywords
3D object recognition, YOLO, YOLO-Tiny, 3DHOG, Histogram-of-Oriented-Gradients, ModelNet40, Feature descriptor, Intel RealSense, Depth camera, Wheelchair
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:miun:diva-43829; ISBN: 978-91-89341-32-6
Public defence
2021-12-09, O102, Mittuniversitetet, Sundsvall, 16:06 (English)
Note

At the time of the doctoral defence the following papers were unpublished: paper 5 submitted.

Available from: 2021-11-24 Created: 2021-11-23 Last updated: 2025-04-24. Bibliographically approved

Open Access in DiVA

Full text (1024 kB), 836 downloads
File information
File name: FULLTEXT01.pdf; File size: 1024 kB; Checksum (SHA-512):
4e3d4bb3dea43373102933dc83c9051b7cb96f0ec766a036b6db4d50181f7c1062e53d824f3c0d50b733dcef0634d765ceb136d246d29833a254074f0c7160b7
Type: fulltext; Mimetype: application/pdf

Authority records

Vilar, Cristian; Krug, Silvia; Thörnberg, Benny
