Mid Sweden University

An artificial neural network-based system for detecting machine failures using a tiny sound dataset: A case study
Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design. Assa Abloy, Group Technology Team, Stockholm, Sweden. (STC)
Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design. (STC) ORCID iD: 0000-0002-8382-0359
Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design. (STC)
2022 (English). In: Proceedings - 2022 IEEE International Symposium on Multimedia, ISM 2022, IEEE conference proceedings, 2022, p. 163-168. Conference paper, Published paper (Refereed)
Abstract [en]

To advocate research on deep learning-based machine failure detection, we present a case study of our proposed system on a tiny sound dataset. The case study investigates a variational autoencoder (VAE) for augmenting a small drill sound dataset from Valmet AB, a company in Sundsvall, Sweden, that supplies equipment and processes for the production of biofuels. The Valmet dataset contains 134 sounds recorded from a drilling machine and divided into two categories: "Anomaly" and "Normal". Using deep learning models to detect drill failures on such a small sound dataset is typically unsuccessful. We employed a VAE to increase the number of sounds in the tiny dataset by synthesizing new sounds from the original sounds. The augmented dataset was created by combining these synthesized sounds with the original sounds. We used a high-pass filter with a passband frequency of 1000 Hz and a low-pass filter with a passband frequency of 22 000 Hz to pre-process the sounds in the augmented dataset before transforming them into Mel spectrograms. The pre-trained 2D-CNN Alexnet was then trained on these Mel spectrograms. Compared with training the pre-trained Alexnet on the original tiny sound dataset, training on the augmented dataset improved the CNN model's classification result by 6.62 percentage points (94.12% when trained on the augmented dataset versus 87.5% when trained on the original dataset). For reproducing and deploying the proposed method, an open-source repository is available at https://gitfront.io/r/user-1913886/MKyfLWwTPm87/Paper5/
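
The band-limiting step described in the abstract can be illustrated with a minimal sketch. The abstract gives only the two passband frequencies (1000 Hz and 22 000 Hz); the filter design itself is not stated. The first-order RC filters below are therefore a simple stand-in, and the 48 kHz sample rate, the function names, and the test tones are all assumptions made for illustration. The Mel spectrogram conversion and the CNN training are not shown.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order RC high-pass filter (stand-in; not the paper's filter design)."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order RC low-pass filter (stand-in; not the paper's filter design)."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)
    out = [alpha * samples[0]]
    for x in samples[1:]:
        out.append(out[-1] + alpha * (x - out[-1]))
    return out


def band_limit(samples, sample_rate_hz=48_000, low_cut_hz=1_000.0, high_cut_hz=22_000.0):
    """High-pass at 1000 Hz, then low-pass at 22 000 Hz (48 kHz rate is assumed)."""
    passed = high_pass(samples, low_cut_hz, sample_rate_hz)
    return low_pass(passed, high_cut_hz, sample_rate_hz)


def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def tone(freq_hz, sample_rate_hz=48_000, n=4_800):
    """Synthetic sine tone used only to exercise the filters."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


# A 100 Hz hum lies below the 1000 Hz cut and is attenuated strongly;
# a 5 kHz tone lies inside the band and is largely preserved.
hum_gain = rms(band_limit(tone(100.0))) / rms(tone(100.0))
in_band_gain = rms(band_limit(tone(5_000.0))) / rms(tone(5_000.0))
```

A first-order filter has a gentle roll-off, so a real pipeline would more likely use a higher-order design from a signal-processing library; the sketch only shows which part of the spectrum the pre-processing keeps.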

Place, publisher, year, edition, pages
IEEE conference proceedings, 2022. p. 163-168
Keywords [en]
Alexnet, audio augmentation, machine failure detection, variational autoencoder
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:miun:diva-47576; DOI: 10.1109/ISM55400.2022.00036; ISI: 000964457800030; Scopus ID: 2-s2.0-85147542959; ISBN: 9781665471725 (print); OAI: oai:DiVA.org:miun-47576; DiVA, id: diva2:1736638
Conference
24th IEEE International Symposium on Multimedia, ISM 2022, 5 December 2022 through 7 December 2022
Available from: 2023-02-14 Created: 2023-02-14 Last updated: 2023-08-31 Bibliographically approved
In thesis
1. Enhancing Machine Failure Detection with Artificial Intelligence and sound Analysis
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The detection of damage or abnormal behavior in machines is critical in industry, as it allows faulty components to be detected and repaired as early as possible, reducing downtime and minimizing operating and personnel costs. However, manual detection of machine fault sounds is economically inefficient and labor-intensive. While prior research has identified various methods to detect failures in drill machines using vibration or sound signals, there remain significant challenges. Most previous research in this field has used manual feature extraction and selection, which can be tedious and biased. Recent studies have used LSTM, end-to-end 1D CNN, and 2D CNN as classifiers, but these have limited accuracy for machine failure detection. Additionally, machine failure is rare in the data, and sounds in the real-world dataset have complex waveforms that are a combination of noise and sound.

To address these challenges, this thesis proposes modern artificial intelligence methods for the detection of drill failures using image representations of sound signals (Mel spectrograms and log-Mel spectrograms) and 2-D convolutional neural networks (2D-CNN) for feature extraction. The proposed models use conventional machine learning classifiers (KNN, SVM, and linear discriminant) or a recurrent neural network (long short-term memory, LSTM) to classify three classes in the dataset (anomalous sounds, normal sounds, and irrelevant sounds). When conventional machine learning methods serve as classifiers, a pre-trained VGG19 is used to extract features, and neighborhood component analysis (NCA) is used for feature selection. When the LSTM is used, a small 2D-CNN is proposed to extract features, and an attention layer after the LSTM focuses on the anomaly in the sound when the drill changes from the normal to the broken state. The findings allow for better anomaly detection in drill machines and the development of a more cost-effective system that can be applied to a small dataset.
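
How an attention layer after an LSTM can emphasize particular time steps may be easier to see in a small sketch. The abstract does not specify the attention formulation, so the code below shows a generic dot-product attention pooling: each hidden state is scored against a vector, the scores pass through a softmax, and the weighted sum of hidden states forms the context. The names `attention_pool` and `score_vector` and the toy hidden states are hypothetical; in a trained model the hidden states would come from the LSTM and the score vector would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, score_vector):
    """Generic dot-product attention pooling (an assumption, not the thesis's exact layer).

    hidden_states: one vector per time step (stand-in for LSTM outputs).
    score_vector:  vector used to score each time step (learned in practice).
    Returns the attention weights and the weighted-sum context vector.
    """
    scores = [sum(h_i * w_i for h_i, w_i in zip(h, score_vector)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


# Toy sequence: the third time step stands out, as a transition from the
# normal to the broken state might, and receives most of the attention weight.
toy_states = [[0.1, 0.0], [0.2, 0.1], [3.0, 2.0], [0.1, 0.2]]
weights, context = attention_pool(toy_states, score_vector=[1.0, 1.0])
```

The context vector would then be passed to the classification layer, so time steps with high weights dominate the decision.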

I also present a case study that advocates for the use of deep learning-based machine failure detection systems. The case study focuses on a small drill sound dataset from Valmet AB, a company that supplies equipment and processes for biofuel production. The dataset consists of 134 sounds recorded from a drilling machine and categorized as "Anomaly" or "Normal". Using deep learning models to detect drill failures on such a small sound dataset is typically unsuccessful. To address this problem, we propose using a variational autoencoder (VAE) to augment the small dataset: the VAE synthesizes new sounds from the original sounds in the dataset. The augmented dataset was then pre-processed using a high-pass filter with a passband frequency of 1000 Hz and a low-pass filter with a passband frequency of 22,000 Hz, before being transformed into Mel spectrograms. We trained a pre-trained 2D-CNN Alexnet using these Mel spectrograms. We found that using the augmented dataset improved the classification result of the CNN model by 6.62 percentage points compared to using the original dataset (94.12% when trained on the augmented dataset versus 87.5% when trained on the original dataset). Our study demonstrates the effectiveness of using a VAE to augment a small sound dataset for training deep learning models for machine failure detection.
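
The way a VAE produces several new samples from one original can be sketched through the standard reparameterization step, z = mu + sigma * eps with eps drawn from a standard normal distribution. The abstract does not describe the network architecture, so the sketch assumes an encoder has already produced a mean and a log-variance for one sound, and it takes the decoder as a caller-supplied function. The names `sample_latent`, `synthesize_variants`, and `decode` are hypothetical.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with sigma = exp(0.5 * log_var)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def synthesize_variants(mu, log_var, n_variants, decode, seed=0):
    """Draw several latent codes around one encoded sound and decode each.

    decode is a stand-in for a trained VAE decoder (hypothetical here).
    """
    rng = random.Random(seed)
    return [decode(sample_latent(mu, log_var, rng)) for _ in range(n_variants)]


# Toy usage: an identity "decoder" makes the sampled latent codes visible.
variants = synthesize_variants(
    mu=[0.5, -1.0], log_var=[-2.0, -2.0], n_variants=3, decode=lambda z: z
)
```

Each decoded variant would be added to the original recordings to form the augmented dataset; a larger log-variance spreads the variants further from the original sound.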

Background noise and acoustic noise in sounds can affect the accuracy of the classification system. To improve the accuracy of sound classification, a sound separation method using short-time Fourier transform (STFT) frames with overlapped content is proposed. Unlike traditional STFT conversion, in which every sound is converted into one image, the signal is split into many STFT frames, which improves the accuracy of model prediction by increasing the variability of the data. Images of these frames are separated into clean and noisy ones and subsequently fed into a pre-trained CNN for classification, making the classifier robust to noise. The efficiency of the proposed method is demonstrated on the FSDNoisy18k dataset, where 21 classes (20 sound-event classes and one noise class) were classified with 94.14 percent accuracy.
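
Splitting one recording into many overlapping segments, each of which would become its own STFT image, can be sketched as follows. The abstract does not give the frame length or the amount of overlap, so `frame_len` and `hop` are hypothetical parameters, and the STFT conversion and the clean/noisy separation are not shown.

```python
def split_overlapping(signal, frame_len, hop):
    """Split a signal into fixed-length frames that overlap by frame_len - hop samples.

    Trailing samples that do not fill a whole frame are dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(signal) - frame_len
    return [signal[start:start + frame_len] for start in range(0, last_start + 1, hop)]


# Toy usage: 10 samples, frames of 4 with a hop of 2 (50 % overlap).
frames = split_overlapping(list(range(10)), frame_len=4, hop=2)
```

With a hop smaller than the frame length, neighbouring frames share content, so one recording yields several training images instead of one.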

Place, publisher, year, edition, pages
Sundsvall: Mid Sweden University, 2023. p. 59
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 395
Keywords
Machine Failure Detection, Machine Learning, Deep Learning, Sound Signal Processing, Audio Augmentation
National Category
Computer Sciences; Signal Processing
Identifiers
URN: urn:nbn:se:miun:diva-49212; ISBN: 978-91-89786-30-1
Public defence
2023-09-29, C312, Holmgatan 10, Sundsvall, 09:00 (English)
Opponent
Supervisors
Available from: 2023-09-01 Created: 2023-08-30 Last updated: 2023-09-27 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Tran, Thanh; Bader, Sebastian; Lundgren, Jan
