Drill Fault Diagnosis Based on the Scalogram and Mel Spectrogram of Sound Signals Using Artificial Intelligence
Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
2020 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 8, p. 203655-203666. Article in journal (Refereed). Published.
Abstract [en]

In industry, the ability to detect damage or abnormal functioning in machinery is very important. However, manual detection of machine fault sounds is economically inefficient and labor-intensive. Automatic machine fault detection (MFD) therefore plays an important role in reducing operating and personnel costs compared to manual detection. This research aims to develop a drill fault detection system using state-of-the-art artificial intelligence techniques. Many researchers have applied the traditional design approach for an MFD system, including handcrafted feature extraction from the raw sound signal, feature selection, and conventional classification. However, drill fault detection based on conventional machine learning methods applied to the raw sound signal in the time domain faces a number of challenges. For example, it can be difficult to extract and select good features to feed a classifier, and the resulting fault detection accuracy may not meet industrial requirements. Hence, we propose a method that uses a deep learning architecture to extract rich features from image representations of sound signals, combined with machine learning classifiers, to classify drill fault sounds from drilling machines. The proposed methods are trained and evaluated on a real sound dataset provided by the factory. The experimental results show a good classification accuracy of 80.25 percent when using Mel spectrogram and scalogram images. These results show significant potential for use in a fault diagnosis support system based on the sounds of drilling machines.
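For illustration, the minimal sketch below (not code from the paper) shows how a drill sound clip can be turned into the two image representations named in the abstract: a Mel spectrogram via librosa and a scalogram via a continuous wavelet transform with PyWavelets. The file name, sampling rate, FFT settings, and wavelet scales are assumptions.

```python
# Sketch only: the input file and transform parameters are illustrative
# assumptions, not the settings used in the paper.
import numpy as np
import librosa
import pywt

# Load one second of a hypothetical drill recording (kept short so the CWT stays small).
y, sr = librosa.load("drill_sound.wav", sr=22050, duration=1.0)

# Mel spectrogram in dB: the usual 2D "image" input for a CNN-based classifier.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, hop_length=256, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)

# Scalogram: magnitude of a continuous wavelet transform over a range of scales.
scales = np.arange(1, 129)
coeffs, freqs = pywt.cwt(y, scales, "morl", sampling_period=1.0 / sr)
scalogram = np.abs(coeffs)

# Both arrays are 2D and can be saved as images for the classifier.
print(mel_db.shape, scalogram.shape)
```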

Place, publisher, year, edition, pages
2020. Vol. 8, p. 203655-203666
Keywords [en]
Deep learning, machine fault diagnosis, machine learning, sound signal processing
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:miun:diva-40655
DOI: 10.1109/ACCESS.2020.3036769
ISI: 000590435900001
Scopus ID: 2-s2.0-85102866677
OAI: oai:DiVA.org:miun-40655
DiVA, id: diva2:1506431
Available from: 2020-12-03 Created: 2020-12-03 Last updated: 2023-08-31
In thesis
1. Drill Failure Detection based on Sound using Artificial Intelligence
2021 (English). Licentiate thesis, comprehensive summary (Other academic).
Abstract [en]

In industry, it is crucial to be able to detect damage or abnormal behavior in machines. A machine's downtime can be minimized by detecting and repairing faulty components as early as possible. It is, however, economically inefficient and labor-intensive to detect machine fault sounds manually. Compared with manual machine failure detection, automatic failure detection systems can reduce operating and personnel costs. Although prior research has identified many methods to detect failures in drill machines using vibration or sound signals, many challenges remain in this field. Most previous research using machine learning techniques has been based on features that are extracted manually from the raw sound signals and classified using conventional classifiers (SVM, Gaussian mixture model, etc.). However, manual extraction and selection of features may be tedious for researchers, and their choices may be biased because it is difficult to identify which features are good and contain an essential description of the sounds for classification. Recent studies have used LSTM, end-to-end 1D CNN, and 2D CNN as classifiers, but these have limited accuracy for machine failure detection. Besides, machine failures occur very rarely in the data. Moreover, the sounds in the real-world dataset have complex waveforms and are usually a combination of noise and sound present at the same time.

Given that drill failure detection is essential in industry, I felt compelled to propose a system that can detect anomalies in drill machines effectively, especially for a small dataset. This thesis proposes modern artificial intelligence methods for the detection of drill failures using drill sounds provided by Valmet AB. Instead of using raw sound signals, image representations of the sound signals (Mel spectrograms and log-Mel spectrograms) were used as the input to my proposed models. For feature extraction, I proposed using deep 2-D convolutional neural networks (2D-CNN) to extract features from the image representations of the sound signals. To classify the three classes in the dataset from Valmet AB (anomalous sounds, normal sounds, and irrelevant sounds), I proposed using either conventional machine learning classifiers (KNN, SVM, and linear discriminant) or a recurrent neural network (long short-term memory). When conventional machine learning methods were used as classifiers, a pre-trained VGG19 was used to extract features and neighborhood component analysis (NCA) was used for feature selection. When long short-term memory (LSTM) was used, a small 2D-CNN was proposed to extract features, and an attention layer after the LSTM focused on the anomaly in the sound when the drill changes from the normal to the broken state. These findings will allow readers to detect anomalies in drill machines better and to develop a more cost-effective system that performs well on a small dataset.
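As a rough illustration of the first pipeline above (pre-trained VGG19 features, NCA feature selection, and a conventional classifier), the sketch below uses PyTorch and scikit-learn. It is not the thesis code; the placeholder images, labels, and parameter values (batch size, n_components, n_neighbors) are assumptions.

```python
# Illustrative sketch of VGG19 feature extraction + NCA + KNN; placeholder data
# stands in for real spectrogram images and labels.
import numpy as np
import torch
import torchvision
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.pipeline import Pipeline

# Pre-trained VGG19 with its final classification layer removed, so it outputs
# a 4096-dimensional feature vector per image.
vgg = torchvision.models.vgg19(weights=torchvision.models.VGG19_Weights.IMAGENET1K_V1)
vgg.classifier = vgg.classifier[:-1]
vgg.eval()

@torch.no_grad()
def extract_features(images):
    # images: (N, 3, 224, 224) tensor of normalized spectrogram images
    return vgg(images).numpy()

# Placeholder batch and labels (anomalous / normal / irrelevant) so the sketch runs.
X_images = torch.randn(12, 3, 224, 224)
y_train = np.array([0, 1, 2] * 4)

features = extract_features(X_images)                 # shape (12, 4096)

# n_components is kept tiny only so the placeholder batch fits; a real run would use more.
clf = Pipeline([
    ("nca", NeighborhoodComponentsAnalysis(n_components=2, init="identity", random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=3)),
])
clf.fit(features, y_train)
print(clf.predict(features[:3]))
```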

There is always background noise and acoustic noise in the recordings, which affects the accuracy of the classification system. My hypothesis was that noise suppression methods would improve the accuracy of the sound classification application. The result of my research is a sound separation method using short-time Fourier transform (STFT) frames with overlapping content. Unlike traditional STFT conversion, in which every sound is converted into one image, the signal is split into many STFT frames, which can improve the accuracy of model prediction by increasing the variability of the data. Images of these frames, separated into clean and noisy ones, are saved and subsequently fed into a pre-trained CNN for classification. This enables the classifier to become robust to noise. The FSDNoisy18k dataset was chosen to demonstrate the efficiency of the proposed method. In experiments using the proposed approach, a classification accuracy of 94.14 percent was achieved over 21 classes, comprising 20 classes of sound events and one noisy class.
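The frame-splitting idea can be sketched as follows; the STFT settings, frame width, and overlap below are assumed values for illustration, not those used in the thesis.

```python
# Sketch only: instead of one spectrogram image per recording, slice the STFT
# magnitude into overlapping time windows, each becoming one training image.
import numpy as np
import librosa

def stft_frames(path, n_fft=1024, hop_length=256, frame_width=128, overlap=64):
    y, sr = librosa.load(path, sr=None)
    spec = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop_length))
    spec_db = librosa.amplitude_to_db(spec, ref=np.max)
    step = frame_width - overlap
    # Overlapping slices along the time axis; each 2D frame can be saved as an image.
    frames = [spec_db[:, s:s + frame_width]
              for s in range(0, spec_db.shape[1] - frame_width + 1, step)]
    return frames

# e.g. frames = stft_frames("example_clip.wav"); the number of frames grows with clip length,
# which increases the variability of the training data.
```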

Place, publisher, year, edition, pages
Sundsvall, Sweden: Mid Sweden University, 2021. p. 46
Series
Mid Sweden University licentiate thesis, ISSN 1652-8948 ; 188
Keywords
Convolutional neural network, machine failure detection, Mel-spectrogram, long short-term memory, sound signal processing
National Category
Other Computer and Information Science Computer Sciences
Identifiers
urn:nbn:se:miun:diva-43841 (URN)
978-91-89341-37-1 (ISBN)
Presentation
2021-12-16, C312, Holmgatan 10, Sundsvall, 13:00 (English)
Opponent
Supervisors
Projects
AISound – Akustisk sensoruppsättning för AI-övervakningssystem
MiLo – Miljön i kontrolloopen
Note

At the time of the doctoral defence, the following papers were unpublished: papers 2 and 3 had been submitted.

Available from: 2021-11-25 Created: 2021-11-24 Last updated: 2021-11-25. Bibliographically approved
2. Enhancing Machine Failure Detection with Artificial Intelligence and Sound Analysis
2023 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

The detection of damage or abnormal behavior in machines is critical in industry, as it allows faulty components to be detected and repaired as early as possible, reducing downtime and minimizing operating and personnel costs. However, manual detection of machine fault sounds is economically inefficient and labor-intensive. While prior research has identified various methods to detect failures in drill machines using vibration or sound signals, there remain significant challenges. Most previous research in this field has used manual feature extraction and selection, which can be tedious and biased. Recent studies have used LSTM, end-to-end 1D CNN, and 2D CNN as classifiers, but these have limited accuracy for machine failure detection. Additionally, machine failure is rare in the data, and sounds in the real-world dataset have complex waveforms that are a combination of noise and sound.

To address these challenges, this thesis proposes modern artificial intelligence methods for the detection of drill failures using image representations of sound signals (Mel spectrograms and log-Mel spectrograms) and 2-D convolutional neural networks (2D-CNN) for feature extraction. The proposed models use conventional machine learning classifiers (KNN, SVM, and linear discriminant) or a recurrent neural network (long short-term memory) to classify the three classes in the dataset (anomalous sounds, normal sounds, and irrelevant sounds). When conventional machine learning methods are used as classifiers, a pre-trained VGG19 is used to extract features, and neighborhood component analysis (NCA) is used for feature selection. When LSTM is used, a small 2D-CNN is proposed to extract features, and an attention layer after the LSTM focuses on the anomaly in the sound when the drill changes from the normal to the broken state. The findings allow for better anomaly detection in drill machines and the development of a more cost-effective system that can be applied to a small dataset.
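A compact PyTorch sketch of the second pipeline (a small 2D-CNN feeding an LSTM, with an attention layer before the classifier) is given below; the layer sizes and the form of the attention are illustrative assumptions, not the exact thesis architecture.

```python
# Sketch only: architecture details are assumptions, not the published network.
import torch
import torch.nn as nn

class CNNLSTMAttention(nn.Module):
    def __init__(self, n_mels=128, n_classes=3, hidden=64):
        super().__init__()
        # Small 2D-CNN feature extractor over the spectrogram image.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(input_size=32 * (n_mels // 4), hidden_size=hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)          # scores each time step
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                         # x: (batch, 1, n_mels, time)
        f = self.cnn(x)                           # (batch, 32, n_mels//4, time//4)
        f = f.permute(0, 3, 1, 2).flatten(2)      # (batch, time//4, 32 * n_mels//4)
        h, _ = self.lstm(f)                       # (batch, time//4, hidden)
        w = torch.softmax(self.attn(h), dim=1)    # attention weights over time
        ctx = (w * h).sum(dim=1)                  # weighted sum emphasizes anomalous frames
        return self.fc(ctx)

# logits = CNNLSTMAttention()(torch.randn(4, 1, 128, 256))  # -> shape (4, 3)
```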

Additionally, I present a case study that advocates for the use of deep learning-based machine failure detection systems. We focus on a small drill sound dataset from Valmet AB, a company that supplies equipment and processes for biofuel production. The dataset consists of 134 sounds recorded from a drilling machine and categorized as "Anomaly" or "Normal". However, using deep learning models to detect drill failures on such a small sound dataset is typically unsuccessful. To address this problem, we propose using a variational autoencoder (VAE) to augment the small dataset. We generated new sounds by synthesizing them from the original sounds in the dataset using the VAE. The augmented dataset was then pre-processed using a high-pass filter with a passband frequency of 1000 Hz and a low-pass filter with a passband frequency of 22,000 Hz, before being transformed into Mel spectrograms. We trained a pre-trained 2D-CNN AlexNet using these Mel spectrograms. We found that using the augmented dataset enhanced the classification results of the CNN model by 6.62 percentage points compared to using the original dataset (94.12% when trained on the augmented dataset versus 87.5% when trained on the original dataset). Our study demonstrates the effectiveness of using a VAE to augment a small sound dataset for training deep learning models for machine failure detection.
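The band-limiting pre-processing step described above (high-pass at 1000 Hz, low-pass at 22,000 Hz, then Mel spectrograms) might look roughly like the sketch below; the Butterworth filter design, filter order, and file name are assumptions, and the VAE augmentation itself is not shown.

```python
# Sketch only: assumed filter design, not the published pre-processing code.
import numpy as np
import librosa
from scipy.signal import butter, sosfiltfilt

def bandlimit(y, sr, low_cut=1000, high_cut=22000, order=5):
    # High-pass at 1 kHz removes low-frequency rumble.
    hp = butter(order, low_cut, btype="highpass", fs=sr, output="sos")
    y = sosfiltfilt(hp, y)
    # The 22 kHz low-pass only applies when the sampling rate is above 44 kHz (Nyquist).
    if high_cut < sr / 2:
        lp = butter(order, high_cut, btype="lowpass", fs=sr, output="sos")
        y = sosfiltfilt(lp, y)
    return y

# Hypothetical original or VAE-generated sound; the filtered signal becomes a
# Mel spectrogram image for the pre-trained 2D-CNN.
y, sr = librosa.load("augmented_drill_sound.wav", sr=None)
mel = librosa.feature.melspectrogram(y=bandlimit(y, sr), sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)
```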

Background noise and acoustic noise in the recordings can affect the accuracy of the classification system. To improve the accuracy of the sound classification application, a sound separation method using short-time Fourier transform (STFT) frames with overlapping content is proposed. Unlike traditional STFT conversion, in which every sound is converted into one image, the signal is split into many STFT frames, improving the accuracy of model prediction by increasing the variability of the data. Images of these frames are separated into clean and noisy ones and subsequently fed into a pre-trained CNN for classification, making the classifier robust to noise. The efficiency of the proposed method is demonstrated on the FSDNoisy18k dataset, where a classification accuracy of 94.14 percent was achieved over 21 classes, comprising 20 classes of sound events and one noisy class.

Place, publisher, year, edition, pages
Sundsvall: Mid Sweden University, 2023. p. 59
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 395
Keywords
Machine Failure Detection, Machine Learning, Deep Learning, Sound Signal Processing, Audio Augmentation
National Category
Computer Sciences Signal Processing
Identifiers
urn:nbn:se:miun:diva-49212 (URN)
978-91-89786-30-1 (ISBN)
Public defence
2023-09-29, C312, Holmgatan 10, Sundsvall, 09:00 (English)
Opponent
Supervisors
Available from: 2023-09-01 Created: 2023-08-30 Last updated: 2023-09-27. Bibliographically approved

Open Access in DiVA

fulltext (1681 kB), 1359 downloads
File information
File name: FULLTEXT01.pdf
File size: 1681 kB
Checksum (SHA-512): 3ea700ec1a95105d4ff3334c7f831a1ffe8ec7e00d00310f28c0d57b9c4a10c08993b0f0eb817a9138c4be6f56cbd354ee0b3df9f99b06ee5908afb64ea9fb8e
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text
Scopus

Authority records

Tran, Thanh; Lundgren, Jan
