Genetically programmed-based artificial features extraction applied to fault detection

https://doi.org/10.1016/j.engappai.2007.06.004

Abstract

This paper presents a novel application of genetically programmed artificial features, which are computer crafted, data driven, and possibly without physical interpretation, to the problem of fault detection. Artificial features are extracted from vibration data of an accelerometer sensor to monitor and detect a crack fault or incipient failure seeded in an intermediate gearbox of a helicopter's main transmission. Classification accuracies for the artificial feature constructed from raw data exceeded 99% over training and independent validation sets. As a benchmark, GP-based artificial features constructed from conventional ones underperformed those derived from raw data by over 2% over the training and over 11% over the testing data.

Introduction

Mechanical systems are exposed to potentially detrimental fatigue stresses and vibrations as a result of operating for extended periods of time. These can result in faults that lead to failures in the system, causing downtime, severe damage to the machine, or even catastrophic failures. For these reasons, condition monitoring has gained interest, and there is a need for methodologies and algorithms that can assess the health condition of the system and warn of impending or incipient failures. To achieve this, several different sensors are typically employed to measure and record information on the system status. Such data are analyzed using a pattern recognition system, which consists of preprocessing, feature extraction, and classification, where it is decided whether a crack fault, for instance, occurred or not. Other diagnostic and prognostic procedures can be further invoked to determine which component is at fault, and what its remaining useful life is. In this work, we specifically deal with the fault detection problem.

Arguably the most important part of this process is feature extraction, since this is the gateway to the information of interest. A series of measurements and transformations (e.g., time domain, frequency domain, phase-space domain, statistical measures, etc.) are carried out in order to obtain relevant information from raw data and discard the component that is irrelevant (noise) for discrimination purposes. One problem confronting the analyst is to determine which feature or features should be selected or extracted. This is due in part to a lack of knowledge about, and the complexity of, the problem at hand. Fig. 1 shows a three-dimensional schematic diagram illustrating what feature selection/extraction approach should be implemented depending on the knowledge-complexity tradeoff and the scrutiny and optimization levels (in a data-relevancy sense) of several methodologies for feature extraction, with the lightest-gray one denoting the lowest in an optimality scale.

Evaluating and studying the degree of relevancy of each conventional feature, as in Grabill et al. (2001), Hardman et al. (2000), and Samuel and Pines (2001, 2003), can be a time-consuming, tedious task. Also, researchers are subject to bias when dealing with new problems, since they naturally draw on previous experiences and perspectives from other research topics. Usually, once a set of features is designed for a specific problem, the same set is employed in similar problems. One way to address these drawbacks is to calculate and evaluate many conventional features, and then use a classical feature extraction method to reduce the dimensionality and decide which features are the most informative. Classical feature extraction methods include principal component analysis, discriminant analysis, and branch and bound (BB), among others (Duda et al., 2001).
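As a rough illustration of this workflow, the sketch below computes three common conventional vibration features (RMS, kurtosis, crest factor) over windows of a signal and ranks them with a per-feature Fisher ratio. The Fisher ratio is a simple univariate criterion related to discriminant analysis, used here as a stand-in; it is not the principal component analysis or branch-and-bound procedure named above. The feature set, window length, and synthetic signals are assumptions made for illustration and are not taken from the paper.

```python
import math
import random
import statistics


def rms(x):
    """Root-mean-square amplitude of one window."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def kurtosis(x):
    """Fourth standardized moment (about 3 for Gaussian data)."""
    m = statistics.fmean(x)
    var = sum((v - m) ** 2 for v in x) / len(x)
    return sum((v - m) ** 4 for v in x) / (len(x) * var ** 2)


def crest_factor(x):
    """Peak absolute amplitude divided by RMS."""
    return max(abs(v) for v in x) / rms(x)


# Hypothetical pool of conventional features (illustrative choice).
FEATURES = {"rms": rms, "kurtosis": kurtosis, "crest_factor": crest_factor}


def fisher_ratio(a, b):
    """(mean_a - mean_b)^2 / (var_a + var_b) for one feature, two classes."""
    gap = statistics.fmean(a) - statistics.fmean(b)
    return gap ** 2 / (statistics.pvariance(a) + statistics.pvariance(b))


def rank_features(healthy_windows, faulty_windows):
    """Return (feature name, Fisher ratio) pairs, most discriminative first."""
    scores = {
        name: fisher_ratio([fn(w) for w in healthy_windows],
                           [fn(w) for w in faulty_windows])
        for name, fn in FEATURES.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def synthetic_window(rng, n=256, impulsive=False):
    """Toy signal: Gaussian noise, optionally with periodic impulses.

    This is made-up data, not the helicopter gearbox recordings.
    """
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if impulsive:
        for i in range(0, n, 32):
            w[i] += 8.0
    return w


rng = random.Random(0)
healthy = [synthetic_window(rng) for _ in range(30)]
faulty = [synthetic_window(rng, impulsive=True) for _ in range(30)]
ranking = rank_features(healthy, faulty)
for name, score in ranking:
    print(f"{name}: {score:.1f}")
```

A ranking like this only reorders the features it was given, which is the limitation the following paragraphs discuss.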

In recent years, more sophisticated extraction methods based on evolutionary computation techniques have been proposed, driven by the increasing computational performance of microprocessors. They include genetic algorithms (GA), particle swarm optimization (PSO), and genetic programming (GP), among others. All these algorithms typically rely on a population of solutions, a fitness metric, and genetic operators to search for the optimal feature or set of features. In Firpi and Goodman (2004), Siedlecki and Sklansky (1989), and Punch et al. (1993), the authors presented feature extraction methods that use GA and PSO paradigms to select a set of conventional features. The limitation of these approaches is that they only seek the optimal set of features within the pool of conventional features given to the algorithms. That is, they do not evolve features to highlight certain patterns but merely select a subset or a scaled version of them to improve performance (in this case, fault detection). These approaches have worked well for many problems. As mentioned before, however, when the complexity of the problem increases considerably, they may fall short of the “best” calculation. A more data-driven approach was proposed in Chang et al. (1990), Guo et al. (2005), Kotani et al. (1997), and Sherrah et al. (1996), where the authors used a GP algorithm to build new features from conventional ones by creating linear or nonlinear transformations that optimize a performance metric and perform better than the conventional features themselves. Nonetheless, the new transformations are based on traditional features, which may discard particular signatures embedded in the raw data, given that such features were designed within a general framework rather than for a specific problem.
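The population, fitness metric, and genetic operators mentioned above can be sketched for the GA feature-selection case as follows. This is a generic, minimal genetic algorithm, not the method of any of the cited works: the bit-mask encoding, tournament selection, one-point crossover, bit-flip mutation, nearest-centroid fitness, and all parameter values are assumptions chosen for brevity, and the data are synthetic.

```python
import random


def nearest_centroid_accuracy(X, y, mask):
    """Fitness: training accuracy of a nearest-centroid rule on selected columns."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        correct += pred == t
    return correct / len(X)


def evolve_mask(X, y, n_features, rng, pop_size=20, generations=30, p_mut=0.1):
    """Search for a feature subset (boolean mask) that maximizes the fitness."""

    def fitness(mask):
        return nearest_centroid_accuracy(X, y, mask)

    # Population: random boolean masks over the given feature pool.
    pop = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(new_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)
    return best, fitness(best)


# Synthetic data: features 0 and 1 carry class information, the rest are noise.
rng = random.Random(1)
X, y = [], []
for i in range(80):
    label = i % 2
    row = [rng.gauss(2.0 * label, 1.0), rng.gauss(-2.0 * label, 1.0)]
    row += [rng.gauss(0.0, 1.0) for _ in range(4)]
    X.append(row)
    y.append(label)

best_mask, best_fit = evolve_mask(X, y, n_features=6, rng=rng)
print(best_mask, round(best_fit, 3))
```

Note that the search can only switch given features on or off; it never builds a new feature, which is the limitation pointed out above. Scoring fitness on training data keeps the sketch short; a held-out set would be a safer choice in practice.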

The objective of this paper is to propose a new perspective on the GP algorithm as a feature extractor applicable to fault detection. In Section 2, we discuss the motivation for artificial features and briefly review previous works that used GP to create features; we also include a brief explanation of GP and its application to fault detection. In Section 3, results are presented, followed by a discussion and conclusions in Sections 4 and 5, respectively.

Previous work in GP-based features for fault detection

The use of GP as a tool to create/extract features has been proposed in the past. In Kotani et al. (1997), the authors introduced a method to create features using a GP algorithm for compressor fault diagnosis employing acoustic signals. Acoustic signals (raw data) were recorded and processed through an FFT and logarithmic power spectra to derive conventional features, which, in turn, were provided to the GP algorithm. Guo et al. (2005) also used a GP module to design features from

Results

Section 2.2 describes the data used for the experiments. Training and validation data were normalized using a z-score normalization method. Holdout validation was used for the experiments, that is, the first half of points of each epoch were used for training and the other half for validation. There is a low risk of overfitting the vibration data given that sudden dynamic changes are not expected in a typical mechanical system. Therefore, we have approximately 88.56 s (or 4.533×10^6 point
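The holdout split and z-score normalization described in this snippet can be sketched as follows. The sketch assumes that the normalization mean and standard deviation are estimated on the training half and then reused for the validation half; the excerpt does not state how the statistics were computed, so this is an assumption, as are the function names and the toy epochs.

```python
import statistics


def holdout_split(epochs):
    """First half of each epoch goes to training, second half to validation."""
    train, valid = [], []
    for epoch in epochs:
        half = len(epoch) // 2
        train.extend(epoch[:half])
        valid.extend(epoch[half:])
    return train, valid


def fit_zscore(samples):
    """Estimate the mean and (population) standard deviation."""
    return statistics.fmean(samples), statistics.pstdev(samples)


def apply_zscore(samples, mean, std):
    """Map each sample v to (v - mean) / std."""
    return [(v - mean) / std for v in samples]


# Toy epochs standing in for recorded vibration segments (made-up values).
epochs = [[1.0, 2.0, 3.0, 4.0], [10.0, 12.0, 14.0, 16.0]]
train, valid = holdout_split(epochs)

# Assumption: statistics come from the training half only.
mean, std = fit_zscore(train)
train_z = apply_zscore(train, mean, std)
valid_z = apply_zscore(valid, mean, std)
```

Reusing the training statistics keeps the validation half from influencing the normalization, which is one common convention.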

Discussion

The above results show that using a GP algorithm to look for information directly in vibration (raw) data can reveal an underlying pattern that helps determine when a fault (i.e., a crack) occurred in a mechanical component (in this case, a gear pinion). Fig. 10(a) shows that, while the gear is in a healthy condition, the mean of the signal is approximately 2600±181.35 (mean±std.); however, as soon as the crack is seeded (at sample point 510 in the plot), the artificial feature, i.e.,

Conclusions

In this work, we presented a different perspective on the use of a GP algorithm to extract/create artificial features from raw data and to seek underlying patterns that may not be captured by the mathematical structure of the conventional features. The GP-based artificial features (GPAFs) meet all the attributes of what a good set of features should be, listed by Kil and Shin (1996), but one: they may have an unknown physical meaning. Although this might seem a problem, by relaxing

References (21)

  • Siedlecki, W., et al., 1989. A note on genetic algorithms for large scale feature selection. Pattern Recognition Letters.
  • Banzhaf, W., et al., 1998. Genetic Programming: An Introduction.
  • Chang, E., et al., 1990. Using genetic algorithms to select and create features for pattern classification. Proceedings of the IEEE International Joint Conference on Neural Networks.
  • Duda, R., et al., 2001. Pattern Classification.
  • Firpi, H., 2005. On prediction and detection of epileptic seizures by means of genetic programming artificial features....
  • Firpi, H., Goodman, E., 2004. Swarm feature selection. In: Proceedings of the 33rd Applied Imagery Pattern...
  • Firpi, H., Goodman, E., Echauz, J., 2005. On prediction of epileptic seizures by computing multiple genetic programming...
  • Fischer, S., Klinkenberg, R., Mierswa, I., Ritthoff, O., 2002. Yale: Yet Another Learning Environment––Tutorial,...
  • Grabill, P., Berry, J., Grant, L., Porter, J., 2001. Automated helicopter vibration diagnostics for the US Army and...
  • Guo, H., et al., 2005. Feature generation using genetic programming with application to fault classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
There are more references available in the full text version of this article.

1 Tel.: +1 404 894 6252; fax: +1 404 894 4130.
