Genetic Programming for Preprocessing Tandem Mass Spectra to Improve the Reliability of Peptide Identification
Created by W. Langdon from gp-bibliography.bib Revision:1.8129
@InProceedings{Azari:2018:CEC,
  author =       "Samaneh Azari and Mengjie Zhang and Bing Xue and
                 Lifeng Peng",
  title =        "Genetic Programming for Preprocessing Tandem Mass
                 Spectra to Improve the Reliability of Peptide
                 Identification",
  booktitle =    "2018 IEEE Congress on Evolutionary Computation (CEC)",
  year =         "2018",
  editor =       "Marley Vellasco",
  address =      "Rio de Janeiro, Brazil",
  month =        "8-13 " # jul,
  publisher =    "IEEE",
  keywords =     "genetic algorithms, genetic programming",
  DOI =          "doi:10.1109/CEC.2018.8477810",
  abstract =     "Tandem mass spectrometry (MS/MS) is currently the most
                 commonly used technology in proteomics for identifying
                 proteins in complex biological samples. Mass
                 spectrometers can produce a large number of MS/MS
                 spectra each of which has hundreds of peaks. These
                 peaks normally contain background noise, therefore a
                 preprocessing step to filter the noise peaks can
                 improve the accuracy and reliability of peptide
                 identification. This paper proposes to preprocess the
                 data by classifying peaks as noise peaks or signal
                 peaks, i.e., a highly-imbalanced binary classification
                 task, and uses genetic programming (GP) to address this
                 task. The expectation is to increase the peptide
                 identification reliability. Meanwhile, six different
                 types of classification algorithms in addition to GP
                 are used on various imbalance ratios and evaluated in
                 terms of the average accuracy and recall. The GP method
                 appears to be the best in the retention of more signal
                 peaks as examined on a benchmark dataset containing
                 1,674 MS/MS spectra. To further evaluate the
                 effectiveness of the GP method, the preprocessed
                 spectral data is submitted to a benchmark de novo
                 sequencing software, PEAKS, to identify the peptides.
                 The results show that the proposed method improves the
                 reliability of peptide identification compared to the
                 original un-preprocessed data and the intensity-based
                 thresholding methods.",
  notes =        "WCCI2018",
}
Genetic Programming entries for
Samaneh Azari
Mengjie Zhang
Bing Xue
Lifeng Peng