Tapped Delay Lines for GP Streaming Data Classification with Label Budgets
Created by W.Langdon from
gp-bibliography.bib Revision:1.8010
@InProceedings{Vahdat:2015:EuroGP,
author = "Ali Vahdat and Jillian Morgan and
Andrew R. McIntyre and Malcolm I. Heywood and A. Nur Zincir-Heywood",
title = "Tapped Delay Lines for {GP} Streaming Data
Classification with Label Budgets",
booktitle = "18th European Conference on Genetic Programming",
year = "2015",
editor = "Penousal Machado and Malcolm I. Heywood and
James McDermott and Mauro Castelli and
Pablo Garcia-Sanchez and Paolo Burelli and Sebastian Risi and Kevin Sim",
series = "LNCS",
volume = "9025",
publisher = "Springer",
pages = "126--138",
address = "Copenhagen",
month = "8-10 " # apr,
organisation = "EvoStar",
keywords = "genetic algorithms, genetic programming, Streaming
data classification, Non-stationary, Class imbalance,
Benchmarking",
isbn13 = "978-3-319-16500-4",
DOI = "doi:10.1007/978-3-319-16501-1_11",
abstract = "Streaming data classification requires that a model be
available for classifying stream content while
simultaneously detecting and reacting to changes to the
underlying process generating the data. Given that only
a fraction of the stream is visible at any point in
time (i.e. some form of window interface) then it is
difficult to place any guarantee on a classifier
encountering a well mixed distribution of classes
across the stream. Moreover, streaming data classifiers
are also required to operate under a limited label
budget (labelling all the data is too expensive). We
take these requirements to motivate the use of an
active learning strategy for decoupling genetic
programming training epochs from stream throughput. The
content of a data subset is controlled by a combination
of Pareto archiving and stochastic sampling. In
addition, a significant benefit is attributed to
support for a tapped delay line (TDL) interface to the
stream, but this also increases the dimensionality of
the task. We demonstrate that the benefits of assuming
the TDL can be maintained through the use of
oversampling without recourse to additional label
information. Benchmarking on 4 datasets demonstrates
that the approach is particularly effective when
reacting to shifts in the underlying properties of the
stream. Moreover, an online formulation for class-wise
detection rate is assumed, where this is able to
robustly characterise classifier performance throughout
the stream.",
notes = "Part of \cite{Machado:2015:GP} EuroGP'2015 held in
conjunction with EvoCOP2015, EvoMusArt2015 and
EvoApplications2015",
}
Genetic Programming entries for
Ali Vahdat
Jillian Morgan
Andrew R McIntyre
Malcolm Heywood
Nur Zincir-Heywood