abstract = "The estimation of problem difficulty is an open issue
in Genetic Programming (GP). The goal of this work is to
generate models that predict the expected performance of
a GP-based classifier when it is applied to an unseen
task. Classification problems are described using
domain-specific features, some of which are proposed in
this work, and these features are given as input to the
predictive models. These models are referred to as
predictors of expected performance (PEPs). We
extend this approach by using an ensemble of specialized
predictors (SPEP), dividing classification problems into
groups and choosing the corresponding SPEP. The proposed
predictors are trained using 2D synthetic classification
problems with balanced datasets. The models are
then used to predict the performance of the GP
classifier on unseen real-world datasets that are
multidimensional and imbalanced. This work is the first
to provide a performance prediction of a GP system on
test data, while previous works focused on predicting
training performance. Accurate predictive models are
generated by posing a symbolic regression task and
solving it with GP. These results are achieved by
using highly descriptive features and including a
dimensionality reduction stage that simplifies the
learning and testing process. The proposed
approach could be extended to other classification
algorithms and used as the basis of an expert system for
algorithm selection.",
notes = "Supervisor: Leonardo Trujillo Reyes
Also known as \cite{oai:HAL:tel-01668769v1} Also known
as \cite{DBLP:phd/hal/Martinez16a}",