abstract = "As data science becomes more mainstream, there will be
an ever-growing demand for data science tools that are
more accessible, flexible, and scalable. In response to
this demand, automated machine learning (AutoML)
researchers have begun building systems that automate
the process of designing and optimizing machine
learning pipelines. We present TPOT, an open-source
genetic programming-based AutoML system that optimizes
a series of feature preprocessors and machine learning
models with the goal of maximizing classification
accuracy on a supervised classification task. We
benchmark TPOT on a series of 150 supervised
classification tasks and find that it significantly
outperforms a basic machine learning analysis in 22 of
them, while experiencing minimal degradation in
accuracy on 5 of the benchmarks, all without any domain
knowledge or human input. As such, GP-based AutoML
systems show considerable promise in the AutoML
domain.",