Abstract
This paper explores symbolic regression when there are hundreds of input variables and the variables have similar influence, which means that variable pruning (a priori or on the fly) is ineffective. On this problem, traditional genetic programming and many other regression approaches perform poorly. We develop a technique based on latent variables, nonlinear sensitivity analysis, and genetic programming that is designed to manage this challenge. The technique handles 340-input-variable problems in minutes and shows promise of scaling to even higher dimensions. The technique is successfully verified on 24 real-world circuit modeling problems.
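The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general latent-variable idea, not the chapter's method. It assumes a single latent variable t = w·x, estimates the direction w by ordinary least squares (a stand-in for the chapter's nonlinear sensitivity analysis), and fits a one-dimensional polynomial in t (a stand-in for genetic programming). The names `fit_latent_model`, `predict`, and `target`, the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np

def fit_latent_model(X, y, degree=2):
    """Fit y ~ g(w . x): one latent direction w, then a 1-D polynomial g.

    Assumption: w is estimated by linear least squares and g is a
    polynomial; both are simple stand-ins chosen for illustration.
    """
    A = np.column_stack([np.ones(len(X)), X])      # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # linear fit gives a direction
    w = coef[1:] / np.linalg.norm(coef[1:])        # unit-length projection vector
    t = X @ w                                      # latent variable per sample
    poly = np.polyfit(t, y, degree)                # 1-D nonlinear model on t
    return w, poly

def predict(X, w, poly):
    return np.polyval(poly, X @ w)

rng = np.random.default_rng(0)
n_vars = 340  # same input count as quoted in the abstract

# Every variable has a similar-sized weight, so pruning variables would not help.
true_w = rng.choice([-1.0, 1.0], n_vars) * rng.uniform(0.8, 1.2, n_vars)
true_w /= np.linalg.norm(true_w)

def target(X):
    t = X @ true_w
    return t + 0.5 * t**2  # synthetic nonlinear response (hypothetical)

X_train = rng.standard_normal((4000, n_vars))
X_test = rng.standard_normal((1000, n_vars))

w, poly = fit_latent_model(X_train, target(X_train))
y_test = target(X_test)
resid = y_test - predict(X_test, w, poly)
r2 = 1.0 - np.mean(resid**2) / np.var(y_test)
cosine = abs(w @ true_w)  # alignment between estimated and true direction
print(f"direction alignment: {cosine:.3f}, test R^2: {r2:.3f}")
```

The point of the sketch is that a 340-dimensional fit reduces to a one-dimensional modeling problem once a projection direction is available, which is what makes a symbolic search over the latent variable tractable.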
Copyright information
© 2010 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
McConaghy, T. (2010). Latent Variable Symbolic Regression for High-Dimensional Inputs. In: Riolo, R., O'Reilly, UM., McConaghy, T. (eds) Genetic Programming Theory and Practice VII. Genetic and Evolutionary Computation. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-1626-6_7
DOI: https://doi.org/10.1007/978-1-4419-1626-6_7
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-1653-2
Online ISBN: 978-1-4419-1626-6
eBook Packages: Computer Science (R0)