Abstract
Particle-based modeling of materials at atomic scale plays an important role in the development of new materials and the understanding of their properties. The accuracy of particle simulations is determined by interatomic potentials, which allow calculating the potential energy of an atomic system as a function of atomic coordinates and potentially other properties. Ab initio (first-principles) potentials can reach arbitrary levels of accuracy; however, their applicability is limited by their high computational cost. Machine learning (ML) has recently emerged as an effective way to offset this cost by replacing expensive models with highly efficient surrogates trained on electronic structure data. Among a plethora of current methods, symbolic regression (SR) is gaining traction as a powerful “white-box” approach for discovering functional forms of interatomic potentials. This contribution discusses the role of symbolic regression in Materials Science (MS) and offers a comprehensive overview of current methodological challenges and state-of-the-art results. A genetic programming-based approach for modeling atomic potentials from raw data (consisting of snapshots of atomic positions and associated potential energy) is presented and empirically validated on ab initio electronic structure data.
References
Agrawal, A., Choudhary, A.: Perspective: Materials informatics and big data: realization of the “fourth paradigm” of science in materials science. APL Mater. 4(5), 053208 (2016)
Araújo, J.P., Ballester, M.Y.: A comparative review of 50 analytical representation of potential energy interaction for diatomic systems: 100 years of history. Int. J. Quantum Chem. 121(24), e26808 (2021)
Baker, J.E.: Reducing bias and inefficiency in the selection algorithm. In: Proceedings of the Second International Conference on Genetic Algorithms on Genetic Algorithms and Their Application, pp. 14–21, L. Erlbaum Associates Inc., USA (1987)
Balabin, R.M., Lomakina, E.I.: Support vector machine regression (ls-svm)–an alternative to artificial neural networks (anns) for the analysis of quantum chemistry data? Phys. Chem. Chem. Phys. 13, 11710–11718 (2011)
Bartók, A.P., Kondor, R., Csányi, G.: On representing chemical environments. Phys. Rev. B 87, 184115 (2013)
Behler, J.: Perspective: Machine learning potentials for atomistic simulations. J. Chem. Phys. 145(17), 170901 (2016)
Bellucci, M.A., Coker, D.F.: Empirical valence bond models for reactive potential energy surfaces: A parallel multilevel genetic program approach. J. Chem. Phys. 135(4), 044115 (2011)
Bellucci, M.A., Coker, D.F.: Molecular dynamics of excited state intramolecular proton transfer: 3-hydroxyflavone in solution. J. Chem. Phys. 136(19), 194505 (2012)
Binder, K., Heermann, D., Roelofs, L., John Mallinckrodt, A., McKay, S.: Monte carlo simulation in statistical physics. Comput. Phys. 7(2), 156–157 (1993)
Brown, A., McCoy, A.B., Braams, B.J., Jin, Z., Bowman, J.M.: Quantum and classical studies of vibrational motion of ch5+ on a global potential energy surface obtained from a novel ab initio direct dynamics approach. J. Chem. Phys. 121(9), 4105–4116 (2004)
Brown, M.W., Thompson, A.P., Watson, J.-P., Schultz, P.A.: Bridging scales from ab initio models to predictive empirical models for complex materials. Technical report, Sandia National Laboratories (2008)
Brown, W.M., Thompson, A.P., Schultz, P.A.: Efficient hybrid evolutionary optimization of interatomic potential models. J. Chem. Phys. 132(2), 024108 (2010)
Burlacu, B., Kronberger, G., Kommenda, M.: Operon C++: an efficient genetic programming framework for symbolic regression. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, GECCO’20, pp. 1562–1570. Association for Computing Machinery (2020). (internet, 8–12 July 2020)
La Cava, W.G., Orzechowski, P., Burlacu, B., de França, F.O., Virgolin, M., Jin, Y., Kommenda, M., Moore, J.H.: Contemporary symbolic regression methods and their relative performance (2021). CoRR, arXiv:2107.14351
Chen, R., Shao, K., Fu, B., Zhang, D.H.: Fitting potential energy surfaces with fundamental invariant neural network. ii. generating fundamental invariants for molecular systems with up to ten atoms. J. Chem. Phys. 152(20), 204307 (2020)
Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: Nsga-ii. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
Dral, P.O.: Quantum chemistry in the age of machine learning. J. Phys. Chem. Lett. 11(6), 2336–2347 (2020). PMID: 32125858
Eldridge, A., Rodriguez, A., Hu, M., Hu, J.: Genetic programming-based learning of carbon interatomic potential for materials discovery (2022)
Gagné, C., Parizeau, M.: Genericity in evolutionary computation software tools: Principles and case study. Int. J. Artif. Intell. Tools 15(2), 173–194 (2006)
Gao, H., Wang, J., Sun, J.: Improve the performance of machine-learning potentials by optimizing descriptors. J. Chem. Phys. 150(24), 244110 (2019)
Ghiringhelli, L.M., Vybiral, J., Levchenko, S.V., Draxl, C., Scheffler, M.: Big data of materials science: Critical role of the descriptor. Phys. Rev. Lett. 114, 105503 (2015)
Guennebaud, G., Jacob, B., et al.: Eigen v3 (2010). http://eigen.tuxfamily.org
Handley, C.M., Behler, J.: Next generation interatomic potentials for condensed systems. Eur. Phys. J. B 87(7), 152 (2014)
Hernandez, A., Balasubramanian, A., Yuan, F., Mason, S.A.M., Mueller, T.: Fast, accurate, and transferable many-body interatomic potentials by symbolic regression. NPJ Comput. Mater. 5(1), 112 (2019)
Hey, T., Butler, K., Jackson, S., Thiyagalingam, J.: Machine learning and big scientific data. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 378(2166), 20190054 (2020)
Himanen, L., Geurts, A., Foster, A.S., Rinke, P.: Data-driven materials science: status, challenges, and perspectives. Adv. Sci. 6(21), 1900808 (2019)
Hospital, A., Goñi, J.R., Orozco, M., Gelpí, J.L.: Molecular dynamics simulations: advances and applications. Adv. Appl. Bioinform. Chem. AABC 8, 37 (2015)
Hu, J., Goodman, E., Seo, K., Fan, Z., Rosenberg, R.: The hierarchical fair competition (hfc) framework for sustainable evolutionary algorithms. Evol. Comput. 13(2), 241–277 (2005)
Ischtwan, J., Collins, M.A.: Molecular potential energy surfaces by interpolation. J. Chem. Phys. 100(11), 8080–8088 (1994)
Kenoufi, A., Kholmurodov, K.: Symbolic regression of interatomic potentials via genetic programming. Biol. Chem. Res 2, 1–10 (2015)
Kim, C., Pilania, G., Ramprasad, R.: From organized high-throughput data to phenomenological theory using machine learning: the example of dielectric breakdown. Chem. Mater. 28(5), 1304–1311 (2016)
Kim, K.H., Lee, Y.S., Ishida, T., Jeung, G.-H.: Dynamics calculations for the lih+h ⇌ li+h2 reactions using interpolations of accurate ab initio potential energy surfaces. J. Chem. Phys. 119(9), 4689–4693 (2003)
Kohn, W., Sham, L.J.: Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133–A1138 (1965)
Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA (1992)
Kresse, G., Furthmüller, J.: Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996)
Kruskal, W.H., Allen Wallis, W.: Use of ranks in one-criterion variance analysis. J. Am. Stat. Assoc. 47(260), 583–621 (1952)
Kusne, A., Mueller, T., Ramprasad, R.: Machine learning in materials science: recent progress and emerging applications. Rev. Comput. Chem. (2016)
Makarov, D.E., Metiu, H.: Fitting potential-energy surfaces: a search in the function space by directed genetic programming. J. Chem. Phys. 108(2), 590–598 (1998)
Makarov, D.E., Metiu, H.: Using genetic programming to solve the schrödinger equation. J. Phys. Chem. A 104(37), 8540–8545 (2000)
Mueller, T., Hernandez, A., Wang, C.: Machine learning for interatomic potential models. J. Chem. Phys. 152(5), 050902 (2020)
Mueller, T., Johlin, E., Grossman, J.C.: Origins of hole traps in hydrogenated nanocrystalline and amorphous silicon revealed through machine learning. Phys. Rev. B 89, 115202 (2014)
Pilania, G.: Machine learning in materials science: From explainable predictions to autonomous design. Comput. Mater. Sci. 193, 110360 (2021)
Plimpton, S.: Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117(1), 1–19 (1995)
Rothe, T., Schuster, J., Teichert, F., Lorenz, E.E.: Machine Learning Potentials-State of the Research and Potential Applications for Carbon Nanostructures. Technische Universität, Faculty of Natural Sciences, Institute of Physics (2019)
Sastry, K.N.: Genetic algorithms and genetic programming for multiscale modeling: Applications in materials science and chemistry and advances in scalability. PhD thesis, University of Illinois, Urbana-Champaign (March 2007)
Shao, K., Chen, J., Zhao, Z., Zhang, D.H.: Communication: fitting potential energy surfaces with fundamental invariant neural network. J. Chem. Phys. 145(7), 071101 (2016)
Shapeev, A.V.: Moment tensor potentials: a class of systematically improvable interatomic potentials. Multiscale Model. Simul. 14(3), 1153–1173 (2016)
Slepoy, A., Peters, M.D., Thompson, A.P.: Searching for globally optimal functional forms for interatomic potentials using genetic programming with parallel tempering. J. Comput. Chem. 28(15), 2465–2471 (2007)
Steele, D., Lippincott, E.R., Vanderslice, J.T.: Comparative study of empirical internuclear potential functions. Rev. Mod. Phys. 34, 239–251 (1962)
Stillinger, F.H., Weber, T.A.: Computer simulation of local order in condensed phases of silicon. Phys. Rev. B 31, 5262–5271 (1985)
Sutton, A.P., Chen, J.: Long-range finnis-sinclair potentials. Philos. Mag. Lett. 61(3), 139–146 (1990)
Thompson, A.P., Swiler, L.P., Trott, C.R., Foiles, S.M., Tucker, G.J.: Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials. J. Comput. Phys. 285, 316–330 (2015)
Unke, O.T., Chmiela, S., Sauceda, H.E., Gastegger, M., Poltavsky, I., Schütt, K.T., Tkatchenko, A., Müller, K.-R.: Machine learning force fields. Chem. Rev. (2021). PMID: 33705118
Wang, Y., Wagner, N., Rondinelli, J.M.: Symbolic regression in materials science. MRS Commun. 9(3), 793–805 (2019)
Zhang, L., Han, J., Wang, H., Car, R., Weinan, E.: Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 143001 (2018)
5 Appendix
Empirical potentials
For a comprehensive overview of empirical potentials, we recommend the work of Araújo and Ballester [2]. Below, we give a brief overview of the most important empirical potentials mentioned in this contribution.
Morse potential
This is an empirical potential used to model diatomic molecules:
\[ V(r) = D \left( 1 - e^{-a (r - r_0)} \right)^2 \]
where D is the dissociation energy, r is the distance between atoms, a is a parameter that controls the width of the potential well, and \(r_0\) is the equilibrium bond distance.
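As a minimal sketch, the Morse potential can be evaluated as follows, assuming the common form \(V(r) = D (1 - e^{-a (r - r_0)})^2\), in which the energy is zero at the equilibrium distance and approaches D at large separation. The function name and the parameter values in the usage lines are illustrative assumptions, not fitted values from this contribution.

```python
import math


def morse_potential(r, D, a, r0):
    """Morse energy at distance r, assuming V(r) = D * (1 - exp(-a * (r - r0)))**2.

    D  : dissociation energy (well depth)
    a  : width parameter of the well
    r0 : equilibrium bond distance
    """
    return D * (1.0 - math.exp(-a * (r - r0))) ** 2


# Illustrative (not fitted) parameters in arbitrary units.
energy_at_equilibrium = morse_potential(1.0, D=2.0, a=1.5, r0=1.0)
energy_far_apart = morse_potential(50.0, D=2.0, a=1.5, r0=1.0)
```

With this convention the energy is measured from the bottom of the well; other texts shift the curve by −D so that the dissociated atoms have zero energy.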
Lennard-Jones potential
The Lennard-Jones potential models soft repulsive and attractive interactions and can describe electronically neutral atoms or molecules. Interacting particles repel each other at very close distances, attract each other at moderate distances, and do not interact at infinite distances:
\[ V(r) = 4 \varepsilon \left[ \left( \frac{\sigma}{r} \right)^{12} - \left( \frac{\sigma}{r} \right)^{6} \right] \]
where r is the distance between atoms, \(\varepsilon \) is the dispersion energy and \(\sigma \) is the distance at which the particle-particle potential energy V is zero.
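The following sketch evaluates the Lennard-Jones potential, assuming the standard 12-6 form \(V(r) = 4\varepsilon [(\sigma /r)^{12} - (\sigma /r)^{6}]\). The function name and the parameter values are illustrative assumptions. The checks at the end reflect two properties of this form: the energy vanishes at \(r = \sigma \), and the minimum lies at \(r = 2^{1/6}\sigma \) with depth \(-\varepsilon \).

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy, assuming V(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6          # attractive (dispersion) factor
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Illustrative parameters in arbitrary units.
epsilon, sigma = 0.5, 2.0
r_min = 2.0 ** (1.0 / 6.0) * sigma  # location of the energy minimum

zero_crossing = lennard_jones(sigma, epsilon, sigma)
well_depth = lennard_jones(r_min, epsilon, sigma)
```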
Lippincott potential
The Lippincott potential [49] involves an exponential of the interatomic distance:
\[ V(r) = D \left[ 1 - \exp \left( \frac{-n (r - r_0)^2}{2 r} \right) \right] \left[ 1 + a F(r) \right] \]
where D is the dissociation energy, r is the distance between atoms, \(r_0\) is the equilibrium bond distance and a and n are parameters. F(r) is a function of internuclear distance such that \(F(r) = 0\) when \(r=\infty \) and \(F(r) = \infty \) when \(r=0\).
Stillinger-Weber potential
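The Stillinger-Weber potential [50] models two- and three-body interactions by taking into account not only the distances between atoms but also the bond angles:
\[ E = \sum_{i} \sum_{j > i} \phi_2(r_{ij}) + \sum_{i} \sum_{j \ne i} \sum_{k > j} \phi_3(r_{ij}, r_{ik}, \theta_{ijk}) \]
where
\[ \phi_2(r_{ij}) = A \varepsilon \left[ B \left( \frac{\sigma}{r_{ij}} \right)^{p} - \left( \frac{\sigma}{r_{ij}} \right)^{q} \right] \exp \left( \frac{\sigma}{r_{ij} - a \sigma} \right) \]
\[ \phi_3(r_{ij}, r_{ik}, \theta_{ijk}) = \lambda \varepsilon \left( \cos \theta_{ijk} - \cos \theta_0 \right)^2 \exp \left( \frac{\gamma \sigma}{r_{ij} - a \sigma} \right) \exp \left( \frac{\gamma \sigma}{r_{ik} - a \sigma} \right) \]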
Sutton-Chen potential
The Sutton-Chen potential [51] has been used in molecular dynamics and Monte Carlo simulations of metallic systems. It offers a reasonable description of various bulk properties, with an approximate many-body representation of the delocalized metallic bonding:
\[ U = \epsilon \sum_{i} \left[ \frac{1}{2} \sum_{j \ne i} V(r_{ij}) - C \sqrt{\rho_i} \right] \]
Here, the first term represents the repulsion between atomic cores, and the second term models the bonding energy due to the electrons. Both terms are defined as reciprocal powers of the interatomic distance, so that the complete expression is
\[ U = \epsilon \sum_{i} \left[ \frac{1}{2} \sum_{j \ne i} \left( \frac{a}{r_{ij}} \right)^{n} - C \sqrt{\sum_{j \ne i} \left( \frac{a}{r_{ij}} \right)^{m}} \right] \]
where C is a dimensionless parameter, \(\epsilon \) is a parameter with dimensions of energy, a is the lattice constant, m, n are positive integers with \(n > m\) and \(r_{ij}\) is the distance between the ith and jth atoms.
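To illustrate how the pairwise repulsion and the many-body bonding term combine, the sketch below computes the total Sutton-Chen energy of a small, finite cluster. It assumes the standard form \(U = \epsilon \sum_i [\frac{1}{2} \sum_{j \ne i} (a/r_{ij})^n - C \sqrt{\rho_i}]\) with \(\rho_i = \sum_{j \ne i} (a/r_{ij})^m\), and it deliberately omits cutoffs and periodic images, which a production code would need. The function name and the parameter values in the usage lines are illustrative assumptions.

```python
import math


def sutton_chen_energy(positions, epsilon, a, C, n, m):
    """Total Sutton-Chen energy of a finite cluster (no cutoff, no periodic images).

    Assumes U = epsilon * sum_i [0.5 * sum_{j != i} (a / r_ij)**n - C * sqrt(rho_i)],
    where rho_i = sum_{j != i} (a / r_ij)**m.
    """
    total = 0.0
    for i, p_i in enumerate(positions):
        pair_sum = 0.0  # core-core repulsion felt by atom i
        density = 0.0   # local density rho_i entering the bonding term
        for j, p_j in enumerate(positions):
            if i == j:
                continue
            r_ij = math.dist(p_i, p_j)
            pair_sum += (a / r_ij) ** n
            density += (a / r_ij) ** m
        total += 0.5 * pair_sum - C * math.sqrt(density)
    return epsilon * total


# Illustrative dimer separated by exactly one lattice constant.
dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
dimer_energy = sutton_chen_energy(dimer, epsilon=1.0, a=1.0, C=2.0, n=9, m=6)
```

The square root over the local density is what makes the bonding term many-body: the energy contribution of atom i cannot be written as a sum of independent pair terms.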
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Burlacu, B., Kommenda, M., Kronberger, G., Winkler, S.M., Affenzeller, M. (2023). Symbolic Regression in Materials Science: Discovering Interatomic Potentials from Data. In: Trujillo, L., Winkler, S.M., Silva, S., Banzhaf, W. (eds) Genetic Programming Theory and Practice XIX. Genetic and Evolutionary Computation. Springer, Singapore. https://doi.org/10.1007/978-981-19-8460-0_1
DOI: https://doi.org/10.1007/978-981-19-8460-0_1
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8459-4
Online ISBN: 978-981-19-8460-0
eBook Packages: Computer Science (R0)