Bayesian Network Structure Learning from Limited Datasets through Graph Evolution

Tonda, Alberto Paolo; Lutton, Evelyne; Reuillon, Romain; Squillero, Giovanni; Wuillemin, Pierre-Henri

doi:10.1007/978-3-642-29139-5_22


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7244)

Abstract

Bayesian networks are stochastic models widely adopted to encode knowledge in several fields. One of their most interesting features is that the structure of a network can be learned from a set of data, and the resulting model can then be used to make new predictions. Structure learning for such models is an NP-hard problem, for which the scientific community has developed two main approaches: score-and-search metaheuristics, often evolutionary-based, and deterministic dependency-analysis algorithms, based on stochastic tests. State-of-the-art solutions have been presented in both domains, but all methodologies assume access to large sets of learning data, often numbering thousands of samples. This is not the case for many real-world applications, especially in the food processing and research industry. This paper proposes an evolutionary approach to the Bayesian structure learning problem, specifically tailored for learning sets of limited size. Falling into the category of score-and-search techniques, the methodology exploits an evolutionary algorithm able to work directly on graph structures, previously used for assembly language generation, and a scoring function based on the Akaike Information Criterion, a well-studied metric of stochastic model performance. Experimental results show that the approach is able to outperform a state-of-the-art dependency-analysis algorithm, providing better models for small datasets.
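
To make the scoring idea concrete, the sketch below computes the textbook Akaike Information Criterion, AIC = 2k - 2 ln L, for a candidate structure over discrete variables. It is only an illustration under stated assumptions: the abstract says the scoring function is "based on" AIC but does not give its exact form, and the function name `aic_score`, the dict-based data layout, and the choice to estimate each variable's number of states from the observed data are all hypothetical.

```python
import math
from collections import Counter


def aic_score(data, parents):
    """Textbook AIC (2k - 2 ln L) of a discrete Bayesian network; lower is better.

    data:    list of dicts, each mapping variable name -> observed value.
    parents: dict mapping every variable -> tuple of its parent variables
             (assumed to describe a directed acyclic graph; not checked here).
    """
    def arity(var):
        # Assumption: the number of states is estimated from the observed values.
        return len({row[var] for row in data})

    log_lik = 0.0
    n_params = 0
    for var, pars in parents.items():
        # Counts of (parent configuration, value) and of parent configuration alone.
        joint = Counter((tuple(row[p] for p in pars), row[var]) for row in data)
        marginal = Counter(tuple(row[p] for p in pars) for row in data)

        # Log-likelihood under maximum-likelihood conditional probabilities.
        for (config, _value), n in joint.items():
            log_lik += n * math.log(n / marginal[config])

        # Free parameters: (states of var - 1) per parent configuration.
        n_configs = 1
        for p in pars:
            n_configs *= arity(p)
        n_params += (arity(var) - 1) * n_configs

    return 2 * n_params - 2 * log_lik


# Tiny dataset in which B always copies A.
samples = [{"A": 0, "B": 0}] * 4 + [{"A": 1, "B": 1}] * 4
no_edge = {"A": (), "B": ()}
a_to_b = {"A": (), "B": ("A",)}
print(aic_score(samples, no_edge), aic_score(samples, a_to_b))
```

In a score-and-search setting, a search procedure (evolutionary in the paper's case) would propose candidate graphs and keep those with better scores. The penalty term 2k is what discourages dense structures when few samples are available.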



Author information

Authors and Affiliations

Institut des Systèmes Complexes, 57-59 rue Lhomond, 75005, Paris, France
Alberto Paolo Tonda & Romain Reuillon
INRIA Saclay-Ile-de-France, AVIZ Team LRI - Bâtiment 650, Université Paris-Sud, 91405, Orsay Cedex, France
Evelyne Lutton
DAUIN, Politecnico di Torino, Corso Duca degli Abruzzi 124, 10129, Torino, Italy
Giovanni Squillero
LIP6 1 Département DÉSIR, 4, place Jussieu, 75005, Paris, France
Pierre-Henri Wuillemin


Editor information

Editors and Affiliations

School of Computer Science, University of Birmingham, B15 2TT, Edgbaston, Birmingham, UK
Alberto Moraglio
INESC-ID Lisboa, Rua Alves Redol 9, 1000-029, Lisboa, Portugal
Sara Silva
Institute of Computing Science, Poznań University of Technology, Piotrowo 2, 60-965, Poznań, Poland
Krzysztof Krawiec
Faculty of Sciences and Technology, Department of Informatics Engineering, University of Coimbra, Pólo II - Pinhal de Marrocos, 3030, Coimbra, Portugal
Penousal Machado
ETSI Informática, Universidad de Málaga, Campus de Teatinos, 29071, Málaga, Spain
Carlos Cotta


About this paper

Cite this paper

Tonda, A.P., Lutton, E., Reuillon, R., Squillero, G., Wuillemin, PH. (2012). Bayesian Network Structure Learning from Limited Datasets through Graph Evolution. In: Moraglio, A., Silva, S., Krawiec, K., Machado, P., Cotta, C. (eds) Genetic Programming. EuroGP 2012. Lecture Notes in Computer Science, vol 7244. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29139-5_22


DOI: https://doi.org/10.1007/978-3-642-29139-5_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29138-8
Online ISBN: 978-3-642-29139-5
eBook Packages: Computer Science (R0)
