Research Article · Public Access
DOI: 10.1145/3377929.3398139 (GECCO Conference Proceedings)

GEVO-ML: a proposal for optimizing ML code with evolutionary computation

Published: 08 July 2020

ABSTRACT

Parallel accelerators, such as GPUs, are a key enabler of large-scale Machine Learning (ML) applications. However, programmers often lack detailed knowledge of the underlying architecture and fail to fully leverage their computational power. This paper proposes GEVO-ML, a tool for automatically discovering optimization opportunities and tuning the performance of ML kernels. GEVO-ML extends earlier work on GEVO (Gpu optimization using EVOlutionary computation) by focusing directly on ML frameworks, intermediate languages, and target architectures. It retains the multi-objective evolutionary search developed for GEVO, which searches for edits to GPU code compiled to LLVM-IR and improves performance on desired criteria while retaining required functionality. In earlier work, we studied some ML workloads in GPU settings and found that GEVO could improve kernel speeds by factors ranging from 1.7X to 2.9X, even with access to only a small portion of the overall ML framework. This workshop paper examines the limitations and constraints of GEVO for ML workloads and discusses our GEVO-ML design, which we are currently implementing.
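The search procedure the abstract describes, mutating compiled code and keeping edits that improve performance while preserving required functionality, can be illustrated with a toy multi-objective loop. This is a hypothetical sketch, not GEVO's actual implementation: the genome, both objective functions, and the Pareto-replacement scheme below are invented stand-ins for GEVO's LLVM-IR edits, measured kernel runtime, and test-suite checks.

```python
import random

# Hypothetical toy sketch of a GEVO-style multi-objective evolutionary
# search. A genome stands in for a list of edits to a kernel; the two
# minimized objectives mirror GEVO's criteria: a runtime proxy and a
# functional-error term (zero error ~ "passes the test suite").

random.seed(0)

TARGET = [3, 1, 4, 1, 5]  # invented stand-in for required functionality

def objectives(genome):
    """Return (runtime_proxy, functional_error); both are minimized."""
    runtime = sum(abs(g) for g in genome)
    error = sum(abs(g - t) for g, t in zip(genome, TARGET))
    return runtime, error

def dominates(a, b):
    """Pareto dominance: a is no worse on every objective, better on one."""
    fa, fb = objectives(a), objectives(b)
    return all(x <= y for x, y in zip(fa, fb)) and fa != fb

def mutate(genome):
    """Apply one random edit, as GEVO applies one mutation to the IR."""
    g = genome[:]
    g[random.randrange(len(g))] += random.choice([-1, 1])
    return g

def evolve(pop_size=20, generations=200):
    pop = [[random.randint(-5, 5) for _ in range(len(TARGET))]
           for _ in range(pop_size)]
    for _ in range(generations):
        child = mutate(random.choice(pop))
        # Replace the first individual the child Pareto-dominates,
        # so non-dominated (Pareto-front) candidates survive.
        for i, other in enumerate(pop):
            if dominates(child, other):
                pop[i] = child
                break
    return pop

# Usage: pick the most functionally correct survivor from the population.
front = evolve()
best = min(front, key=lambda g: objectives(g)[1])
```

The replacement rule is a deliberately minimal substitute for the NSGA-II selection that GEVO inherits: both keep candidates that are not dominated on the runtime/correctness trade-off rather than optimizing a single scalar score.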


Published in GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion
July 2020 · 1982 pages
ISBN: 978-1-4503-7127-8
DOI: 10.1145/3377929
Copyright © 2020 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 1,669 of 4,410 submissions (38%)
