ABSTRACT
GitHub Copilot, an extension for the Visual Studio Code development environment powered by the large-scale language model Codex, makes automatic program synthesis available to software developers. This model has been extensively studied in the field of deep learning; however, a comparison with genetic programming, which is also known for its performance in automatic program synthesis, has not yet been carried out. In this paper, we evaluate GitHub Copilot on standard program synthesis benchmark problems and compare the achieved results with those reported in the genetic programming literature. In addition, we discuss the properties of both approaches. We find that the performance of the two approaches on the benchmark problems is quite similar; however, in contrast to GitHub Copilot, the program synthesis approaches based on genetic programming are not yet mature enough to support programmers in practical software development. Genetic programming usually needs a huge number of expensive hand-labeled training cases and takes too much time to generate solutions. Furthermore, source code generated by genetic programming approaches is often bloated and difficult to understand. For future work on program synthesis with genetic programming, we suggest that researchers focus on improving execution time, readability, and usability.
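To illustrate the kind of task posed by these benchmark suites, the sketch below shows hand-labeled input-output training cases for a simple benchmark-style problem (a "Fizz Buzz" variant), together with a reference solution of the sort a synthesizer is asked to produce. The case format and the acceptance check are illustrative assumptions, not the exact setup used by any particular benchmark suite or GP system.

```python
# Hand-labeled training cases of the kind a GP system consumes:
# each pair maps an input to its expected output (format is illustrative).
training_cases = [
    (1, "1"), (3, "Fizz"), (5, "Buzz"), (7, "7"), (15, "FizzBuzz"),
]

def fizz_buzz(n: int) -> str:
    """Reference solution a synthesizer would be expected to produce."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

# A candidate program is accepted only if it matches every labeled case;
# generalization beyond the training set is checked on held-out cases.
assert all(fizz_buzz(x) == y for x, y in training_cases)
```

This also makes the cost argument concrete: every such input-output pair must be authored by hand, whereas a Copilot-style model needs only a natural-language description or a function signature as its prompt.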