1. THE COMPLETE TITLE OF ONE (OR MORE) PAPER(S) PUBLISHED IN THE OPEN LITERATURE DESCRIBING THE WORK THAT THE AUTHOR CLAIMS DESCRIBES A HUMAN-COMPETITIVE RESULT;

Paper-1: “Towards Incorporating Human Knowledge in Fuzzy Pattern Tree Evolution” (Best Paper award, EuroGP '21).

Paper-2: “Grammar-based Fuzzy Pattern Trees for Classification Problems” (Best Student Paper award, ECTA '20).

-------------------------------------------------------------------------------

2. THE NAME, COMPLETE PHYSICAL MAILING ADDRESS, E-MAIL ADDRESS, AND PHONE NUMBER OF EACH AUTHOR OF EACH PAPER(S);

Aidan Murphy, Limerick, Ireland. Department of Computer Science and Information Systems, University of Limerick, Postgraduate Lab CSG-028a, Ext. 284. Aidan.Murphy@ul.ie

Muhammad Sarmad Ali, Limerick, Ireland. Department of Computer Science and Information Systems, University of Limerick, Postgraduate Lab CSG-028a, Ext. 284. Sarmad.Ali@lero.ie

Gráinne Murphy, Limerick, Ireland. Department of Computer Science and Information Systems, University of Limerick, Postgraduate Lab CSG-028a, Ext. 284. grainnem.murphy2@gmail.com

Jorge L. M. Amaral, Rua São Francisco Xavier, 524, Maracanã, 5o andar, Bloco E, Sala 5001-E, State University of Rio de Janeiro, Brazil. Zip code: 20559-900. jamaral@eng.uerj.br, jamaral@uerj.br

Douglas Mota Dias, Department of Computer Science and Information Systems, University of Limerick, Postgraduate Lab CSG-028a, Ext. 284. Douglas.Motadias@ul.ie

Enrique Naredo, Department of Computer Science and Information Systems, University of Limerick, Postgraduate Lab CSG-028a, Ext. 284. Enrique.Naredo@ul.ie

Conor Ryan, Limerick, Ireland. Department of Computer Science and Information Systems, University of Limerick, CS1-015. Conor.Ryan@ul.ie

-------------------------------------------------------------------------------

3. THE NAME OF THE CORRESPONDING AUTHOR (I.E., THE AUTHOR TO WHOM NOTICES WILL BE SENT CONCERNING THE COMPETITION);

The corresponding author is Aidan Murphy.
-------------------------------------------------------------------------------

4. THE ABSTRACT OF THE PAPER(S);

Paper-1: This paper shows empirically that Fuzzy Pattern Trees (FPTs) evolved using Grammatical Evolution (GE), a system we call FGE, meet the criteria to be considered a robust Explainable Artificial Intelligence (XAI) system. Experimental results show FGE achieves competitive results against state-of-the-art black-box methods on a set of real-world benchmark problems. Various selection methods were investigated to see which was best for finding smaller, more interpretable models, and a human expert was recruited to test the interpretability of the models found and to give a confidence score for each model. Models which were deemed interpretable but not trustworthy by the expert were outperformed in classification accuracy by interpretable models which were judged trustworthy, validating that FGE can be a powerful XAI technique.

Paper-2: This paper introduces a novel approach to induce Fuzzy Pattern Trees (FPTs) using Grammatical Evolution (GE), FGE, and applies it to a set of benchmark classification problems. While conventionally a set of FPTs is needed for a classifier, one for each class, FGE needs just a single tree. This is the case for both binary and multi-class classification problems. Experimental results show that FGE achieves results competitive with, and frequently better than, state-of-the-art FPT-related methods, such as FPTs evolved using Cartesian Genetic Programming (FCGP), on a set of benchmark problems. While FCGP produces smaller trees, FGE reaches better classification performance. FGE also benefits from a reduction in the number of user-selectable parameters. Furthermore, in order to tackle bloat, i.e. solutions growing too large, another version of FGE using parsimony pressure was tested.
The experimental results show that FGE with this addition is able to produce smaller trees than FCGP, frequently without compromising classification performance.

-------------------------------------------------------------------------------

5. A LIST CONTAINING ONE OR MORE OF THE EIGHT LETTERS (A, B, C, D, E, F, G, OR H) THAT CORRESPOND TO THE CRITERIA (SEE ABOVE) THAT THE AUTHOR CLAIMS THAT THE WORK SATISFIES;

Our result presented for consideration satisfies the following criteria:

(D) The result is publishable in its own right as a new scientific result independent of the fact that the result was mechanically created.

(G) The result solves a problem of indisputable difficulty in its field.

-------------------------------------------------------------------------------

6. A STATEMENT STATING WHY THE RESULT SATISFIES THE CRITERIA THAT THE CONTESTANT CLAIMS (SEE EXAMPLES OF STATEMENTS OF HUMAN-COMPETITIVENESS AS A GUIDE TO AID IN CONSTRUCTING THIS PART OF THE SUBMISSION);

(D) The result is publishable in its own right as a new scientific result independent of the fact that the result was mechanically created.

Our approach was able to incorporate human insight and improve the overall performance of classifiers by ~2% on test data by removing individuals with poor 'logic'. Human-in-the-loop machine learning has shown promise in many fields of AI. It has mostly been used to help with the curation of training datasets. A human may look at the output scores or the final results of a model and identify that a particular group, class or feature is not being handled correctly, or that some other overfitting has taken place; they can then adjust or augment the training data to mitigate this. All this human expertise, however, cannot be directly incorporated into the model itself (e.g. into the weights of a neural network or the boundaries of an SVM).
While some work has been done creating collaborative systems in other domains, to our knowledge there has been no such work in evolving GE/GP classifiers. Our system, FGE, was empirically shown to be able to improve its performance by querying an expert. If an individual was deemed to be 'illogical' or overfit by the expert, it was removed from the population. This means individuals are not simply rated on how well they generalise on unseen benchmark data, but also on how well their logic generalises, as judged by the expert. This pressure leaves the expert's fingerprints all over the population. It leverages the best of both approaches: the machine's ability to take in vast amounts of data and the expert's ability to abstract from it. This is only possible by creating a model which can communicate its results in a clear and expressive manner to an expert, which is what we have achieved.

(G) The result solves a problem of indisputable difficulty in its field.

Artificial Intelligence continues to invade our lives at every turn. Much of it uses some form of black-box paradigm. This means the logic and soundness of these models go unchecked, as there are two key stumbling blocks to checking them: the opacity of the final models and the limited knowledge of AI systems a typical domain expert has. In most domains, particularly critical systems, but also in more mundane tasks, it is not sufficient for AI to simply outperform humans on some benchmark test set. The model must satisfy its users that it will not fail catastrophically or have blind spots due to insufficient training data. To achieve this, the model and expert must work much like a team of humans do in the real world. This requires mutual collaboration, clear communication of ideas and, from the point of view of the expert, sense checking. Our approach, FGE, achieves this. The expert was able to first build trust in the system by picking only those models that used understandable logic.
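The expert-in-the-loop step described above, in which individuals whose logic the expert flags as 'illogical' are removed from the population before selection, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Individual` class, the `expert_filter` and `tournament_select` helpers, and the predicate standing in for the human judgement are all hypothetical names introduced for this sketch.

```python
import random

class Individual:
    """A candidate model with a fitness score (e.g. training accuracy)
    and a human-readable description of its logic, such as a rendering
    of a fuzzy pattern tree (hypothetical representation)."""
    def __init__(self, description, fitness):
        self.description = description
        self.fitness = fitness

def expert_filter(population, expert_rejects):
    """Remove individuals the expert judges 'illogical' or overfit.

    `expert_rejects` stands in for the human: it receives the readable
    description of a model and returns True if the expert would discard
    it. Survivors are rated on both benchmark fitness and, implicitly,
    on whether their logic passed the expert's sense check."""
    return [ind for ind in population if not expert_rejects(ind.description)]

def tournament_select(population, k=2):
    """Ordinary tournament selection over the expert-approved survivors."""
    return max(random.sample(population, k), key=lambda ind: ind.fitness)
```

In a generational loop, `expert_filter` would run before `tournament_select`, so only models whose logic the expert accepted can become parents; this is how the expert's judgement biases the population without the expert needing any EC knowledge.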
Making this decision required no ML/EC skills at all, as the data was presented using human-comprehensible terms ('High Heart Rate', 'Low Blood Pressure', etc.) in a graphical tree form, which was easy to follow. The process of determining the logic of a model took from several seconds to a couple of minutes, depending on the complexity of the model and the number of variables present, illustrating the ease with which the models can be interpreted. Next, the expert removed the models which, having understood their logic, they deemed to be incorrect. This is not to say that these models have no predictive or diagnostic power, which can be seen from their near-identical training performance, but that they either overstate, understate or ignore critical pieces of information which an expert uses to make the final determination. This would make the expert sceptical of their performance 'in the wild'. Despite having nearly identical training accuracy, the models deemed incorrect performed roughly 2% worse on the test data. All of this is accomplished without the user needing any knowledge of GE, grammars, or EC in general. This is in contrast to other such methods, such as adding customised functions or encapsulating subtrees, which would require some expertise in EC and, in our case, grammars.

-------------------------------------------------------------------------------

7. A FULL CITATION OF THE PAPER (THAT IS, AUTHOR NAMES; PUBLICATION DATE; NAME OF JOURNAL, CONFERENCE, TECHNICAL REPORT, THESIS, BOOK, OR BOOK CHAPTER; NAME OF EDITORS, IF APPLICABLE, OF THE JOURNAL OR EDITED BOOK; PUBLISHER NAME; PUBLISHER CITY; PAGE NUMBERS, IF APPLICABLE);

Paper-1: Aidan Murphy, Gráinne Murphy, Jorge L. M. Amaral, Douglas Mota Dias, Enrique Naredo, Conor Ryan: Towards Incorporating Human Knowledge in Fuzzy Pattern Tree Evolution.
In EuroGP 2021: Proceedings of the 24th European Conference on Genetic Programming, Virtual Event, 7-9 April 2021. Editors: Ting Hu, Nuno Lourenço, Eric Medvet. Series LNCS, volume 12691. Publisher: Springer, pages 66-81. ISBN-13: 978-3-030-72811-3, DOI: 10.1007/978-3-030-72812-0_5. Best Paper award.

Paper-2: Aidan Murphy, Muhammad Sarmad Ali, Jorge L. M. Amaral, Douglas Mota Dias, Enrique Naredo, Conor Ryan: Grammar-based Fuzzy Pattern Trees for Classification Problems. In Proceedings of the 12th International Joint Conference on Computational Intelligence - ECTA, pages 71-80. ISBN: 978-989-758-475-6, ISSN: 2184-2825, DOI: 10.5220/0010111900710080. Best Student Paper award.

-------------------------------------------------------------------------------

8. A STATEMENT EITHER THAT "ANY PRIZE MONEY, IF ANY, IS TO BE DIVIDED EQUALLY AMONG THE CO-AUTHORS" OR A SPECIFIC PERCENTAGE BREAKDOWN AS TO HOW THE PRIZE MONEY, IF ANY, IS TO BE DIVIDED AMONG THE CO-AUTHORS;

Any prize money, if any, is to be divided equally among the co-authors.

-------------------------------------------------------------------------------

9. A STATEMENT STATING WHY THE AUTHORS EXPECT THAT THEIR ENTRY WOULD BE THE "BEST",

While it has been a long-standing challenge for machines to outperform humans, for the most serious and critical of tasks this is not enough. Even as AI/ML algorithms and systems minimise errors thanks to the extremely large resources available to them, in the form of massively parallel supercomputers and millions of training examples, AI systems have repeatedly been shown to exhibit racist, sexist and other odious (and oftentimes illegal) behaviours. Human oversight and sense checking is the only solution to this problem. The development of AI which can express itself and communicate effectively with the expert user is therefore of the utmost importance. We have developed a system which can do exactly that.
It improves the performance of the classifiers found while at the same time building trust in the system. We feel this is the first step in a direction the whole EC community needs to consider, if not turn to completely, in future. An added benefit for EC is the potential to bias the search towards 'logical' parts of the search space. This would make such a system quite powerful on home computers/GPUs and would remove the need for access to expensive supercomputers, where vast search spaces may be investigated fruitlessly. Stuck in the shadow of Deep Learning systems for quite some time now, GP and other Evolutionary Algorithm methods can be elevated by FGE's built-in understandability. Our approach is general enough to be applied to virtually any classification problem, straight out of the box.

In summary, our system:
1) Can be understood by humans;
2) Builds trust in its predictions;
3) Can incorporate human feedback into the evolutionary search;
4) Can identify when a model may hit a blind spot and fail;
5) Allows humans to help guide the search with no expertise in EC;
6) Is an opportunity for GP and related methods to distinguish themselves from black-box methods that produce opaque results.

-------------------------------------------------------------------------------

10. AN INDICATION OF THE GENERAL TYPE OF GENETIC OR EVOLUTIONARY COMPUTATION USED, SUCH AS GA (GENETIC ALGORITHMS), GP (GENETIC PROGRAMMING), ES (EVOLUTION STRATEGIES), EP (EVOLUTIONARY PROGRAMMING), LCS (LEARNING CLASSIFIER SYSTEMS), GE (GRAMMATICAL EVOLUTION), GEP (GENE EXPRESSION PROGRAMMING), DE (DIFFERENTIAL EVOLUTION), ETC.

Grammatical Evolution was utilised, using the libGE library, but the same process can be used for GP.

-------------------------------------------------------------------------------

11. THE DATE OF PUBLICATION OF EACH PAPER.
IF THE DATE OF PUBLICATION IS NOT ON OR BEFORE THE DEADLINE FOR SUBMISSION, BUT INSTEAD, THE PAPER HAS BEEN UNCONDITIONALLY ACCEPTED FOR PUBLICATION AND IS IN PRESS BY THE DEADLINE FOR THIS COMPETITION, THE ENTRY MUST INCLUDE A COPY OF THE DOCUMENTATION ESTABLISHING THAT THE PAPER MEETS THE "IN PRESS" REQUIREMENT.

Paper-1: 25 March 2021
Paper-2: 04 November 2020