In data analysis, pattern mining aims to discover interesting patterns that represent intrinsic and important properties of datasets. Since the late 1990s, evolutionary algorithms, as an optimisation and machine learning paradigm, have become increasingly important for data mining tasks that seek hidden patterns in large datasets, and they have achieved good results. Pattern Mining with Evolutionary Algorithms provides an overview of methods that use evolutionary algorithms to discover interesting patterns. The book is very useful and could attract more people to research and applications in pattern mining using evolutionary algorithms.

Pattern Mining with Evolutionary Algorithms is only 190 pages long, but it is self-contained and does not require its readers to have much background knowledge of either evolutionary algorithms or pattern mining. It covers fundamental concepts, contains detailed algorithm descriptions, and gives excellent examples and motivating accounts of successful applications. It focuses on genetic algorithms (GAs) and genetic programming (GP) for mining patterns rather than on all evolutionary algorithms. Nevertheless, it is a very good textbook for advanced undergraduate and postgraduate courses. It is also a great reference book for evolutionary computation researchers who want to work in pattern mining, and for data mining practitioners who want to use evolutionary algorithm tools to discover valuable hidden patterns for their tasks.

There are nine chapters, which I have divided into three parts. Chapters 1–3 introduce the basic concepts and background, Chapters 4–6 review recent work on evolutionary algorithms for general pattern mining tasks, and Chapters 7–9 discuss more advanced topics in pattern mining.

The first chapter introduces pattern mining and gives formal definitions of different types of patterns. It provides a taxonomy of patterns and introduces important concepts that are used in the following chapters. Since pattern mining is computationally expensive on large datasets, Chapter 1 also describes strategies that speed up the mining process by using constraints to prune the search space, as well as efficient traditional (i.e. non-evolutionary) pattern mining methods from the 1990s. Association rules, which provide a way of describing correlations among sets of items, are also introduced. The first chapter is very easy to follow. It uses one simple example dataset to explain the different concepts and uses many pictures to make complex terms easy to understand.
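
To make the idea of an association rule concrete, here is a minimal sketch of the two textbook measures usually attached to a rule, support and confidence. The toy transactions and the function names are my own assumptions for illustration and are not taken from the book.

```python
# Hypothetical toy dataset: each transaction is a set of purchased items.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk"}),
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the rule never fires, so report zero rather than divide by zero
    joint = support(frozenset(antecedent) | frozenset(consequent), transactions)
    return joint / base
```

On this toy data, the itemset {bread, milk} has support 0.5, and the rule bread -> milk has confidence of about 0.67, since two of the three transactions containing bread also contain milk.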

Chapter 2 describes the most widely used quality measures or metrics in pattern mining. It divides them into two groups: objective metrics and subjective metrics. By subjective metrics it means metrics that also use external knowledge from users in addition to the data properties. Chapter 2 focuses on various objective quality measures and their essential properties. The measures are described clearly and summarised very well by tables and figures. This chapter provides a good foundation for understanding fitness functions.

Chapter 3 provides an introduction to evolutionary computation, focusing mainly on the two most widely used approaches, i.e. GAs and GP. The key parts of GAs and GP are described in detail, such as the general procedures, solution representations, genetic operators, and fitness functions. The text is easy to follow and provides sufficient information about GAs and GP for the novice to understand the later chapters.

Chapters 4, 5, and 6 describe the use of GAs, GP, and evolutionary multi-objective algorithms, respectively, for pattern mining. Each chapter covers the general issues, algorithm descriptions, and examples of successful applications. Chapter 4 first discusses the three key aspects of GAs for pattern mining: encoding, genetic operators, and fitness functions. Several GA-based pattern mining methods are then described in detail. Chapter 4 is easy to understand, especially for those who have used GAs for other applications. It should be read before Chapter 5 because it provides important concepts and the foundation for the GP methods described there.
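
The three aspects named above (encoding, genetic operators, and fitness function) can be sketched in a few lines. This is a generic illustration, not one of the book's algorithms: the binary encoding, the size-weighted support used as fitness, the tournament selection, and all names and parameter values are assumptions chosen to keep the example small.

```python
import random

# Hypothetical toy data for illustration only.
ITEMS = ["bread", "butter", "milk", "eggs"]
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]


def decode(chromosome):
    """Encoding: bit i set to 1 means ITEMS[i] belongs to the itemset."""
    return {item for item, bit in zip(ITEMS, chromosome) if bit}


def fitness(chromosome):
    """Assumed fitness: support multiplied by itemset size; empty itemsets score 0."""
    itemset = decode(chromosome)
    if not itemset:
        return 0.0
    support = sum(itemset <= t for t in TRANSACTIONS) / len(TRANSACTIONS)
    return support * len(itemset)


def crossover(a, b, rng):
    """One-point crossover: head of parent a joined to tail of parent b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chromosome, rng, rate=0.1):
    """Bit-flip mutation applied independently to each gene."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


def tournament(population, rng, k=2):
    """Pick the fitter of k randomly sampled individuals."""
    return max(rng.sample(population, k), key=fitness)


def evolve(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in ITEMS] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        population = [
            mutate(crossover(tournament(population, rng),
                             tournament(population, rng), rng), rng)
            for _ in range(pop_size)
        ]
        # Keep the best individual seen so far (simple elitism outside the population).
        best = max(population + [best], key=fitness)
    return decode(best)
```

Calling `evolve()` returns the best itemset found. Weighting support by size is only one way to stop the search collapsing onto single items; a real fitness function would be built from the quality measures of Chapter 2.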

Chapter 5 discusses the representations used with GP at length, since a variable-length representation is a key feature of GP compared to GAs and other evolutionary methods. It concentrates mainly on standard tree-based GP and on grammar-based GP. The grammar-based representation is particularly interesting in pattern mining since it allows GP to encode a solution based on both the problem itself and external knowledge from users. The chapter also describes grammar-guided GP for mining several different types of patterns. Chapter 5 contains considerably more complex ideas than Chapter 4, but it is well written and easy to understand.
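
As a rough illustration of how a grammar can constrain what GP is allowed to generate, the sketch below derives a random rule from a small context-free grammar. The grammar, the attribute names, and the depth limit are all hypothetical and are not taken from the book; a user would express domain knowledge by editing the productions.

```python
import random

# Hypothetical grammar: keys are non-terminals, values are lists of productions.
# The first production of every non-terminal is non-recursive.
GRAMMAR = {
    "<rule>": [["<conditions>", " -> ", "<condition>"]],
    "<conditions>": [["<condition>"], ["<condition>", " AND ", "<conditions>"]],
    "<condition>": [["<attribute>", " = ", "<value>"]],
    "<attribute>": [["humidity"], ["wind"]],
    "<value>": [["high"], ["low"]],
}


def derive(symbol, rng, depth=0, max_depth=4):
    """Expand `symbol` into a string of terminals by random derivation."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, force the non-recursive production so expansion ends.
        productions = productions[:1]
    production = rng.choice(productions)
    return "".join(derive(s, rng, depth + 1, max_depth) for s in production)
```

Every string produced by `derive("<rule>", random.Random(0))` has the shape "conditions -> condition", so individuals that violate the grammar never enter the population.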

Chapter 6 starts with the fundamental concepts of evolutionary multi-objective optimisation, such as the Pareto optimal front and the metrics used to evaluate its quality. Next, it discusses how the quality measures from Chapter 2 form multiple, potentially conflicting objectives in pattern mining, such as minimising the number of attributes used in a pattern while maximising a strength measure. A strength measure could be how often the pattern occurs or the dependence between the event and the results in a pattern. Two GA-based and two GP-based multi-objective approaches are also presented. Chapter 6 does not involve complex ideas and is easy to understand, even for readers without a background in multi-objective optimisation.
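
The trade-off described above can be shown with a small Pareto-dominance check over two objectives: fewer attributes (minimised) and a higher strength value (maximised). The candidate patterns and their scores are invented for illustration, and `strength` stands in for whichever measure is chosen.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    n_attributes: int  # objective 1: to be minimised
    strength: float    # objective 2: to be maximised


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on at least one."""
    no_worse = a.n_attributes <= b.n_attributes and a.strength >= b.strength
    better = a.n_attributes < b.n_attributes or a.strength > b.strength
    return no_worse and better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical candidate patterns.
CANDIDATES = [
    Candidate("p1", 1, 0.40),
    Candidate("p2", 2, 0.55),
    Candidate("p3", 2, 0.30),  # dominated by p1 and p2
    Candidate("p4", 3, 0.70),
    Candidate("p5", 4, 0.65),  # dominated by p4
]
```

Here `pareto_front(CANDIDATES)` keeps p1, p2, and p4: each of them is either simpler or stronger than every other remaining candidate, so none can be discarded without giving something up.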

Chapters 7, 8, and 9 are relatively short, and each focuses on an advanced topic in pattern mining. They require a deeper understanding of pattern mining than the other chapters. However, since the problems are well defined and the algorithms are described clearly and in detail, they are not hard to read. Chapter 7 focuses on supervised local pattern mining (that is, mining patterns from labelled data). It describes in detail how to identify interesting groups of patterns and discusses quality measures such as complexity and generality. It also introduces the use of deterministic algorithms and evolutionary algorithms, mainly GAs and GP, in supervised local pattern mining.

Chapter 8 is about exceptional relationship mining, whilst Chapter 9 focuses on how to scale pattern mining to very large datasets. Chapter 9 discusses how evolutionary algorithms help to address the scalability issue, and then describes different parallel algorithms and data representations as ways to improve the efficiency of the mining process. The chapter would be even better if it included more discussion of other major issues, challenges, and future trends.

In conclusion, I enjoyed reading Pattern Mining with Evolutionary Algorithms. I found it easy to read, well written, and well structured, and I believe readers will learn a great deal from it. I therefore strongly recommend this valuable book.