Modelling formulations using gene expression programming – A comparative analysis with artificial neural networks

doi:10.1016/j.ejps.2011.08.021

European Journal of Pharmaceutical Sciences

Volume 44, Issue 3, 9 October 2011, Pages 366-374

https://doi.org/10.1016/j.ejps.2011.08.021

Abstract

This study investigated the utility and potential advantages of gene expression programming (GEP), a new development in evolutionary computing for modelling data and automatically generating equations that describe the cause-and-effect relationships in a system, by applying it to four types of pharmaceutical formulation and comparing the resulting models with those generated by neural networks, a technique now widely used in formulation development. Both methods were capable of discovering subtle and non-linear relationships within the data, with no requirement for the user to specify the functional forms that should be used. Although the neural networks rapidly developed models with higher values for the ANOVA R², these were black boxes and provided little insight into the key relationships. GEP, although significantly slower at developing models, generated relatively simple equations describing the relationships that could be interpreted directly. The results indicate that GEP can be considered an effective and efficient modelling technique for formulation data.

Introduction

Formulation development is a complicated problem set in a design space that is multidimensional in nature and difficult to conceptualise. In order to reduce the development timescale within a framework of quality by design, research into advanced computational techniques has resulted in the use of expert and knowledge-based systems for the generation of initial formulations (Rowe and Roberts, 1998) and data mining for modelling formulation and process data (Ekins, 2006, Balakin, 2010). Statistical methods have been used for over thirty years in pharmaceutical formulation to derive equations expressing the relevant cause-and-effect relationships, and over the last fifteen years artificial neural networks (ANN) have been used increasingly in this area as a means of developing useful models from experimental data, allowing predictions to be made and formulations to be optimised (Achanta et al., 1995, Bourquin et al., 1997, Takayama et al., 2003, Colbourn and Rowe, 2005). Neural networks cope well with complex non-linear relationships, with the added advantage that the functional form between the dependent and independent variables need not be chosen a priori as with statistical techniques. However, the developed models are invariably ‘black boxes’ and difficult to interpret except indirectly by examining response surfaces.

In order to overcome this specific disadvantage, Do et al. (2008) proposed the use of a new technique: genetic programming (GP). This method, based on evolutionary computing (Koza, 1998), is also able to generate mathematical equations, and Do et al. (2008) showed that it could be applied to modelling drug dissolution from controlled release formulations. They showed that it not only provided equations relating the variables that could be interpreted directly, but also produced models with predictive power comparable to that of statistical methods. The results support work carried out by Gusel and Brezocnik (2006) on modelling the impact toughness of a copper alloy; in their case, although the GP models were more precise than the statistical models, the equations were very complex.

In this paper gene expression programming (GEP), an extension of GP recently proposed by Ferreira (2001) and claimed (Ferreira, 2006) to produce models more quickly, has been evaluated as a modelling technique for four different types of pharmaceutical formulations and the models compared with those generated using multi-layer perceptron (MLP) neural networks.

Section snippets

Gene expression programming

Because it is a new technique in the pharmaceutical formulation literature, a brief review of Gene Expression Programming is given here. GEP is a development of genetic programming (GP); both are part of the general family of Evolutionary Computing, a methodology in which one or more populations of individual members, each of which provides a possible fit to the data, are generated at random. The fitness of each individual is assessed by seeing how well it fits the training data, and the

Results and discussion

In all cases below, for the GEP modelling the population size was fixed at 1000 and 10 separate populations were considered. Other parameters used in the specific models are as discussed below.
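As a rough illustration of the first two steps outlined in the Gene expression programming section (random generation of individuals and assessment of how well each fits the training data), the following Python sketch builds several random populations and scores every member. It is a sketch under stated assumptions, not the authors' implementation: the expression-tree encoding, the function set, the root-mean-square error fitness measure, the toy data and all function names are hypothetical, and the selection and genetic operators that drive evolution are omitted. The defaults of 10 populations of 1000 individuals mirror the settings quoted above.

```python
import math
import operator
import random

# Assumption: a tiny function set and an expression-tree encoding chosen for
# illustration only; GEP itself encodes individuals differently.
FUNCTIONS = [operator.add, operator.sub, operator.mul]


def random_tree(rng, n_inputs, depth=3):
    """Build a random expression tree over the inputs x[0..n_inputs-1]."""
    if depth == 0 or rng.random() < 0.3:
        # Terminal node: an input variable or a random constant.
        if rng.random() < 0.7:
            return ("x", rng.randrange(n_inputs))
        return ("const", rng.uniform(-1.0, 1.0))
    func = rng.choice(FUNCTIONS)
    return (func,
            random_tree(rng, n_inputs, depth - 1),
            random_tree(rng, n_inputs, depth - 1))


def evaluate(tree, x):
    """Evaluate an expression tree for one row of input values."""
    if tree[0] == "x":
        return x[tree[1]]
    if tree[0] == "const":
        return tree[1]
    func, left, right = tree
    return func(evaluate(left, x), evaluate(right, x))


def fitness(tree, rows, targets):
    """Root-mean-square error on the training data (lower is better)."""
    squared = [(evaluate(tree, x) - y) ** 2 for x, y in zip(rows, targets)]
    return math.sqrt(sum(squared) / len(squared))


def best_of_random_populations(rows, targets, n_populations=10,
                               population_size=1000, seed=0):
    """Generate random populations and return (score, tree) of the best fit."""
    rng = random.Random(seed)
    n_inputs = len(rows[0])
    best_score, best_tree = math.inf, None
    for _ in range(n_populations):
        population = [random_tree(rng, n_inputs)
                      for _ in range(population_size)]
        for tree in population:
            score = fitness(tree, rows, targets)
            if score < best_score:
                best_score, best_tree = score, tree
    return best_score, best_tree


# Toy training data (hypothetical): the response is the sum of two inputs.
toy_rows = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)]
toy_targets = [3.0, 5.0, 8.0]
toy_score, toy_tree = best_of_random_populations(toy_rows, toy_targets)
print("best RMSE among random individuals:", toy_score)
```

A full evolutionary run would repeat this scoring over many generations, creating each new population from the fitter members of the previous one rather than drawing every individual at random.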

Conclusions

For all the data sets examined here, a careful experimental design had been used in developing the strategy for data collection. One of the strengths of GEP compared to neural networks is that it has the ability to exclude irrelevant inputs, and this is evident from the immediate release tablet example, where two input variables summed to a constant value. In this case, for many of the models, GEP selected one of the inputs and omitted the other.
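The reason one of two inputs that sum to a constant can be dropped without loss is that any equation written in terms of one input can be rewritten in terms of the other. The short Python sketch below illustrates this; the input names, the total of 100, the coefficients and the formulation values are all hypothetical and are not taken from the immediate release tablet data.

```python
import math

TOTAL = 100.0  # Assumption: the two inputs always sum to this constant.


def model_using_a(a):
    """Hypothetical fitted equation that uses only input a."""
    return 2.0 * a + 5.0


def same_model_using_b(b, total=TOTAL):
    """The same equation rewritten in b alone, substituting a = total - b."""
    return 2.0 * (total - b) + 5.0


# Hypothetical formulations as (a, b) pairs with a + b == TOTAL.
formulations = [(10.0, 90.0), (25.0, 75.0), (60.0, 40.0)]

predictions_a = [model_using_a(a) for a, _ in formulations]
predictions_b = [same_model_using_b(b) for _, b in formulations]

# Both forms give the same predictions, so the second input adds no information.
models_agree = all(math.isclose(p, q)
                   for p, q in zip(predictions_a, predictions_b))
print("models agree:", models_agree)
```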

GEP was capable of developing good models for all

References (22)

A. Baykasoglu et al.
Prediction of cement strength using soft computing techniques
Cem. Conc. Res.
(2004)
A. Bodea et al.
Optimization of hydrophilic matrix tablets using a D-optimal design
Int. J. Pharm.
(1997)
M. Brezocnik et al.
Prediction of surface roughness with genetic programming
J. Mat. Proc. Technol.
(2004)
D.Q. Do et al.
Modelling drug dissolution from controlled release products using genetic programming
Int. J. Pharm.
(2008)
M.C. Gohel et al.
Formulation optimization of controlled release diclofenac sodium microspheres using factorial design
J. Controlled Release
(1998)
L. Gusel et al.
Modeling of impact toughness of cold formed material by genetic programming
Comp. Mat. Sci.
(2006)
J. Rissanen
Modeling by shortest data description
Automatica
(1978)
K. Takayama et al.
Neural network based optimization of drug formulations
Adv. Drug Delivery Rev.
(2003)
A.S. Achanta et al.
Artificial neural networks: Implications for pharmaceutical sciences
Drug Dev. Ind. Pharm.
(1995)
Balakin, K.V. 2010 Pharmaceutical Data Mining, John Wiley & Sons Inc., New...

J. Bourquin et al.
Basic concepts of artificial neural networks (ANN) modelling in the application to pharmaceutical development
Pharm. Dev. Technol.
(1997)

Cited by (30)

Applications of Machine Learning in Solid Oral Dosage Form Development
2021, Journal of Pharmaceutical Sciences
This review comprehensively summarizes the application of machine learning in solid oral dosage form development over the past three decades. In both academia and industry, machine learning is increasingly applied for multiple preformulation/formulation and process development studies. Further, this review provides the authors’ perspectives on how pharmaceutical scientists can use machine learning for right projects and in right ways; some key ingredients include (1) the determination of inputs, outputs, and objectives; (2) the generation of a database containing high-quality data; (3) the development of machine learning models based on dataset training and model optimization; (4) the application of trained models in making predictions for new samples. It is expected by the authors and others that machine learning will promisingly play a more important role in tomorrow's projects for solid oral dosage form development.
Release modeling of nanoencapsulated food ingredients by artificial intelligence algorithms
2020, Release and Bioavailability of Nanoencapsulated Food Ingredients
This chapter introduces the concept of artificial intelligence algorithms and the potential for their application in the field of biomedicine and nutrition, especially in the context of nanoencapsulated food ingredients. Amongst many artificial intelligence algorithms, several are selected and their potential for release modeling is described in more detail. These are namely artificial neural networks, adaptive neuro-fuzzy inference systems, and genetic algorithms. Apart from the theoretical background on their operating principles, numerous examples are selected and briefly discussed in order to present to reader different aspects for their potential application. In spite of usually being treated as the “black-box” modeling tools, these algorithms actually do provide the opportunities to analyze the causal relationships between the data or perform sensitivity analysis, which is of great importance and value for the release modeling.
Artificial Intelligence Tools for Scaling Up of High Shear Wet Granulation Process
2017, Journal of Pharmaceutical Sciences
Citation Excerpt:
Using Equation 1, impeller power values were predicted for each condition of the mixer operation and wet granule properties measured. Two commercial software packages FormRules® v4.03 and INForm® v5.01 (Intelligensys Ltd., North Yorkshire, UK) which implement neurofuzzy logic and GEP technologies, respectively, were used in this study.11,16 The FormRules model was obtained using results from the PMA 25L, 100L, and 600 L experiments (41 records).
The results presented in this article demonstrate the potential of artificial intelligence tools for predicting the endpoint of the granulation process in high-speed mixer granulators of different scales from 25L to 600L. The combination of neurofuzzy logic and gene expression programing technologies allowed the modeling of the impeller power as a function of operation conditions and wet granule properties, establishing the critical variables that affect the response and obtaining a unique experimental polynomial equation (transparent model) of high predictability (R² > 86.78%) for all size equipment. Gene expression programing allowed the modeling of the granulation process for granulators of similar and dissimilar geometries and can be improved by implementing additional characteristics of the process, as composition variables or operation parameters (e.g., batch size, chopper speed). The principles and the methodology proposed here can be applied to understand and control manufacturing process, using any other granulation equipment, including continuous granulation processes.
Qualitative and quantitative methods to determine miscibility in amorphous drug-polymer systems
2015, European Journal of Pharmaceutical Sciences
Citation Excerpt:
The study also demonstrated that the solubility parameter is limited in predicting miscibility of molten system, since the properties such as the viscosity of the polymers might change significantly during thermal events (Liu et al., 2013). In the pharmaceutical drug development process, data mining have been employed for various purposes including, the understanding of the structure–activity relationships, the prediction of absorption, distribution, metabolism and elimination of drugs, and the prediction of the changes in the solid-state properties of pharmaceutical compounds (Butina et al., 2002; Colbourn et al., 2011; Mahlin et al., 2011; Mendyk et al., 2008). Recently computational data mining have been developed as a theoretical approach to evaluate the drug–excipient miscibility.
Amorphous drug–polymer systems or amorphous solid dispersions are commonly used in pharmaceutical industry to enhance the solubility of compounds with poor aqueous solubility. The degree of miscibility between drug and polymer is important both for solubility enhancement as well as for the formation of a physically stable amorphous system. Calculation of solubility parameters, Computational data mining, T_g measurements by DSC and Raman mapping are established traditional methods used to qualitatively detect the drug–polymer miscibility. Calculation of Flory–Huggins interaction parameter, computational analysis of X-Ray Diffraction (XRD) data, solid state Nuclear Magnetic Resonance (NMR) spectroscopy and Atomic Forced Microscopy (AFM) have been recently developed to quantitatively determine the miscibility in amorphous drug–polymer systems. This brief review introduces and compiles these qualitative and quantitative methods employed in the evaluation of drug–polymer miscibility. Combination of these techniques can provide deeper insights into the true miscibility of the drug–polymer systems.
Application of artificial neural networks (ANNs) and genetic programming (GP) for prediction of drug release from solid lipid matrices
2012, International Journal of Pharmaceutics
The aim of the present study was to develop a semi-empirical mathematical model, which is able to predict the release profiles of solid lipid extrudates of different dimensions. The development of the model was based on the application of ANNs and GP. ANNs' abilities to deal with multidimensional data were exploited. GP programming was used to determine the constants of the model function, a modified Weibull equation. Differently dimensioned extrudates consisting of diprophylline, tristearin and polyethylene glycol were produced by the use of a twin-screw extruder and their dissolution behaviour was studied. Experimentally obtained dissolution curves were compared to the calculated release profiles, derived from the semi-empirical mathematical model.
Establishing and analyzing the design space in the development of direct compression formulations by gene expression programming
2012, International Journal of Pharmaceutics
Citation Excerpt:
However, it has been applied successfully in solving some problems within the engineering and food industry fields in the development of new and better materials (Eskil and Kanca, 2008), the prediction of material properties (Antoniou et al., 2010) and the improvement of food processing (Kahyaoglu, 2008). Recently it has been applied to modeling pharmaceutical formulations (Colbourn et al., 2011) where the GEP approach has been compared to neural networks. Using a desktop computer, researchers can handle by GEP, a large number of variables (inputs and outputs) simultaneously.
In this paper we have evaluated the gene expression programming (GEP) methodology for modeling the effect of different variables (continuous and nominal) and their interactions on the properties of direct compression formulations.
The effect of four variables was studied; variety of diluents, type and percentage of drug and maximum compression force, on the mechanical and drug release properties of direct compression tablets. The generated database (36 formulations) was used for mathematical and GEP modeling.
GEP has been shown to have a high accuracy in prediction for four out five outputs studied including friability which had no replicate measurements. Compared to the traditional statistical treatment GEP is less time consuming and gives equations which are extremely helpful in understanding the interactions of the different variables and for establishing the design space in the development of direct compression formulations.
GEP allows similar conclusions than traditional statistical treatment. The helpfulness of this methodology in establishing the design space has been demonstrated. The knowledge derived from GEP can easily be increased by including additional information or new inputs, such as additional drugs or combinations of excipients in the data set.
