Towards Improved Experimental Evaluation and Comparison of Evolutionary Algorithms


Bo Yuan

PhD Thesis, The University of Queensland, Australia, 2006



Despite the continuous advancement of Evolutionary Algorithms (EAs) and their numerous successful applications to a variety of optimization problems over the past decades, a fundamental issue has received relatively little attention: the development of methodologies and tools for the evaluation and comparison of EAs. Due to their massive parallel and inherent stochastic behaviour, theoretical analysis of EAs has proven to be highly challenging. While progress has been made, theoretical analysis remains largely restricted to scenarios where significant simplifying assumptions must be made with respect to the algorithms and/or the problems in question.  Consequently, EAs have mostly been evaluated on an empirical basis with a number of widely adopted benchmark problems.

There is some conventional practice of experimental evaluation and comparison of EAs in the literature. However, the methodologies and techniques employed often produce results with low scientific value and lead to conclusions that are not sufficiently supported by those results. It is therefore unclear whether such results and conclusions have any consequences for future research or general application of the algorithms applied. Two specific issues are particularly relevant to this thesis.  Firstly, it has been recognized that many of the commonly used benchmark problems in the EA community have some unfavourable features for testing the strengths and weaknesses of EAs. In the meantime, the structure of these problems is usually fixed and specified in isolation, making it difficult to compile sets of experimental results where common properties of problems vary in a controlled way, in order to generalize the performance of EAs to other unknown problems. Secondly, each EA is specified by a set of parameters whose values may have a significant impact on its performance. In the literature, parameter values are often selected with little justification or based on some hand-tuning towards a small group of benchmark problems, without conducting proper experimental exploration of the parameter space. The major issue is that the corresponding experimental results are unlikely to be plausible as they are highly restricted to the specific parameter setting and test problems in use.

This thesis is dedicated to the methodologies and techniques for conducting rigorous and principled empirical evaluation of EAs, with particular focus on the issues of test optimization problems and experiments over parameter settings of EAs. Most of the experiments in this thesis are conducted using EAs known as Estimation of Distribution Algorithms (EDAs), which are based on explicit statistical modelling of the problem space. Although the attention is focused on continuous EDAs and optimization problems, the major contributions of this thesis are also equally applicable to other domains. 

Firstly, the continuous EDAs used in this thesis are examined to identify their key features. A number of modifications to the standard algorithms are proposed, which may provide improved performance. Note that the major focus here is not to develop new EDAs and argue their general performance over existing algorithms. Instead, the intention is to demonstrate how carefully designed experiments can be used to effectively investigate the properties of these algorithms. Furthermore, a mathematical modelling technique is proposed, which can precisely estimate the detailed dynamics of a simple EDA during evolution.

Secondly, a continuous landscape (test problem) generator based on multivariate Gaussian components (LG-MVG) is proposed, which is capable of generating a large number of random problem instances with predefined structure in accordance with the specific objective of experimental studies. The major advantage of LG-MVG is that it can generate a variety of problems in terms of landscape structure and is parameterized by a flexible set of parameters, which have an intuitive impact on the structure of the problems generated. Experimental studies are conducted to demonstrate the use of LG-MVG in the empirical evaluation of EDAs and reveal some of their important properties that would otherwise be difficult to observe.

Thirdly, a statistical Racing technique is improved and applied in the evaluation of EAs, to significantly reduce the computational overhead incurred by large-scale experiments involving multiple algorithm instances with different parameter settings and/or multiple test problems. In the cases studies, Racing can often reduce the cost of exhaustive experimentation by around 90% while maintaining the reliability of results. Furthermore, in order to overcome the lack of exploration mechanism in Racing, a hybrid technique combining Racing and Meta-EAs is proposed and is applied to the parameter tuning task of EAs. Experimental results show that this technique can effectively take advantage of both the efficiency and the encoding-free feature of Racing as well as the exploration ability of Meta-EAs.

Finally, a challenging real-world engineering design problem is used to provide some empirical evidence on the effectiveness of different EDAs. More importantly, instead of being regarded as a black-box problem, the structure of this problem is characterized by a statistical technique, which can reveal some important properties of the problem at a small fraction of the cost of an extensive exploration. Statistical analysis shows that this design problem is highly multimodal and is unlikely to be efficiently solved by local searching methods. The importance of capturing dependences among variables in practical problems is also highlighted.

Advisors: Dr. Marcus Gallagher     Professor Tom Downs

Download: PDF (3.37 MB, 190 Pages)