Statistical Analysis of Evolutionary Algorithms for the Variable Selection Problem

  • Victor Adrian Jimenez, Grupo de Investigación en Tecnologías Informáticas Avanzadas, Universidad Tecnológica Nacional - Facultad Regional Tucumán
  • Diego Fernando Lizondo, Grupo de Investigación en Tecnologías Informáticas Avanzadas, Universidad Tecnológica Nacional - Facultad Regional Tucumán
  • Adrián Will, Grupo de Investigación en Tecnologías Informáticas Avanzadas, Universidad Tecnológica Nacional - Facultad Regional Tucumán
Keywords: evolutionary computation, variable selection, genetic algorithm, linear regression, statistical analysis

Abstract

Decades of research on optimization problems have produced a considerable number of algorithms, both deterministic and heuristic. Because of this wide range of options, determining which algorithm is most appropriate for a specific problem is a complex task. This paper proposes a comparison of heuristic optimization algorithms based on statistical tests. Simulated Annealing (SA), the Simple Genetic Algorithm (sGA), the Compact Genetic Algorithm (cGA) and Deterministic Crowding (DC) were applied to the problem of Variable Selection for estimation using Linear Regression. Three test cases were used: solar radiation in the province of Tucumán (Argentina), power consumption estimation in the same area, and estimation of the recurrence of cancer cells. We conclude that there is sufficient statistical evidence to affirm that the algorithms yield significantly different results. We also conclude that sGA and DC were the most suitable algorithms: they obtained similar fitness values, with sGA slightly better.
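
As an illustration of how several algorithms can be compared statistically, the following minimal sketch computes the Friedman rank statistic over repeated runs. The abstract does not name the tests that were used, so the choice of the Friedman test is an assumption here, as are the function names `average_ranks` and `friedman_statistic`. The error values are made up for the example and are not results from the paper.

```python
from typing import Dict, List, Tuple


def average_ranks(values: List[float]) -> List[float]:
    """Rank values in ascending order (1 = smallest); ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(results: Dict[str, List[float]]) -> Tuple[float, Dict[str, float]]:
    """Return the Friedman chi-square statistic (no tie correction) and mean ranks.

    results maps an algorithm name to its error in each block (run or test case);
    lower error is better, and all lists must have the same length.
    """
    names = list(results)
    k = len(names)                 # number of algorithms
    n = len(results[names[0]])     # number of blocks
    rank_sums = [0.0] * k
    for block in range(n):
        row = [results[name][block] for name in names]
        for a, rank in enumerate(average_ranks(row)):
            rank_sums[a] += rank
    mean_ranks = {name: rank_sums[a] / n for a, name in enumerate(names)}
    sum_sq = sum(r * r for r in mean_ranks.values())
    chi2 = 12 * n / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)
    return chi2, mean_ranks


# Hypothetical estimation errors over five runs (illustrative values only,
# NOT the paper's results).
example = {
    "sGA": [0.10, 0.11, 0.09, 0.10, 0.12],
    "DC":  [0.12, 0.13, 0.11, 0.12, 0.14],
    "cGA": [0.15, 0.16, 0.14, 0.15, 0.17],
    "SA":  [0.20, 0.21, 0.19, 0.20, 0.22],
}
chi2, mean_ranks = friedman_statistic(example)
print(chi2, mean_ranks)
```

With four algorithms the statistic is compared against a chi-square distribution with three degrees of freedom, whose 0.05 critical value is about 7.815. A larger value suggests that at least one algorithm behaves differently, after which pairwise post-hoc tests would normally follow.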

Published
2020-05-10
Section
Articles