Are genetic algorithms for neural networks coming back?
Wanna become a data scientist within 3 months, and get a guaranteed job? Then you need to check this out !
Genetic algorithms: A bit of history
Genetic algorithms used to be a very popular branch of algorithms during the 90s early 00s. Genetic algorithms used to be one of the cornerstones of what is called computational intelligence. Computational intelligence is a subfield of artificial intelligence based mainly on the idea of creating intelligence by mimicking natural processes. For example, swarms display a certain kind of intelligence as they self-organise in response to hazards. Similarly, evolution is an optimisation algorithm running in nature, that has come up with ingenious ways to preserve life. Our brain is also a biological machine that displays intelligence, and we could mimic its neural processes in order to try and recreate intelligence in silico.
For those of you who don’t know what a genetic algorithm is, it is basically an optimisation method that mimics evolution. You have a set of candidate solutions (called a population). The best solutions get combined with each other (mating) and also some of those get randomly mutated. This lets the algorithm both use good solutions, but also explore new options. Genetic algorithms became famous through the work of Holland, who, in 1975, wrote up the book Adaptation in Natural and Artificial Systems. Other evolutionary inspired algorithms showed up through the years, such as evolution strategies and particle swarm optimisation. The video below by MATLAB explains how genetic algorithms work, for any of you who are curious.
However, genetic algorithms at some point fell out of fashion, as did computational intelligence. One major reason behind that was the lack of a robust theory behind them. While the evolutionary analogy seems attractive in the beginning, it started seeming weak compared to other developments in machine learning. Probability theory was employed to offer theoretical explanations as to why many machine learning models work (read Murphy’s book for an excellent book on the topic). Support vector machines offered an alternative to neural networks, based on learning theory.
Genetic algorithms and neural networks
Genetic algorithms used to be a popular method for training neural networks. For example, one of the early papers in that area “Training Feedforward Neural Networks Using Genetic Algorithms” published in 1989, has been cited more than 1200 times. However, after falling out of fashion, when deep learning took off, genetic algorithms were not really mentioned. Gradient descent methods were favoured instead, not only because they worked well, but because it was also possible to explain why they worked. I remember when I was an MSc student at the University of Edinburgh, someone asked the lecturer whether using genetic algorithms to train a neural network would be a good idea. The lecturer responded that no-one really did that any more, as the whole idea behind genetic algorithms was based on a loose analogy, and it was preferable to run a mathematically well grounded method (gradient descent).
Therefore, I was happily surprised (as a fan of evolutionary computation), when I was recently looking into a report by the Uber AI labs with the title “Welcoming the Era of Deep Neuroevolution“, explicitly discussing genetic algorithms as an interesting alternative to gradient descent algorithms for deep neural network training in the context of reinforcement learning. Specifically, UberAI reports that a simple genetic algorithm could provide better performance than deep q-learning, while also being very fast due to parallelisation. Some of their innovations include a new way to mutate weights that takes gradient information into account, and a study on the relationship between the gradients produced by an evolution strategy and gradient descent. They have also released an open source project that makes the execution of genetic algorithms for neuroevolution so efficient, that it becomes possible to run an experiment on a laptop within a few hours (whereas it would normally take up days). The video below by UberAI shows how an evolution strategy algorithm can surpass the performance of gradient descent in the case of a narrow gap of low fitness.
All this started from research by OpenAI. In that blog post Karpathy et al. write
Our finding continues the modern trend of achieving strong results with decades-old ideas. For example, in 2012, the “AlexNet” paper showed how to design, scale and train convolutional neural networks (CNNs) to achieve extremely strong results on image recognition tasks, at a time when most researchers thought that CNNs were not a promising approach to computer vision. Similarly, in 2013, the Deep Q-Learning paper showed how to combine Q-Learning with CNNs to successfully solve Atari games, reinvigorating RL as a research field with exciting experimental (rather than theoretical) results. Likewise, our work demonstrates that ES achieves strong performance on RL benchmarks, dispelling the common belief that ES methods are impossible to apply to high dimensional problems.
genetic algorithms and artificial intelligence
Something I find quite interesting indeed, is that ideas in artificial intelligence some times seem to be running in circles. A simple idea turns into huge hype, then disappears, and the resurfaces. I think that in computer science there is a tendency for things to get overhyped, then forgotten when the hype does not live up to promise, and then resurface when the conditions are right (more data, more computational power or new use cases). In the case of genetic algorithms it was a combination of increased computational power (which allows greater parallelisation), and the re-emergence of reinforcement learning.
For example, fuzzy logic is another useful invention, that has been successfully used in control systems, and is not much talked about anymore in academia. However, something I’ve learned over the years, is that when something falls out of fashion, but is still implemented in software such as MATLAB, it means that there might be nothing else to research any more, specifically because the technology has matured. Even when neural networks had fallen out of fashion, they were still a very good tool for regression and classification. So, I guess one takeaway from all this is to never discredit technologies, simply because it is not talked about much anymore. Sometimes this can be an indication that there is nothing else to research, rather than the technology is not useful any more.
Regarding genetic algorithms, I am fairly excited about these new developments, as I really believe that they are a very powerful concept. Also, the recent advances in cloud computing and computational power can allow the execution of massively parallel jobs, making them incredibly efficient. We might also start seeing a resurgence of genetic algorithms niches again such as genetic programming and generative art. My own project around data science automation, ADAN.io, is based partly on genetic algorithms. So, stay tuned to that space as many more developments are likely to come soon!
If you are interested to hear more about optimisation and predictive analytics make sure to check my videos below: