A shorter summary (written by me) can be found here.
State-of-the-art Encoding

We propose a novel method to store large parameter vectors compactly by representing each parameter vector as an initialization seed plus the list of random seeds that produce the series of mutations applied to theta.
This innovation was essential to enabling GAs to work at the scale of deep neural networks, and we thus call it a Deep GA.
Deep Neural Networks (DNNs) are typically optimised using gradient-based methods such as back-propagation.
This paper shows that a simple Genetic Algorithm (GA) can optimise DNNs, and that GAs are a competitive alternative to gradient-based methods on Reinforcement Learning (RL) tasks such as learning to play Atari games, simulated humanoid locomotion, and deceptive maze problems.
Additionally, by representing the DNN parameters as a series of random seeds, a network with over 4 million parameters can be encoded in just a few thousand bytes, roughly 10,000-fold smaller.
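To make the seed-chain encoding concrete, here is a minimal sketch of how a parameter vector could be reconstructed by replaying its seeds. The function name, the Gaussian stand-in for the network initializer, and the mutation power `sigma` are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def reconstruct(init_seed, mutation_seeds, n_params, sigma=0.002):
    """Rebuild a parameter vector from its compact seed encoding.

    The full vector is never stored; we replay the initialization and
    every mutation from their seeds. The Gaussian init is a stand-in
    for the real DNN initializer, and sigma is an illustrative value.
    """
    theta = np.random.RandomState(init_seed).randn(n_params)
    for seed in mutation_seeds:
        # Each mutation is fully determined by its seed, so storing the
        # seed alone is enough to reproduce it exactly.
        theta += sigma * np.random.RandomState(seed).randn(n_params)
    return theta

# An individual is just a short list of integers, however large the network:
genome = (123, [7, 42, 99])  # init seed + three mutation seeds
theta = reconstruct(*genome, n_params=10)
```

Because reconstruction is deterministic, two workers holding the same seed list always rebuild bit-identical parameters, which is what makes the few-thousand-byte encoding possible.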
Popular algorithms in RL such as Q-learning and policy gradients use gradient descent, and Evolution Strategies (ES) also follows the gradient via an operation similar to finite differences. Unlike these methods, GAs do not follow the gradient; instead they use exploration through mutation and exploitation through selection to evolve the individuals of a population. In tasks with deceptive or sparse rewards, following the reward gradient can lead to local optima, and it is for this reason that non-gradient-based methods such as GAs can perform well compared to other popular algorithms in RL. This work uses a very simple GA, with the purpose of establishing a baseline for EAs.

Novelty Search (NS) is a technique used with GAs to encourage exploration in tasks with deceptive or sparse rewards. NS avoids local optima by ignoring the reward function during evolution and instead rewarding agents for performing behaviors that have never been performed before. NS requires a function that can quantify the difference between the behaviors of any two individuals, and after each generation individuals have a small probability of being added to an archive of past behaviors. In one problem domain (called 'Image Hard Maze'), NS was shown to outperform other gradient-based and evolutionary algorithms that optimise solely for reward.

In April 2018, the code was optimised to run on a single personal computer. The work to achieve this is described in an Uber AI Labs blog post, and the specific code can be found here.

For an introduction to GAs, I recommend Vijini Mallawaarachchi's blog post.
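The very simple GA described above, mutation plus truncation selection with no crossover, can be sketched as follows. The hyperparameters (population size, number of parents, mutation power) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def evolve(fitness_fn, n_params, pop_size=20, n_parents=5,
           sigma=0.02, generations=10, seed=0):
    """Minimal mutation-only GA with truncation selection.

    Hyperparameters are illustrative, not the paper's values. The best
    individual is carried over unchanged each generation (elitism).
    """
    rng = np.random.default_rng(seed)
    pop = [rng.standard_normal(n_params) for _ in range(pop_size)]
    for _ in range(generations):
        # Exploitation through selection: keep only the top n_parents.
        scored = sorted(pop, key=fitness_fn, reverse=True)
        parents = scored[:n_parents]
        pop = [parents[0]]  # elite survives without mutation
        while len(pop) < pop_size:
            # Exploration through mutation: additive Gaussian noise.
            parent = parents[rng.integers(n_parents)]
            pop.append(parent + sigma * rng.standard_normal(n_params))
    return max(pop, key=fitness_fn)

# Toy fitness: maximise -||theta||^2, so the optimum is the zero vector.
best = evolve(lambda th: -np.sum(th ** 2), n_params=5)
```

Note there is no gradient anywhere in this loop; the only signals are the fitness ranking and random mutation, which is what distinguishes this baseline from Q-learning, policy gradients, and ES.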