reya update: evolution as a primary learning mechanism isn't going to work
Nov 20, 2025 - ⧖ 2 min

during the past 2 weeks, i have been experimenting with using neuroevolution to make the AI agents explore the grid, by keeping track of which grid cells the AI agent's raycasts hit and giving them a fitness score proportional to the ratio of seen grid cells to total grid cells.
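a minimal sketch of that coverage fitness, assuming a boolean grid marked from raycast hits; the grid size, cell coordinates, and function names here are placeholders for illustration, not the actual reya code:

```python
import numpy as np

def update_seen(seen: np.ndarray, hit_cells: list[tuple[int, int]]) -> None:
    """Mark every grid cell hit by a raycast this step as seen."""
    for row, col in hit_cells:
        seen[row, col] = True

def coverage_fitness(seen: np.ndarray) -> float:
    """Fitness = fraction of grid cells the agent has ever seen (0.0 .. 1.0)."""
    return float(seen.sum()) / seen.size

# example: 100x100 grid, a handful of raycast hits
seen = np.zeros((100, 100), dtype=bool)
update_seen(seen, [(0, 0), (0, 1), (3, 7)])
print(coverage_fitness(seen))  # 0.0003, i.e. 0.03%
```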
sometimes it worked and the fitness did increase, but with a population of a few thousand and ~200 internal neurons, it never went above 0.3%. very often it increased slowly and steadily, then rapidly collapsed, even when the population count was raised to impractically high numbers. tuning other hyperparameters seemed to have negligible impact.
i believe the reason is that the cost function is highly discontinuous and that the vector being optimized has internal * output elements, with 3 output neurons (forward, rightward, theta). i thought the use of signed values in the forward and rightward outputs may have been adding difficulty, so i tried an argmax approach (forward, backward, rightward, leftward + theta), which appeared to have an insignificant effect on long-term performance.
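to make the two output encodings concrete, here is a hedged sketch of both decoders; the exact mapping from network outputs to motion is my reading of the description above, not the real implementation:

```python
import numpy as np

def decode_signed(outputs: np.ndarray) -> tuple[float, float, float]:
    """3 outputs: signed forward, signed rightward, and a turn value theta."""
    forward, rightward, theta = outputs
    return float(forward), float(rightward), float(theta)

def decode_argmax(outputs: np.ndarray) -> tuple[float, float, float]:
    """5 outputs: argmax over {forward, backward, rightward, leftward}, plus theta."""
    direction = int(np.argmax(outputs[:4]))
    forward = (1.0, -1.0, 0.0, 0.0)[direction]
    rightward = (0.0, 0.0, 1.0, -1.0)[direction]
    return forward, rightward, float(outputs[4])

print(decode_signed(np.array([0.4, -0.2, 0.1])))            # (0.4, -0.2, 0.1)
print(decode_argmax(np.array([0.1, 0.9, 0.3, 0.2, 0.1])))   # (-1.0, 0.0, 0.1)
```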
i figure the best approach is to let the evolution simulation evolve the architecture or neuron connectivity patterns, and have an unsupervised and online learning algorithm handle weight tuning. something like plasticity in neuromorphic computing.
i believe i have two options for the network:
- SNN with actual STDP and possibly STP
- continuous neurons with dynamics derived from plasticity (i am aware of many, but some can be ruled out instantly like oja's rule)
and two options for how to use that network:
- keep the echo state network and put the plastic network in the output layer (see the sketch below)
- abandon reservoir computing entirely and use the plastic network as the main network
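as a rough sketch of the first option (keep the echo state network, make the output layer plastic), here is a tiny reservoir whose readout is updated by a local hebbian rule instead of ridge regression; the sizes, leak rate, and the choice of a plain hebbian update with decay are assumptions for illustration, not a commitment to any specific plasticity rule from the list above:

```python
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_RES, N_OUT = 8, 200, 3      # e.g. raycast inputs -> reservoir -> (forward, rightward, theta)
LEAK, ETA, DECAY = 0.3, 1e-3, 1e-4  # leak rate, hebbian learning rate, weight decay

# fixed random reservoir (the part evolution could shape: connectivity / architecture)
W_in = rng.normal(0, 0.5, (N_RES, N_IN))
W_res = rng.normal(0, 1.0, (N_RES, N_RES))
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))  # scale spectral radius below 1

# plastic readout (the part tuned online instead of by evolution)
W_out = rng.normal(0, 0.1, (N_OUT, N_RES))

x = np.zeros(N_RES)  # reservoir state

def step(u: np.ndarray) -> np.ndarray:
    """One ESN state update plus a local hebbian change to the readout weights."""
    global x, W_out
    x = (1 - LEAK) * x + LEAK * np.tanh(W_in @ u + W_res @ x)
    y = np.tanh(W_out @ x)
    # hebbian update with decay: strengthen co-active (output, reservoir) pairs
    W_out += ETA * np.outer(y, x) - DECAY * W_out
    return y

for _ in range(10):
    print(step(rng.uniform(-1, 1, N_IN)))
```

the same split would apply to the second option, except the plastic network would also replace the fixed reservoir rather than only sitting on top of it.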