Reinforcement learning (RL) is a class of AI in which an agent takes actions in an environment and learns from the outcomes. RL agents have become extraordinarily powerful in recent years, beating human experts at chess and Go, mastering video games, and learning to control robotic systems. In every case, RL has achieved these feats by maximizing a reward signal from the environment: points in a game, for instance, or proximity to a goal.
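To make that reward-maximization loop concrete, here is a minimal tabular Q-learning sketch in Python. It is a textbook RL algorithm run on a toy gridworld, not the method from our paper, and every name and number in it is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: 5 states in a line, reward only at the rightmost state.
N_STATES = 5
ACTIONS = (-1, +1)                      # step left or step right
Q = np.zeros((N_STATES, len(ACTIONS)))  # action-value table
alpha, gamma, eps = 0.1, 0.9, 0.3       # learning rate, discount, exploration

for _ in range(500):                     # episodes
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy: mostly exploit the table, sometimes explore.
        a = int(rng.integers(len(ACTIONS))) if rng.random() < eps else int(Q[s].argmax())
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0   # reward signal from the environment
        # Core update: nudge Q toward reward plus discounted future value.
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q.round(2))  # learned values now favor stepping right, toward the reward
```

The agent is never told where the reward is; it discovers a policy purely by acting and observing the resulting reward, which is the property we wanted to exploit.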
We wanted to see if RL could play a different kind of game. Could an artificial agent improve an animal’s behavior by learning to control a living nervous system?
In a recent paper, we used the nematode worm Caenorhabditis elegans to find out. An adult C. elegans is about a millimeter long and has only 302 neurons. Because scientists have been studying this animal for decades, we had many tools at our disposal, including optogenetics, a technique for modulating neurons with light. We could genetically modify animals so that certain sets of neurons were activated by blue or green LEDs, giving an RL agent a set of controls it could learn to use.
We wanted to see whether agents could use these controls to navigate animals to targets. In each trial, we placed a single C. elegans in a 4 cm arena and chose a random target location. An RL agent received camera images of the worm and continuously decided whether to turn a light on or off, which let it interact with the animal’s nervous system in real time. We found that with a few hours of training data, agents could not only direct animals toward a goal, but could also learn to do so for different sets of neurons, even though some of these neuronal sets had opposite effects on nematode behavior. For instance, some sets made animals go forward faster, while others made them reverse. In bigger nervous systems, neuron identities and their roles in behavior can vary a lot between individuals, so we were excited to see that agents could tailor their strategies to whichever neurons they were given control over.
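In code, the closed loop looks roughly like the sketch below. This is a minimal Python illustration under assumed hardware interfaces; StubCamera, StubLED, RandomAgent, and run_episode are all hypothetical stand-ins, not our actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

class StubCamera:
    """Hypothetical stand-in for the arena camera: yields grayscale frames."""
    def read(self):
        return rng.random((64, 64), dtype=np.float32)

class StubLED:
    """Hypothetical stand-in for the optogenetic stimulation LED."""
    def set(self, on: bool):
        self.on = on

class RandomAgent:
    """Placeholder policy; a trained RL agent would go here."""
    def act(self, frame, target_xy):
        return bool(rng.integers(0, 2))

def run_episode(agent, camera, led, target_xy, n_steps=300):
    """Closed-loop control: observe the worm, choose light on/off, repeat."""
    frame = camera.read()
    for _ in range(n_steps):
        light_on = agent.act(frame, target_xy)  # binary decision from the policy
        led.set(light_on)                       # stimulate (or not) the target neurons
        frame = camera.read()                   # the worm's response is the next observation
    led.set(False)

run_episode(RandomAgent(), StubCamera(), StubLED(), target_xy=(0.02, 0.02))
```

The key design point is that the worm itself closes the loop: the agent’s only output is a binary light signal, and its only feedback is the next camera frame showing how the animal responded.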
Could we then discover what our RL agents had learned? We gave agents simulated animal states, recorded the decision they made in each case, and mapped out agents’ policies for each set of neurons they had learned to use. These policy maps showed how different neurons causally influenced directed movement in the worm, making our system a potentially useful tool for neuroscience.
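Such a policy map can be extracted with a simple state sweep, sketched below. For readability, this sketch assumes the agent accepts a low-dimensional state vector (the worm’s heading plus the bearing to the target); in the actual experiments agents saw camera images, so the encoding here, like the names policy_map and ToyPolicy, is purely illustrative:

```python
import numpy as np

def policy_map(agent, n_headings=36, n_bearings=36):
    """Sweep simulated worm states and record the agent's light decision."""
    headings = np.linspace(-np.pi, np.pi, n_headings)   # worm orientation (rad)
    bearings = np.linspace(-np.pi, np.pi, n_bearings)   # direction to target (rad)
    grid = np.zeros((n_headings, n_bearings))
    for i, h in enumerate(headings):
        for j, b in enumerate(bearings):
            # Synthetic state vector: unit-circle encoding of both angles.
            state = np.array([np.cos(h), np.sin(h), np.cos(b), np.sin(b)])
            grid[i, j] = agent.act(state)  # 1 = light on, 0 = light off
    return grid

class ToyPolicy:
    """Illustrative policy: light on when heading and target bearing disagree."""
    def act(self, state):
        ch, sh, cb, sb = state
        return float(ch * cb + sh * sb < 0.0)  # cosine of the angle between them

grid = policy_map(ToyPolicy())
# Plotting grid as a heatmap over (heading, bearing) gives the policy map:
# for every simulated posture, would the agent stimulate the neurons or not?
```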
Finally, we wanted to know what kind of animal-AI system we had built. Was the animal just a robot that the AI could control? Or was there cooperativity between the living and artificial neural networks?
We first put paper obstacles between the worm and a patch of food and asked the RL agent to navigate the animal to the patch. The agent didn’t know about the obstacles, but the worm could still use its own sensory and motor systems to swim around them and reach its target. In another experiment, we asked RL agents to direct animals to a location slightly away from food. In this case, when targets were close enough for the animals to sense the food, the animals would reach the RL agents’ targets but then leave for the food patch, showing that they could override the agents’ signals when they had enough reason to do so. We were quite happy with these outcomes: animals are already very good at what they do, and we only wanted to be able to guide them.
Overall, our C. elegans-AI system helped animals find food more effectively through neural integration, and we used the technology to learn how different parts of the C. elegans nervous system are involved in generating directed movement. For future work, we think there are possible applications in healthcare; for example, algorithms like ours could improve deep brain stimulation treatments for people with Parkinson’s disease by tailoring stimulation patterns to each individual’s symptoms.