William James, the “father of American psychology”, was amused when he watched an alligator cautiously approach a human – only to quickly retreat. The alligator repeated this behavior (approaching and retreating) again and again, oscillating between the two extremes like a pendulum. Perhaps the alligator had never encountered a human before, and was unsure whether the human was a potential source of food or a threat. William James described this response to a novel stimulus (in this case, the human) as an “unstable equilibrium” of “weal or woe” or of “curiosity and fear” (The Principles of Psychology, 1890).
The notion that dopamine plays a crucial role in approach behaviors is widely accepted. Electrophysiology studies have repeatedly demonstrated that dopamine neurons signal reward prediction error (RPE). Reinforcement learning theory holds that RPE is used as a teaching signal to reinforce behaviors, such as approach, that yield future rewards. However, recent studies have suggested that dopamine neurons are more diverse than previously assumed, and the function of non-canonical dopamine neurons has remained unknown. We approached this issue by first extensively screening subpopulations of dopamine neurons defined by their projection targets (Menegas et al., 2015, 2017). The striatum is the main target of dopamine projections in the brain, and we examined subareas within the striatum in addition to other brain areas. We found that dopamine neurons projecting to the posterior tail of the striatum (TS) were outliers among many other dopamine neurons, both in terms of connectivity and activity (Menegas et al., 2015, 2017). These findings strongly suggested that TS-projecting dopamine neurons have a unique function. In the present study, we focused on this specific population of dopamine neurons.
Conventionally, dopamine neuron activity was thought to be correlated with reward value, guiding animals to seek out large rewards. Surprisingly, we found that the activity of TS-projecting dopamine neurons was not modulated by reward value, but instead covaried with the intensity and novelty of many kinds of environmental stimuli such as light, tone, odor, and air puff. This activity pattern suggested that TS-projecting dopamine neurons function in a qualitatively different manner from canonical dopamine neurons.
Next, we manipulated TS-projecting dopamine neurons to examine their function. When mice chose between water ports that delivered equal amounts of water, one paired with optogenetic activation of canonical dopamine neurons and one not, they preferentially chose the water port with optogenetic activation, consistent with the idea that dopamine strengthens approach behaviors. However, when the choice was between water ports with or without optogenetic activation of TS-projecting dopamine neurons, mice preferentially chose the water port without optogenetic activation. In other words, mice avoided activation of TS-projecting dopamine neurons. Similarly, whereas control mice tended to avoid water ports paired with an aversive air puff, mice whose TS-projecting dopamine neurons were ablated chose water ports with and without air puffs equally. Importantly, initial responses to the air puff were intact in the lesioned animals. Together, our data showed that TS-projecting dopamine neurons reinforce the choice to avoid threatening stimuli.
In addition to the aforementioned choice tasks, we tested the function of TS-projecting dopamine neurons during the exploration of a novel object. Like the alligator that William James described, mice display cycles of approach and retreat in response to a novel object. During this behavior, dopamine axons in TS were active when mice were in the vicinity of the novel object. Optogenetic excitation and ablation of TS-projecting dopamine neurons promoted and inhibited retreat, respectively. This illustrated that TS-projecting dopamine neurons were excited by novel stimuli not because novelty was rewarding, but because it suggested a potential threat. Importantly, mice in which TS-projecting dopamine neurons were ablated showed similar levels of initial retreat, but stopped retreating more quickly than control mice. These results indicate that TS-projecting dopamine neurons are not necessary for initial avoidance, but instead play a role in the reinforcement or maintenance of avoidance. Thus, our findings support the classical notion that dopamine plays an important role in reinforcement learning, but also reveal that, while canonical dopamine neurons reinforce reward-driven actions, TS-projecting dopamine neurons instead reinforce avoidance of threatening stimuli.
In summary, our findings show that there are at least two axes along which dopamine supports reinforcement learning in the striatum: the reward axis and the threat axis. Thus, in order to interpret normal and abnormal behaviors, one must consider the balance of multiple dopamine systems. Dopamine regulates pushes and pulls. As William James pointed out, both components of the unstable equilibrium are essential for adaptive behaviors. We hope that our study will open the door to investigating how multiple dopamine-striatum systems cooperate or compete to shape behaviors.