This title appears in the Scientific Report: 2014
Distinct plasticity mechanisms in the basal ganglia and their functional role in reinforcement learning
Personal Name(s): | Jitsev, Jenia (Corresponding Author) / Tittgemeyer, Marc / Morrison, Abigail |
---|---|
Contributing Institute: | Computational and Systems Neuroscience; IAS-6 / Computational and Systems Neuroscience; INM-6 |
Published in: | 2014 |
Imprint: | 2014 |
Conference: | 9th European Forum of Neuroscience, Milano (Italy), 2014-07-05 - 2014-07-09 |
Document Type: | Conference Presentation |
Research Program: | W2/W3 Professorinnen Programm der Helmholtzgemeinschaft / Neural network mechanisms of reinforcement learning / Supercomputing and Modelling for the Human Brain / (Dys-)function and Plasticity / Signalling Pathways and Mechanisms in the Nervous System |
Aiming at a minimal, realistic circuit model of learning from both appetitive and aversive outcomes, we implemented a spiking actor-critic network model of the basal ganglia that incorporates different plasticity mechanisms and segregates both the dorsal and ventral striatum into populations of D1 and D2 medium spiny neurons (MSNs). This segregation allows an explicit, separate representation of the positive and the negative expected outcome of a given environmental state by the respective population of D1 or D2 MSNs, which we hypothesize to reside in the shell region of the nucleus accumbens. Based on experimental evidence, D1 and D2 MSN populations were assumed to have distinct, opposing dopamine-modulated bidirectional synaptic plasticity. We implemented the network in the NEST simulator and performed experiments involving the application of delayed reward and punishment in a grid world setting, in which a moving agent must reach a goal state while maximizing the total reward obtained. We demonstrate that the network can learn both to approach the delayed positive reward and to consistently avoid punishment. The model thus highlights the functional role of D1/D2 MSN segregation within the striatum in implementing appropriate temporal-difference (TD)-like learning from both reward and punishment, and it explains why dopamine (DA)-dependent plasticity must act in opposing directions at synapses converging on the distinct striatal MSN types. The approach can be further extended to study how abnormal D1/D2 plasticity may lead to a reorganization of the basal ganglia network towards pathological, dysfunctional states, such as those observed in Parkinson's disease under conditions of progressive dopamine depletion.
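The learning principle described in the abstract can be illustrated with a small tabular sketch. This is not the authors' spiking NEST model: it is a rate-free abstraction, assuming a critic that keeps two non-negative value estimates per state (a D1-like estimate of expected positive outcome and a D2-like estimate of expected negative outcome) and updates them in opposite directions from one TD error. The grid layout, the learning rates, and all names (`opponent_update`, `train`, `v_pos`, `v_neg`) are hypothetical choices made for illustration.

```python
import math
import random

SIZE = 4
START, GOAL, PIT = (0, 0), (3, 3), (1, 1)  # hypothetical layout
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def opponent_update(v_pos, v_neg, delta, lr=0.1):
    """Move the D1-like and D2-like estimates in opposite directions.

    A positive TD error raises v_pos and lowers v_neg; a negative TD
    error does the reverse. Both estimates are clipped at zero.
    """
    new_pos = max(0.0, v_pos + lr * delta)
    new_neg = max(0.0, v_neg - lr * delta)
    return new_pos, new_neg


def step(state, action):
    """Deterministic move; walls keep the agent inside the grid."""
    row = min(max(state[0] + action[0], 0), SIZE - 1)
    col = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (row, col)
    if nxt == GOAL:
        return nxt, 1.0, True    # delayed reward ends the episode
    if nxt == PIT:
        return nxt, -1.0, True   # punishment ends the episode
    return nxt, 0.0, False


def softmax_choice(prefs, rng):
    """Sample an action index with probability proportional to exp(pref)."""
    top = max(prefs)
    weights = [math.exp(p - top) for p in prefs]
    return rng.choices(range(len(prefs)), weights=weights)[0]


def train(episodes=2000, gamma=0.9, lr_critic=0.1, lr_actor=0.1,
          max_steps=100, seed=0):
    rng = random.Random(seed)
    states = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    v_pos = {s: 0.0 for s in states}   # D1-like: expected positive outcome
    v_neg = {s: 0.0 for s in states}   # D2-like: expected negative outcome
    prefs = {s: [0.0] * len(ACTIONS) for s in states}  # actor preferences

    for _episode in range(episodes):
        s = START
        for _step in range(max_steps):
            a = softmax_choice(prefs[s], rng)
            s2, reward, done = step(s, ACTIONS[a])
            # Net state value is the difference of the two estimates.
            v_next = 0.0 if done else v_pos[s2] - v_neg[s2]
            delta = reward + gamma * v_next - (v_pos[s] - v_neg[s])
            v_pos[s], v_neg[s] = opponent_update(
                v_pos[s], v_neg[s], delta, lr_critic)
            prefs[s][a] += lr_actor * delta  # actor follows the same TD error
            s = s2
            if done:
                break
    return v_pos, v_neg, prefs


if __name__ == "__main__":
    v_pos, v_neg, _prefs = train()
    for r in range(SIZE):
        print(" ".join(f"{v_pos[(r, c)] - v_neg[(r, c)]:+.2f}"
                       for c in range(SIZE)))
```

Running the script prints the net value `v_pos - v_neg` for each grid cell after training. Keeping the two estimates non-negative and separate is the design choice that mirrors the abstract's argument: a single TD error can only train both representations if it drives them in opposing directions.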