This title appears in the Scientific Report: 2014
Functional role of opponent, dopamine modulated D1/D2 plasticity in reinforcement learning
Personal Name(s): | Weidel, Philipp (Corresponding Author); Morrison, Abigail; Jitsev, Jenia |
---|---|
Contributing Institute: | Computational and Systems Neuroscience (INM-6); Computational and Systems Neuroscience (IAS-6) |
Published in: | 2014 |
Imprint: | 2014 |
Conference: | NeuroVisionen 10, Jülich (Germany), 2014-09-26 |
Document Type: | Conference Presentation |
Research Program: | W2/W3 Professorinnen Programm der Helmholtzgemeinschaft; Neural network mechanisms of reinforcement learning; Theory, modelling and simulation; Signalling Pathways and Mechanisms in the Nervous System |
JuSER Publication Portal |
The basal ganglia network is thought to be involved in adapting an organism's behavior in response to its positive and negative consequences, that is, in reinforcement learning. It has been hypothesized that dopamine (DA) modulated plasticity of synapses projecting from different cortical areas to the striatum, the input nucleus of the basal ganglia, plays a central role in this form of learning, being responsible for updating future outcome expectations and action preferences. In this scheme, DA transmission is considered to convey a prediction error signal that is generated if internal expectations do not match the outcomes observed after action execution.

Aiming towards a model of a canonical circuit for learning task-conform behavior from both reward and punishment, we extended a previously introduced spiking actor-critic network model of the basal ganglia [1] to include the segregation of both the dorsal (actor) and ventral (critic) striatum into populations of D1 and D2 medium spiny neurons (MSNs) [2]. This segregation allows explicit, separate representation of positive and negative expected outcomes by the distinct populations in the ventral striatum. The positive and negative components of the expected outcome are fed to DA neurons in the SNc/VTA region, which compute the reward prediction error and signal it by DA release.

In the dorsal striatum, we implemented a winner-takes-all (WTA) circuit to choose between a number of possible actions. We show that plasticity modulated by D1 and D2 receptors, combined with the WTA mechanism, results in a functional circuit that resembles temporal-difference (TD) learning.

This modeling approach can be extended in future work to study how abnormal D1/D2 plasticity may lead to a reorganization of the basal ganglia network towards pathological, dysfunctional states, such as those observed in Parkinson's disease under conditions of progressive dopamine depletion.

[1] Potjans, W., Diesmann, M., and Morrison, A. "An imperfect dopaminergic error signal can drive temporal-difference learning." PLoS Comput. Biol. 7 (2011).
[2] Alexander, M. E., and Wickens, J. R. "Analysis of striatal dynamics: the existence of two modes of behaviour." Journal of Theoretical Biology 163.4 (1993): 413-438.
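
The learning scheme outlined in the abstract can be illustrated with a minimal, non-spiking sketch. It is not the presented spiking network model: it is a tabular actor-critic in which each value and each action preference is stored as the difference of two non-negative weights, standing in for D1 and D2 pathways, and in which the winner-takes-all circuit is reduced to an argmax. The function names (`td_error`, `opponent_update`, `winner_take_all`, `run`), the learning rate, the discount factor, the exploration rate, and the one-state, two-action task are all assumptions made for illustration.

```python
import random

GAMMA = 0.9  # discount factor (assumed value)
LR = 0.1     # learning rate (assumed value)


def td_error(reward, value, next_value, gamma=GAMMA):
    """Prediction error: observed outcome minus the current expectation."""
    return reward + gamma * next_value - value


def opponent_update(w_d1, w_d2, delta, lr=LR):
    """Opponent plasticity sketch: a positive error strengthens the D1
    weight and weakens the D2 weight; a negative error does the reverse.
    Both weights are kept non-negative."""
    w_d1 = max(0.0, w_d1 + lr * delta)
    w_d2 = max(0.0, w_d2 - lr * delta)
    return w_d1, w_d2


def winner_take_all(prefs):
    """Stand-in for the WTA circuit: index of the largest preference."""
    return max(range(len(prefs)), key=prefs.__getitem__)


def run(trials=500, epsilon=0.1, seed=0):
    """One-state task (hypothetical): action 0 is rewarded (+1),
    action 1 is punished (-1). Each trial is a one-step episode."""
    rng = random.Random(seed)
    outcomes = [1.0, -1.0]
    critic_d1, critic_d2 = 0.0, 0.0          # positive / negative expectation
    actor_d1 = [0.0, 0.0]                    # per-action D1 weights
    actor_d2 = [0.0, 0.0]                    # per-action D2 weights
    for _ in range(trials):
        prefs = [d1 - d2 for d1, d2 in zip(actor_d1, actor_d2)]
        if rng.random() < epsilon:           # occasional random exploration
            action = rng.randrange(len(prefs))
        else:
            action = winner_take_all(prefs)
        value = critic_d1 - critic_d2        # net expected outcome
        delta = td_error(outcomes[action], value, 0.0)  # terminal next state
        critic_d1, critic_d2 = opponent_update(critic_d1, critic_d2, delta)
        actor_d1[action], actor_d2[action] = opponent_update(
            actor_d1[action], actor_d2[action], delta
        )
    prefs = [d1 - d2 for d1, d2 in zip(actor_d1, actor_d2)]
    return prefs, critic_d1 - critic_d2


if __name__ == "__main__":
    final_prefs, final_value = run()
    print("action preferences:", final_prefs)
    print("expected outcome:", final_value)
```

In this toy setting the punished action accumulates weight only on the D2 side and the rewarded action only on the D1 side, so the argmax settles on the rewarded action. How closely this mirrors the dynamics of the spiking model is not something the sketch can establish.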