Schedule:
09:00-09:50 | Matthew Roesch (Maryland U.)
10:00-10:50 | Okihide Hikosaka (NIH)
11:00-11:50 | John O'Doherty (U. of Dublin)
Abstracts and Related Papers:
Matthew Roesch
"Dissociation of neural signals related to expected and unexpected reward"
How does the brain guide behavior based on expected outcomes and errors in those expectations? To address this question, we have recorded from several brain areas while rats performed an odor discrimination/choice task for rewards of different value. Value was unexpectedly manipulated over the course of several trial blocks by varying either the delay to or the size of reward. These manipulations had a profound impact on behavior (Roesch et al., 2007) and on neural activity in orbitofrontal cortex (OFC), basolateral amygdala (BLA), and dopamine neurons in the ventral tegmental area (VTA). Dopamine neurons clearly encoded errors in reward prediction, increasing or decreasing firing when reward was better or worse than expected, respectively. Remarkably, reward-responsive neurons in BLA also exhibited elevated firing to an unexpected reward; however, unlike in VTA, this change was not strongest during the first trial of unexpected reward delivery, when the reward was most unpredicted, but rather was maximal several trials after the switch. Furthermore, these neurons did not exhibit other signs of error signaling, such as suppression of firing upon omission of an expected reward. This pattern of results is inconsistent with signaling of reward prediction errors and instead suggests that a subpopulation of reward-responsive neurons in BLA signals changes in arousal and attention. Finally, in this task, neural activity in OFC was not modulated by unexpected reward delivery, but instead seemed to reflect the anticipation and delivery of predictable reward, consistent with its proposed role in signaling the expectancy of reward. Together, these areas may be critical in updating choice behavior when changes in value occur.
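The VTA-like error signal described above can be sketched with a simple Rescorla-Wagner style value update, in which the prediction error is positive when reward is better than expected and negative when it is worse or omitted. This is an illustrative toy model under assumed parameters, not the analysis used in the study:

```python
# Toy sketch (not the authors' analysis code): a Rescorla-Wagner value
# update whose prediction error mimics the dopamine signal described
# above -- positive when reward is better than expected, negative when
# it is worse than expected.

def run_block(rewards, v0=0.0, alpha=0.1):
    """Return the prediction error on each trial of a block."""
    v = v0
    errors = []
    for r in rewards:
        delta = r - v        # prediction error: actual minus expected reward
        errors.append(delta)
        v += alpha * delta   # update the reward expectation
    return errors

# A block where reward is unexpectedly upgraded from 0 to 1 mid-block:
errors = run_block([0, 0, 0, 1, 1, 1])
```

Under this rule the error is largest on the first upgraded trial and shrinks on subsequent trials, i.e., the VTA-like pattern; the BLA pattern reported above (maximal several trials after the switch) is precisely what this rule does not capture.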
References:
Roesch, M.R., Calu, D.J., and Schoenbaum, G. (2007) Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nature Neuroscience 10:1615-1624.
Roesch, M.R., Taylor, A.R., and Schoenbaum, G. (2006) Encoding of Time-Discounted Rewards in Orbitofrontal Cortex Is Independent of Value Representation. Neuron 51:509-520.
Okihide Hikosaka
"Motivational and cognitive control of behavior by the basal ganglia"
The basal ganglia contain parallel neural circuits: the direct, indirect, and hyperdirect pathways. A popular hypothesis is that these pathways work in a coordinated manner to select a desired action while suppressing undesired actions. However, critical evidence to support this hypothesis had been lacking until recently. Do the basal ganglia pathways actually work as the hypothesis predicts? In what behavioral contexts do these pathways work? Recent studies using behaving animals and humans have begun to answer these questions. Based on our recent experiments using trained monkeys, I propose the following functions. The direct and indirect pathways select actions based on motivational demands, whereas the hyperdirect pathway selects actions based on cognitive demands. The direct pathway, which consists of two serial inhibitory connections (STR-GPi/SNr), serves to acquire actions that lead to rewarding outcomes. The indirect pathway, which consists of three inhibitory connections (STR-GPe-GPi/SNr), serves to discourage actions that lead to punishing or non-rewarding outcomes. Dopaminergic inputs to the STR are crucial in implementing motivational values in the signals transmitted in the basal ganglia. In contrast, the hyperdirect pathway, which is mediated by direct cortico-STN connections, serves to select one action that is in conflict with others, especially for switching from an automatic action to a controlled action. To summarize, the parallel pathways in the basal ganglia could operate independently of each other to achieve their unique goals. Inputs from outside the basal ganglia, especially those from the lateral habenula and the dorsal raphe (including serotonin neurons), fine-tune or modify the operation of the basal ganglia circuits.
Abbreviations: STR: striatum, GPi: globus pallidus internal segment, SNr: substantia nigra pars reticulata, GPe: globus pallidus external segment, STN: subthalamic nucleus.
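As a purely illustrative caricature (my own assumption about how to sketch the proposal, not Hikosaka's model), the proposed division of labor between the direct and indirect pathways can be written as two opponent weight sets: the direct pathway strengthens actions after reward, the indirect pathway weakens actions after punishment or reward omission, with a dopamine-like signal setting the sign of the teaching term:

```python
# Toy opponent-pathway sketch (an assumption for illustration only):
# w_direct accumulates evidence to facilitate an action after reward;
# w_indirect accumulates evidence to suppress it after punishment.

def update_weights(w_direct, w_indirect, action, dopamine, lr=0.2):
    """dopamine > 0 after reward; dopamine < 0 after punishment/omission."""
    if dopamine > 0:
        w_direct[action] += lr * dopamine      # direct pathway: acquire rewarded action
    else:
        w_indirect[action] += lr * -dopamine   # indirect pathway: discourage punished action

def net_drive(w_direct, w_indirect, action):
    # Direct pathway disinhibits (facilitates); indirect pathway inhibits.
    return w_direct[action] - w_indirect[action]
```

After a few rewarded trials of one action and punished trials of another, the net drive favors the rewarded action and suppresses the punished one, matching the qualitative proposal above; the hyperdirect pathway's conflict-resolution role is not represented in this sketch.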
References:
Nakamura K, Matsumoto M, Hikosaka O (2008) Reward-dependent modulation of neuronal activity in the primate dorsal raphe nucleus. J Neurosci 28:5331-5343.
Hikosaka O (2007) Basal Ganglia Mechanisms of Reward-oriented Eye Movement. Ann N Y Acad Sci 1104:229-249.
Matsumoto M, Hikosaka O (2007) Lateral habenula as a source of negative reward signals in dopamine neurons. Nature 447:1111-1115.
Isoda M, Hikosaka O (2007) Switching from automatic to controlled action by monkey medial frontal cortex. Nat Neurosci 10:240-248.
Hikosaka O, Nakamura K, Nakahara H (2006) Basal Ganglia orient eyes to reward. J Neurophysiol 95:567-584.
Nakamura K, Hikosaka O (2006) Role of dopamine in the primate caudate nucleus in reward modulation of saccades. J Neurosci 26:5360-5369.
Kawagoe R, Takikawa Y, Hikosaka O (2004) Reward-predicting activity of dopamine and caudate neurons - a possible mechanism of motivational control of saccadic eye movement. J Neurophysiol 91:1013-1024.
Lauwereyns J, Watanabe K, Coe B, Hikosaka O (2002) A neural correlate of response bias in monkey caudate nucleus. Nature 418:413-417.
Sato M, Hikosaka O (2002) Role of primate substantia nigra pars reticulata in reward-oriented saccadic eye movement. J Neurosci 22:2363-2373.
Hikosaka O, Nakahara H, Rand MK, Sakai K, Lu X, Nakamura K, Miyachi S, Doya K (1999) Parallel neural networks for learning sequential procedures. Trends Neurosci 22: 464-471.
Kawagoe R, Takikawa Y, Hikosaka O (1998) Expectation of reward modulates cognitive signals in the basal ganglia. Nat Neurosci 1:411-416.
John O'Doherty
"Functional neuroimaging of human decision making: from simple choice to social interactions"
It is axiomatic that most animals, including humans, have a propensity to seek out rewards and avoid punishments. Central to the organization of such behavior is the ability to represent the value of rewarding and punishing stimuli, establish predictions of when and where such rewards and punishments will occur, and use those predictions to form the basis of decisions that guide behavior. Interest in the computational and neural underpinnings of such learning processes has surged in recent years. This interest can be attributed in large part to the observation that the phasic activity of dopamine neurons bears a remarkable similarity to prediction error learning signals derived from a family of abstract computational models collectively known as reinforcement learning (RL). In RL, prediction error signals are used to update predictions of future reward for different actions. These values are then compared in order to implement action selection. In this presentation I will outline evidence from functional neuroimaging studies in humans for the existence of RL-related signals in the human brain during both reward and punishment learning. Although standard RL models can account for a wide range of human and animal choice behavior, these models do have important limitations. One such limitation is a failure to account for higher-order structure in a decision problem. In the latter part of the talk I will present behavioral and neural evidence for the existence of an additional computational mechanism in the brain that guides action selection under circumstances where higher-order structure exists, such as an interdependency between actions. This system appears to use knowledge of the abstract structure of the decision task in order to make choices and may exist cooperatively or competitively alongside standard RL.
Finally, I will show how such higher-order mechanisms might also be employed in some forms of social interaction, when it is necessary to predict the thoughts and intentions of others in order to succeed in a competitive situation.
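The "standard RL" scheme referred to above can be sketched as a prediction-error update of action values followed by a softmax comparison of those values; the particular update rule (Q-learning style) and parameters here are illustrative assumptions, not the models fit in the studies:

```python
# Minimal standard-RL sketch (assumed details for illustration):
# a prediction-error value update plus softmax action selection.
import math
import random

def q_update(q, action, reward, alpha=0.2):
    """Update the predicted value of the chosen action; return the error."""
    delta = reward - q[action]   # prediction error
    q[action] += alpha * delta   # move the value toward the outcome
    return delta

def softmax_choice(q, beta=3.0, rng=random):
    """Compare action values and pick one probabilistically."""
    exps = [math.exp(beta * v) for v in q]
    total = sum(exps)
    r = rng.random() * total
    acc = 0.0
    for a, e in enumerate(exps):
        acc += e
        if r <= acc:
            return a
    return len(q) - 1

# With repeated reward for action 0 and none for action 1, the learned
# values diverge and the softmax increasingly favors action 0.
q = [0.0, 0.0]
for _ in range(50):
    q_update(q, 0, 1.0)
    q_update(q, 1, 0.0)
```

The higher-order mechanism discussed in the talk is precisely what this sketch lacks: the values of the two actions are updated independently, with no representation of any interdependency or abstract task structure.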
References:
Hampton, A.N., Bossaerts, P., O’Doherty, J.P. (2006) The Role of the Ventromedial Prefrontal Cortex in Abstract State-Based Inference during Decision Making in Humans. J Neurosci 26(32):8360-8367.
Hampton, A. N., Bossaerts, P., O’Doherty, J.P. (2008) Neural correlates of mentalizing-related computations during strategic interactions in humans. PNAS 105:6741-6746.
O’Doherty, J.P., Hampton, A., Kim, H. (2007) Model-Based fMRI and Its Application to Reward Learning and Decision Making. Ann N Y Acad Sci 1104:35-53.