Momentary reward induces changes in excitability of primary motor cortex

 

Abstract

Objective

To investigate the human primary motor cortex (M1) excitability changes induced by momentary reward.

Methods

To test the changes in excitatory and inhibitory functions of M1, motor-evoked potentials (MEPs), short-interval intracortical inhibition (SICI) and short-latency afferent inhibition (SAI) were measured in the abductor pollicis brevis (APB) muscle of the non-dominant hand in 14 healthy volunteers using transcranial magnetic stimulation (TMS) during a behavioral task in which subjects pseudorandomly received either reward target or non-target stimuli in response to a cue. To control for sensorimotor and attention effects, a sensorimotor control task was performed in which the reward target was replaced with a non-reward target.

Results

SICI was significantly increased and SAI was significantly decreased during the presentation of the reward target stimuli. These changes were not evident during non-reward target stimuli in the sensorimotor control task, indicating that they are specific to momentary reward.

Conclusions

Momentary reward is associated with changes in the intracortical inhibitory circuits of M1.

Significance

TMS may be a useful probe to study the reward system in health and in the many diseases in which its dysfunction is suspected.

Research highlights

► Using transcranial magnetic stimulation (TMS), we tested primary motor cortex (M1) function during the processing of momentary reward signals.
► We found increased intracortical inhibition and decreased afferent inhibition of M1 in response to momentary reward signals.
► Our findings suggest the existence of a reward-related function of human M1.

Keywords

Reward
Transcranial magnetic stimulation
Dopamine
Motor cortex excitability

1. Introduction

The word “reward” is socially linked to happiness or “hedonic process”, but is defined in affective neuroscience research as an object or event that generates approach behavior, produces learning of such behavior, and is an outcome of decision making (Schultz, 2007). Consumption of a reward, either primary (e.g. palatable food or drinks, mating, or drugs) or secondary (e.g. money), produces a hedonic experience, which itself initiates a process of associative learning to consolidate behaviors and related cues (Arias-Carrion and Poppel, 2007). Animal and human lesion studies have revealed specific brain structures implicated in reward processing, including the orbitofrontal and medial prefrontal regions, amygdala, striatum, and dopaminergic midbrain. These regions are highly interconnected and can be considered an integrated network (O’Doherty, 2004; Wachter et al., 2009).

Integration of reward into motor behavior occurs where reward-related neural signals meet circuits concerned with motor performance. The striatum receives inputs from various regions of the cerebral cortex and parts of the thalamus. These excitatory glutamatergic inputs converge in the striatum with dopamine inputs from the substantia nigra. The output of the striatum influences other basal ganglia nuclei, which reach the thalamus through direct and indirect pathways. Finally, those projections go back to the frontal cortex, including the primary motor cortex (M1). This anatomical organization provides a favorable substrate in the striatum for integrating dopaminergic reward signals with sensory cues and generating motor commands to motor areas (Wickens et al., 2003; Schultz, 2004; Ikemoto, 2007; Hikosaka et al., 2008). Accordingly, reward-related signals might induce changes in the excitability of M1, which may therefore be an important brain region to study in relation to reward processing.

The midbrain dopaminergic system may have an important role both in the analysis of the informational content of reward and in the control of reward-related behavior, as a part of the reward network, through its connections to other brain regions responsible for reward processing (Schultz et al., 2000; Schultz, 2004; Ikemoto, 2007; Hikosaka et al., 2008). The former role is associated with the orbitofrontal, prefrontal and anterior cingulate cortices, hippocampus, striatum and amygdala, while the latter is related to the striatum, nucleus accumbens and dorsal anterior cingulate area (Rolls, 2000; Schultz, 2000; Schultz et al., 2000; O’Doherty et al., 2001; Gottfried et al., 2003; Kringelbach, 2005; Oya et al., 2005; Wise, 2005; Murray, 2007; Hikosaka et al., 2008; Kapogiannis et al., 2008). Animal and human studies have shown that the dopamine neurons in the midbrain are activated transiently in response to reward-predicting or rewarding stimuli (Schultz et al., 1993; Schultz et al., 1997; Koepp et al., 1998; Schultz, 1998b; Schultz, 2001; Zald et al., 2004; Zink et al., 2004; Nakazato, 2005; Heien and Wightman, 2006; Schultz, 2007; Natori et al., 2009).

Transcranial magnetic stimulation is a very useful tool to study the physiology of the central nervous system in humans. The amplitude of the motor evoked potential (MEP) can be taken as a measure of cortico-spinal excitability (Ziemann et al., 1996b). Short-interval intracortical inhibition (SICI) refers to MEP inhibition induced by a conditioning TMS pulse applied to M1 (Kujirai et al., 1993) and is mainly used to study the activity of GABA-A inhibitory cortical neurons (Ziemann, 2004). Short-latency afferent inhibition (SAI) refers to MEP inhibition induced by a conditioning afferent electrical pulse applied to a peripheral nerve (Tokimura et al., 2000); it is partly related to the activity of cholinergic M1 receptors (Di Lazzaro et al., 2000) and is diminished by activation of certain GABA-A neuronal circuits (Di Lazzaro et al., 2005a; Di Lazzaro et al., 2007). Previous studies have shown changes in M1 excitability in response to reward prediction (Kapogiannis et al., 2008) and the urge to obtain a rewarding stimulus (Gupta and Aron, 2011), indicating that TMS measures can be used to address reward-related M1 function.

However, no studies have examined the modulatory effects of momentary reward itself on M1 excitability. To test whether momentary reward modulates M1 excitability, we investigated the excitatory and inhibitory systems within human M1 using transcranial magnetic stimulation (TMS) during reward and sensorimotor control tasks.

2. Methods

2.1. Subjects

Experiments were performed on 14 healthy volunteers (eight males and six females) aged 19–42 years (28.8 ± 7.6 (mean ± SD) years). Thirteen subjects were right-handed and one subject was left-handed, as determined by the Oldfield handedness inventory (Oldfield, 1971). None of the subjects had a history of neurological or psychiatric disorders or was under drug treatment during the experiments. Special care was taken to ensure that the subjects did not have a history of pathological gambling or addiction. All subjects gave written informed consent before the experiments. The protocol was approved by the Ethics Committee of Kyoto University Graduate School of Medicine.

2.2. Recordings

Each subject was seated comfortably in an armchair with his or her arms placed on the armrests and the hands facing upwards. Surface electromyogram (EMG) was recorded with pairs of silver electrodes from the abductor pollicis brevis (APB) and abductor digiti minimi (ADM) muscles of the non-dominant “resting” hand, to avoid contamination of responses by voluntary EMG activity during task performance. The recorded EMG was amplified, band-pass filtered (5–2000 Hz), digitized at a rate of 10 kHz and stored for later offline analysis. The subjects were instructed to keep the left hand relaxed throughout the experiments with the aid of visual feedback from the online EMG monitor. Behavioral tasks were performed with the right hand.

2.3. TMS

Two Magstim 200 stimulators connected through a Bistim unit (Magstim Company, Whitland, Dyfed, UK) were used for TMS, which was delivered to the scalp surface through a figure-of-eight coil (outer diameter 9 cm). The optimal position (“hot spot”) for eliciting the best MEP in the APB muscle of the non-dominant side was established by applying a suprathreshold stimulus over the M1 contralateral to the target muscle with the coil held ∼45° to the mid-sagittal line (approximately perpendicular to the central sulcus). This optimal position was marked on the scalp with a soft-tip pen to ensure identical placement of the coil throughout the experiment. The direction of the induced current was from posterior to anterior.

The resting motor threshold (rMT) for the relaxed APB muscle was determined to the nearest 1% of the stimulator output and defined as the lowest stimulus intensity required to elicit MEPs with a peak-to-peak amplitude greater than 50 μV in at least five of ten trials (Rossini et al., 1994). The active motor threshold (aMT) was defined as the minimum intensity at which MEPs with an amplitude of around 200 μV could be distinguished from the background activity in 50% of trials during slight isometric contraction of the target muscle (Rothwell et al., 1999).
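The rMT criterion described above can be illustrated with a short sketch. It assumes that peak-to-peak MEP amplitudes (in μV) have already been measured for ten trials at each tested intensity; the function names and the example amplitudes are hypothetical and are not taken from the study.

```python
def meets_rmt_criterion(amplitudes_uv, min_amplitude_uv=50.0, min_hits=5):
    """Return True if at least `min_hits` MEPs exceed `min_amplitude_uv`.

    With ten trials per intensity this is the "greater than 50 uV in at
    least five of ten trials" rule described in the text.
    """
    hits = sum(1 for amplitude in amplitudes_uv if amplitude > min_amplitude_uv)
    return hits >= min_hits


def resting_motor_threshold(trials_by_intensity):
    """Return the lowest tested intensity (% of stimulator output) that
    meets the criterion, or None if no tested intensity does."""
    passing = [
        intensity
        for intensity, amplitudes in trials_by_intensity.items()
        if meets_rmt_criterion(amplitudes)
    ]
    return min(passing) if passing else None


# Hypothetical amplitudes (uV) for ten trials at three intensities.
example_trials = {
    49: [10, 20, 60, 15, 55, 30, 12, 70, 25, 18],  # 3 of 10 above 50 uV
    50: [60, 20, 65, 15, 55, 80, 12, 70, 25, 18],  # 5 of 10 above 50 uV
    51: [90, 75, 65, 40, 55, 80, 62, 70, 25, 58],  # 8 of 10 above 50 uV
}
print(resting_motor_threshold(example_trials))
```

In practice the threshold is found by stepping the stimulator intensity during the session; the sketch only shows how the pass/fail rule is applied once amplitudes are available.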

To investigate M1 excitability, the peak-to-peak amplitude of the MEP was used. The stimulus intensity was set to the intensity that produced an MEP amplitude of approximately 1 mV in the APB (SI1mV), which was determined before the experiments.

To investigate the inhibitory system within M1, we used SICI and SAI. For the measurement of SICI, paired-pulse magnetic stimuli were applied (Kujirai et al., 1993). The intensity of the conditioning stimulus was adjusted to 95% of aMT, and that of the test stimulus was adjusted to SI1mV, with an interstimulus interval (ISI) of 3 ms (Ziemann et al., 1996c). SICI was expressed as the ratio of the mean conditioned MEP to the mean MEP evoked by the test stimulus alone in the same block of trials.

For the measurement of SAI, a conditioning constant-current square-wave electrical pulse of 0.2 ms duration was applied to the median nerve at the wrist, with the cathode placed proximally, at the motor threshold intensity for evoking a just visible muscle contraction in the APB (Chen et al., 1999). The test stimulus was given over the contralateral M1 at an ISI of 20 ms after the conditioning pulse (Tokimura et al., 2000). SAI was expressed as the ratio of the mean conditioned MEP to the mean MEP evoked by the test stimulus alone in the same block of trials.
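Both SICI and SAI are expressed as the same kind of ratio, which can be sketched as follows. This is a minimal illustration only: the function name and the example amplitudes are hypothetical, and the amplitudes are assumed to be peak-to-peak values in μV from a single block of trials.

```python
from statistics import mean


def conditioned_mep_ratio(conditioned_mep_uv, test_mep_uv):
    """Ratio of the mean conditioned MEP to the mean test-alone MEP.

    Values below 1 indicate inhibition; the smaller the ratio, the
    stronger the inhibition (SICI or SAI, depending on the conditioning
    stimulus used).
    """
    if not conditioned_mep_uv or not test_mep_uv:
        raise ValueError("both amplitude lists must be non-empty")
    test_mean = mean(test_mep_uv)
    if test_mean <= 0:
        raise ValueError("mean test MEP amplitude must be positive")
    return mean(conditioned_mep_uv) / test_mean


# Hypothetical amplitudes (uV) within one block of trials.
test_alone = [1000.0, 900.0, 1100.0]
sici_conditioned = [400.0, 450.0, 350.0]
sai_conditioned = [500.0, 520.0, 480.0]

sici_ratio = conditioned_mep_ratio(sici_conditioned, test_alone)
sai_ratio = conditioned_mep_ratio(sai_conditioned, test_alone)
print(round(sici_ratio, 2), round(sai_ratio, 2))
```

Because a smaller ratio means stronger inhibition, a decrease in the conditioned MEP ratio corresponds to an increase in SICI or SAI, and vice versa.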

2.4. Experimental task

To measure the changes in M1 function, we designed the experiment so that a cue (four yellow squares) appeared on a computer screen placed in front of the subject (Fig. 1). Only one of those four yellow squares contained the target stimulus. Subjects were instructed to select one of the yellow squares by pressing its corresponding button (button 1, 2, 3, or 4) with the dominant hand. Each experiment contained two similar tasks: a reward task and a sensorimotor control task. Each task was composed of 135 trials, of which 54 contained the target stimulus and the remaining 81 contained a non-target stimulus (white circle). On target trials, the target stimulus was always presented to the subject, irrespective of the button selected. The total trial duration was randomized between 7 and 8 s. Each trial started with presentation of the cue (four squares) for a maximum of 1 s. As soon as the subject selected one of the squares by pressing a button, the cue disappeared. Trials with a reaction time (RT) longer than 1 s were excluded from the analysis. Two seconds after the onset of the cue, the target or non-target stimulus was presented for 2 s as feedback to the subject.

Fig. 1. Experimental design in reward and sensorimotor control tasks. In the reward task, subjects received either reward target or non-target stimuli in a pseudorandom schedule as feedback on their response to the four-square cue. In the sensorimotor control task, a non-reward target replaced the reward target. The duration of the cue presentation is equal to the RT of the subject in each individual trial. The trial duration was randomized between 7 and 8 s. The TMS measures were taken 1 s after the onset of each stimulus, which was presented for 2 s. (RT = reaction time, SOA = stimulus onset asynchrony from the start of one visual stimulus to that of another one, Sec = second, RT M1 = right primary motor cortex).

In the reward task, the target stimulus (reward target) was a picture of a 100 Japanese yen coin, which had a rewarding value as it represented an actual momentary monetary reward. In the sensorimotor control task, the target stimulus (non-reward target) was a mauve circle containing an asterisk sign (*); this stimulus represented merely a correct target selection without rewarding value, to control for attention and other sensorimotor effects.
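The within-trial timing described above and in the caption of Fig. 1 can be summarized in a small sketch. All times are expressed relative to cue onset; the class, function and constant names are hypothetical, and treating an RT of exactly 1 s as included is an assumption.

```python
from dataclasses import dataclass

MAX_CUE_S = 1.0            # cue is shown for at most 1 s
FEEDBACK_ONSET_S = 2.0     # target/non-target stimulus appears 2 s after cue onset
FEEDBACK_DURATION_S = 2.0  # stimulus stays on screen for 2 s
TMS_DELAY_S = 1.0          # TMS is given 1 s after stimulus onset (Fig. 1)


@dataclass(frozen=True)
class TrialTimeline:
    cue_offset_s: float       # cue disappears at button press (or at 1 s)
    feedback_onset_s: float
    tms_s: float
    feedback_offset_s: float
    included: bool            # False if RT exceeded the 1 s limit


def trial_timeline(reaction_time_s: float) -> TrialTimeline:
    """Event times (s from cue onset) for one trial with the given RT."""
    if reaction_time_s < 0:
        raise ValueError("reaction time cannot be negative")
    return TrialTimeline(
        cue_offset_s=min(reaction_time_s, MAX_CUE_S),
        feedback_onset_s=FEEDBACK_ONSET_S,
        tms_s=FEEDBACK_ONSET_S + TMS_DELAY_S,
        feedback_offset_s=FEEDBACK_ONSET_S + FEEDBACK_DURATION_S,
        # ASSUMPTION: an RT of exactly 1 s is still counted as valid.
        included=reaction_time_s <= MAX_CUE_S,
    )


print(trial_timeline(0.4))
```

Under these values the TMS pulse falls 3 s after cue onset in every trial, independent of the reaction time.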

2.5. Experimental design

To measure the momentary effects of reward on M1 function, TMS measures were taken 1 s after the onset of the target/non-target stimuli. Across the 54 target stimuli in each experiment, the test stimulus alone (TS, to measure the unconditioned MEP), SICI, and SAI were each measured in 18 trials. The TS was always given 1 s after the onset of the visual stimuli. The same numbers of MEP, SICI and SAI measurements were recorded for non-target stimuli. The order of the individual trials within each experiment and the order of the experiments themselves were completely randomized. The experiments were programmed with the Presentation program (Neurobehavioral Systems, Version 12.1).
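The trial allocation described above can be sketched as a randomized schedule. This is only an illustration and not the Presentation script used in the study: the function name is hypothetical, a plain shuffle stands in for the unspecified randomization procedure, and the 27 non-target trials left over after assigning 18 trials to each measure are assumed here to carry no TMS measure, which the text does not state.

```python
import random
from collections import Counter

MEASURES = ("TS", "SICI", "SAI")
TRIALS_PER_MEASURE = 18
N_TRIALS = {"target": 54, "non-target": 81}


def build_task_schedule(seed=None):
    """Return a shuffled list of (stimulus, measure) pairs for one task."""
    rng = random.Random(seed)
    schedule = []
    for stimulus, n_total in N_TRIALS.items():
        block = [
            (stimulus, measure)
            for measure in MEASURES
            for _ in range(TRIALS_PER_MEASURE)
        ]
        # ASSUMPTION: trials beyond 3 x 18 have no TMS measure (None).
        block += [(stimulus, None)] * (n_total - len(block))
        schedule.extend(block)
    rng.shuffle(schedule)  # simple shuffle; no sequence constraints applied
    return schedule


schedule = build_task_schedule(seed=1)
counts = Counter(schedule)
print(len(schedule), counts[("target", "SICI")], counts[("non-target", None)])
```

A fixed seed makes the schedule reproducible, which is convenient when the same order needs to be regenerated for analysis.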

2.6. Data analysis

For statistical analysis, repeated-measures analyses of variance (ANOVA) were used. The factors tested in each experiment are given in more detail in the Results. The Greenhouse-Geisser method was used to correct for violations of sphericity where needed in the repeated-measures ANOVA. Two-tailed paired t tests with Bonferroni correction were used for post hoc analysis. Effects were considered significant if P < 0.05. Unless otherwise mentioned, all data are presented as mean ± standard error of the mean (SEM).
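The summary statistics and post hoc procedure named above can be sketched with the Python standard library. The function names and example values are hypothetical; the sketch computes the mean ± SEM, the paired t statistic with its degrees of freedom, and a Bonferroni adjustment of already computed P values. Converting the t statistic to a two-tailed P value requires the t distribution, which is outside the standard library and is therefore not shown, and the number of comparisons used for the correction is an assumption since the text does not state it.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) of a sample."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    return mean(values), stdev(values) / sqrt(n)


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for a paired comparison."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    differences = [a - b for a, b in zip(first, second)]
    mean_diff, sem_diff = mean_sem(differences)
    if sem_diff == 0:
        raise ValueError("differences have zero variance")
    return mean_diff / sem_diff, len(differences) - 1


def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


# Hypothetical conditioned MEP ratios for four subjects.
target = [0.35, 0.42, 0.38, 0.45]
non_target = [0.50, 0.55, 0.49, 0.58]
t_value, dof = paired_t_statistic(target, non_target)
print(mean_sem(target), t_value, dof)
print(bonferroni([0.01, 0.04]))
```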

3. Results

The mean RT ± SD was 406 ± 54 ms and 388 ± 53 ms for the reward and sensorimotor control tasks, respectively, without any statistically significant difference (F = 1.629, P = 0.224). The mean ± SD of rMT and aMT of the APB muscle were 51 ± 12% and 42 ± 7% of maximum stimulator output, respectively. The mean ± SD of the intensities of SI1mV and the conditioning pulse for SICI were 66 ± 12% and 40 ± 7% of maximum stimulator output, respectively. The percentage of delayed or inappropriate trials was 2.1% and 2.5% for the reward and sensorimotor control tasks, respectively.

The MEP amplitude of both the APB and ADM muscles did not show any significant changes for target vs. non-target responses in either the reward or the sensorimotor control task (APB: 900 ± 77 vs. 939 ± 99 μV and 865 ± 83 vs. 919 ± 81 μV for reward and sensorimotor control tasks, and ADM: 654 ± 136 vs. 647 ± 103 μV and 697 ± 133 vs. 619 ± 122 μV for reward and sensorimotor control tasks) (Fig. 2, Fig. 3). Repeated-measures ANOVA with Experiment (reward and sensorimotor control) and Response (target and non-target) as factors showed no significant effect of Experiment, Response, or the Experiment × Response interaction in either muscle.

Fig. 2. EMG traces of a representative subject. Single traces of EMG in one representative subject recorded from the non-dominant APB muscle are shown during the reward task (top) and the sensorimotor control task (bottom).

Fig. 3. Effects of reward and sensorimotor control tasks on MEP in APB and ADM. Mean ± SEM for unconditioned MEP amplitudes in the APB and ADM muscles during target vs. non-target stimuli in reward and sensorimotor control tasks. In both tasks, there was no significant change in the unconditioned MEP amplitudes indicating no effect of reward target on MEP amplitude.

The conditioned MEP ratios for SICI were significantly smaller for the target responses in the reward task, but not in the sensorimotor control task (0.40 ± 0.05 vs. 0.53 ± 0.06 in the reward task and 0.50 ± 0.06 vs. 0.52 ± 0.05 in the sensorimotor control task) (Fig. 2, Fig. 4). Repeated-measures ANOVA for the SICI ratio with Experiment and Response as within-subject variables was significant for the Experiment × Response interaction (F = 7.922, P = 0.015). The main effect of Response was significant (F = 16.820, P = 0.001). Post hoc analysis showed that the effect of Response was significant in the reward task (P = 0.002) but not in the sensorimotor control task.

Fig. 4. Effects of reward and sensorimotor control tasks on SICI in APB. Mean ± SEM for conditioned MEP amplitude ratios in the APB muscle for SICI during target vs. non-target stimuli. In reward task, there was a significant decrease in the conditioned MEP ratios induced by momentary reward target. In sensorimotor control task, there was no significant difference. (∗∗P < 0.01).

For SAI, the conditioned MEP ratios were significantly larger for the target responses in the reward task (0.49 ± 0.05 vs. 0.39 ± 0.04 in the reward task and 0.39 ± 0.04 vs. 0.41 ± 0.05 in the sensorimotor control task) (Fig. 2, Fig. 5). Repeated-measures ANOVA for the SAI ratio with Experiment and Response was significant for the Experiment × Response interaction (F = 7.042, P = 0.02). Post hoc analysis for Response in each experiment revealed a significant effect in the reward task (P = 0.024) but not in the sensorimotor control task.

Fig. 5. Effects of reward and sensorimotor control tasks on SAI in APB. Mean ± SEM for the conditioned MEP amplitude ratio in the APB muscle for SAI during target vs. non-target stimuli. In reward task, there was a significant increase in the conditioned MEP ratios induced by momentary reward target. In sensorimotor control task, there was no significant difference. (∗P < 0.05).

4. Discussion

We found that the monetary reward task can modulate M1 excitability via the inhibitory neural systems within M1. SICI was significantly increased and SAI significantly decreased in response to the momentary reward. This change in M1 excitability was absent in the control task, indicating that it cannot be explained by attention and other sensorimotor factors that are known to affect M1 excitability (Maunsell, 2004; Kotb et al., 2005). The general measure of M1 excitability (MEP amplitude) showed no significant change in either the reward or the control task.

Animal studies have shown that dopamine neurons in the midbrain, in addition to their tonic activity, show phasic activation in response to momentary rewards and reward-predicting stimuli (Schultz et al., 1993; Schultz et al., 1997; Schultz, 1998b; Schultz, 2001; Nakazato, 2005; Heien and Wightman, 2006; Schultz, 2007; Natori et al., 2009). In humans, dopamine release in a neural target of the midbrain dopaminergic neurons, namely the striatum, was detected in recent imaging studies in response to various primary and secondary rewarding stimuli (Koepp et al., 1998; Zald et al., 2004; Zink et al., 2004). Similar results were obtained in many fMRI studies (Breiter et al., 1997; Breiter and Rosen, 1999; Breiter et al., 2001; Knutson et al., 2001; O’Doherty et al., 2002; Volkow et al., 2002b; Kirsch et al., 2003; Tricomi et al., 2004; Knutson and Cooper, 2005).

Substantial evidence indicates that dopamine neurons of the primate ventral midbrain code the reward prediction error, which is the discrepancy between the probability of reward and its actual occurrence (Schultz et al., 1993; Schultz, 1998a; Waelti et al., 2001; Fiorillo et al., 2003), rather than the reward value itself. Accordingly, the phasic burst firing of dopamine neurons was found to be higher in response to unpredicted or under-predicted rewards (Schultz, 1998b; Schultz and Dickinson, 2000). This phasic activation of the midbrain dopaminergic neurons causes a rise in the dopamine concentration of the basal ganglia. In primate studies, the rise reaches its peak around 1 s after the onset of the reward-related stimulus, starts to decline after 2 s, and reaches the baseline concentration after around 4 s (Schultz, 1998a; Schultz, 2001; Roitman et al., 2004; Schultz, 2007). Taking this time course into consideration, we applied the TMS measures at the expected time of the peak dopamine concentration in the basal ganglia. In our study, since the reward magnitude and timing were held constant, the reward prediction error would have been related to the reward probability (P) and the actual outcome. Since the reward probability in our study was low, the activation of dopamine neurons might have been substantial (Fiorillo et al., 2003).
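The relation between reward probability and prediction error can be made concrete with a small worked example. It rests on two simplifying assumptions that are not part of the study: the subject's expected reward probability is taken to equal the proportion of target trials in the task (54 of 135), and the prediction error is modeled as the outcome (1 for reward, 0 for no reward) minus that probability. The names used are hypothetical.

```python
N_TARGET_TRIALS = 54
N_TOTAL_TRIALS = 135

# ASSUMPTION: expected probability equals the proportion of target trials.
reward_probability = N_TARGET_TRIALS / N_TOTAL_TRIALS


def prediction_error(outcome, expected_probability):
    """Outcome (1 = reward, 0 = no reward) minus expected probability."""
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return outcome - expected_probability


rewarded_error = prediction_error(1, reward_probability)
unrewarded_error = prediction_error(0, reward_probability)
print(reward_probability, rewarded_error, unrewarded_error)
```

Under these assumptions a rewarded trial yields a positive prediction error that is larger in magnitude than the negative error of an unrewarded trial, because reward is delivered on fewer than half of the trials.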

The changes in SICI and SAI induced by momentary reward in our study were consistent with those induced by dopamine. Parkinson’s disease (PD) patients in the drug-off state (Ridding et al., 1995; Strafella et al., 2000; Bares et al., 2003) and cervical dystonia patients (Kanovsky et al., 2003) showed reduced SICI compared to normal subjects. SICI was also reduced in patients with attention deficit hyperactivity disorder (ADHD), in which there is dysfunction in the dopamine reward pathway (Volkow et al., 2009) (Moll et al., 2000; Richter et al., 2007; Schneider et al., 2007). In contrast, dopaminergic drugs significantly increase SICI in normal subjects (Ziemann et al., 1996a; Ziemann et al., 1997; Korchounov et al., 2007) and in PD patients (Ridding et al., 1995; Strafella et al., 2000; Lefaucheur et al., 2004; Bares et al., 2007). Moreover, methylphenidate, which blocks dopamine reuptake into presynaptic nerve endings (Volkow et al., 2001; Volkow et al., 2002a), also increased SICI in ADHD patients (Moll et al., 2000; Buchmann et al., 2007). However, it is still possible that the change in SICI is related to a change in the aMT rather than in intracortical inhibition. Since it was not feasible to measure the aMT online during the study, further studies are necessary to clarify this point.

The striatum is centrally positioned in the functional network controlling motor and cognitive aspects of behavior (Graybiel et al., 1994; Middleton and Strick, 1997; Middleton and Strick, 2000). In addition to its role in reward processing, it is a well established recipient of M1 glutamatergic and midbrain dopaminergic inputs (Aosaki et al., 1994; Kawaguchi et al., 1995; Wickens et al., 2003; Calabresi et al., 2007). Many animal studies showed that the motor areas including the M1 are connected to the pallidal output neurons through the thalamus (Nambu et al., 1988; Tokuno et al., 1992; Kayahara and Nakano, 1996).

Human studies showed that the thalamus may play a role in controlling intracortical inhibition in M1, as SICI was defective in a patient with a complex movement disorder who had suffered a thalamic ischemic lesion (Münchau et al., 2002). On the other hand, an opposite effect has been shown in epileptic patients treated with thalamic deep brain stimulation (DBS) (Molnar et al., 2006). Those thalamic projections to M1 are under tonic inhibitory control from the pallidal output of the basal ganglia (Groenewegen, 2003; DeLong and Wichmann, 2007), which might explain the SICI results in the present study.

Changes in SAI have been found in many pharmacological and patient studies. SAI was found to be increased in PD patients off medication (Di Lazzaro et al., 2004; Nardone et al., 2005) and significantly decreased in patients on medication, suggesting that dopaminergic treatment reduces SAI (Sailer et al., 2003). Moreover, SAI was also found to be decreased in diseases that are thought to be related to abnormalities in basal ganglia–thalamo–cortical circuits, such as dystonia (Di Lazzaro et al., 2009) and Gilles de la Tourette syndrome (GTS) (Orth et al., 2005; Orth, 2009). Accordingly, dopamine release in the striatum may affect SAI in M1 indirectly. Some studies showed that changes in SICI and SAI are inversely related (Di Lazzaro et al., 2005a; Di Lazzaro et al., 2005b; Di Lazzaro et al., 2007; Alle et al., 2009), and recently a model of two distinct, reciprocally connected subtypes of GABA inhibitory interneurons with convergent projections onto the corticospinal neurons, in which SICI is dominant over SAI, was suggested to explain this inverse relationship (Alle et al., 2009). This inverse relationship is consistent with our results.

Unconditioned MEP amplitude showed no significant changes in response to dopaminergic treatment in PD patients (Ridding et al., 1995; Dioszeghy et al., 1999) or in healthy subjects (Ziemann et al., 1996a; Ziemann et al., 1997), which is in agreement with the results of our study.

In addition to the above-mentioned pathway, reward-related activity may affect the excitability of M1 through its connections to other secondary motor and non-motor brain regions that receive reward-related information, such as the prefrontal, orbitofrontal, anterior cingulate and supplementary motor areas, amygdala, and nucleus accumbens (Svensson et al., 1995; Schultz et al., 2000; Gottfried et al., 2003; Williams et al., 2004).

We conclude that M1, like other frontal regions implicated in reward processing, receives reward-related signals. Striatal dopamine may play an important role in reward-related motor learning (Wickens et al., 2003) and in the induction of corticostriatal synaptic plasticity (Calabresi et al., 2007). In animal studies, dopamine in M1 (Molina-Luna et al., 2009) and/or in the striatum (Centonze et al., 2003; Davis et al., 2007) is essential for motor learning. In humans as well, dopamine is important not only for the development of M1 plasticity (Ueki et al., 2006), but also for its enhancement (Nitsche et al., 2006; Lang et al., 2008; Rodrigues et al., 2008; Rizzo et al., 2009). The excitability changes in M1 induced by the momentary reward may be related to reward-related motor activity at the cortical level or may reflect its occurrence at the striatal level.

Acknowledgments

This study was partly supported by the Strategic Research Program for Brain Sciences (SRPBS) for T.M. from the MEXT of Japan, and by Grant-in-Aid for Scientific Research (C) 21613003 for T.M. from the Japan Society for the Promotion of Science.