
Imitation as a model-free process in human reinforcement learning

By Anis Najar, Emmanuelle Bonnet, Bahador Bahrami, Stefano Palminteri

Posted 08 Oct 2019
bioRxiv DOI: 10.1101/797407

While there is no doubt that social signals affect human reinforcement learning, there is still no consensus about their exact computational implementation. To address this issue, we compared three hypotheses about the algorithmic implementation of imitation in human reinforcement learning. The first hypothesis, decision biasing, postulates that imitation consists of transiently biasing the learner's action selection without affecting her value function. According to the second hypothesis, model-based imitation, the learner infers the demonstrator's value function through inverse reinforcement learning and uses it for action selection. Finally, according to the third hypothesis, value shaping, the demonstrator's actions directly affect the learner's value function. We tested these three psychologically plausible hypotheses in two separate experiments (N = 24 and N = 44) featuring a new variant of a social reinforcement learning task, in which we manipulated both the quantity and the quality of the demonstrator's choices. We show through model comparison that value shaping is favored, which provides a new perspective on how imitation is integrated into human reinforcement learning.
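The contrast between the hypotheses can be made concrete with a small sketch. The code below is an illustration only: the parameter names (`alpha`, `beta`, `kappa`, `bias`) and the exact update rules are assumptions for exposition, not the fitted models from the paper. It shows the key algorithmic difference: under value shaping a demonstrated action changes the learner's stored values, whereas under decision biasing the values stay untouched and only the momentary choice probabilities shift.

```python
import numpy as np

def softmax(q, beta):
    """Softmax action probabilities with inverse temperature beta."""
    z = beta * q
    z = z - z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()

def value_shaping_update(q, demo_action, kappa=0.3):
    """Value shaping (illustrative): observing the demonstrator's
    choice directly nudges the learner's value for that action,
    as if it were a reward-like teaching signal. kappa is a
    hypothetical imitation learning rate."""
    q = q.copy()
    q[demo_action] += kappa * (1.0 - q[demo_action])
    return q

def decision_biasing_policy(q, demo_action, beta=3.0, bias=0.2):
    """Decision biasing (illustrative): the value function is left
    unchanged; the demonstrated action only receives a transient
    boost at choice time, here an additive bump on the softmax
    probabilities, renormalized."""
    p = softmax(q, beta)
    p = p.copy()
    p[demo_action] += bias
    return p / p.sum()

# Two-armed task, values initialized at zero.
q = np.zeros(2)

# Value shaping: the demonstrated action (1) now has a lasting
# value advantage that persists into future trials.
q_shaped = value_shaping_update(q, demo_action=1)

# Decision biasing: values are still all zero, but the current
# choice probabilities favor the demonstrated action (0).
p_biased = decision_biasing_policy(q, demo_action=0)
```

Model-based imitation would instead maintain a separate estimate of the demonstrator's value function (inferred by inverse reinforcement learning) and combine it with the learner's own values at choice time; a faithful sketch of that requires the paper's model specification, so it is omitted here.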

Download data

  • Downloaded 446 times
  • Download rankings, all-time:
    • Site-wide: 47,809 out of 114,303
    • In neuroscience: 7,419 out of 18,529
  • Year to date:
    • Site-wide: 38,586 out of 114,303
  • Since beginning of last month:
    • Site-wide: 38,272 out of 114,303
