Balancing control: a Bayesian interpretation of habitual and goal-directed behavior

By Sarah Schwöbel, Dimitrije Markovic, Michael N. Smolka, Stefan J. Kiebel

Posted 09 Nov 2019
bioRxiv DOI: 10.1101/836106

In everyday life, our behavior varies on a continuum from automatic and habitual to deliberate and goal-directed. Recent evidence suggests that habit formation and relearning of habits operate in a context-dependent manner: habit formation is promoted when actions are performed in a specific context, while breaking habits is facilitated after a context change. It is an open question how one can computationally model the brain’s balancing between context-specific habits and goal-directed actions. Here, we propose a hierarchical Bayesian approach for control of a partially observable Markov decision process that enables conjoint learning of habits and reward structure in a context-specific manner. In this model, habit learning corresponds to an updating of priors over policies and interacts with the learning of the outcome contingencies. Importantly, the model is built solely on probabilistic inference, which provides a simple explanation of how the brain may balance contributions of habitual and goal-directed control. We illustrate the resulting behavior using agent-based simulated experiments, in which we replicate several findings of devaluation, extinction, and renewal experiments, as well as the so-called two-step task, which is typically used with human participants. In addition, we show how a single parameter, the habitual tendency, can explain individual differences in habit learning and the balancing between habitual and goal-directed control. Finally, we discuss the link of the proposed model to other habit learning models and implications for understanding specific phenomena in substance use disorder.

Competing Interest Statement: The authors have declared no competing interest.
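The abstract's central idea, that habit learning is an update of a prior over policies which is then combined with goal-directed evidence, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes, beyond what the abstract states, that the prior is a Dirichlet distribution over policies with pseudo-counts, that the habitual tendency sets the initial pseudo-counts as its inverse, and that the goal-directed term is a fixed, hypothetical likelihood. All function and parameter names are illustrative.

```python
import numpy as np


def policy_prior(counts):
    """Expected prior over policies under a Dirichlet with these pseudo-counts."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()


def policy_posterior(counts, likelihood):
    """Combine the habitual prior with a goal-directed likelihood via Bayes' rule."""
    unnormalized = policy_prior(counts) * np.asarray(likelihood, dtype=float)
    return unnormalized / unnormalized.sum()


def train_habit(habitual_tendency, n_repeats, policy=0, n_policies=2):
    """Repeat one policy n_repeats times and return the updated pseudo-counts.

    Assumption: initial pseudo-counts equal 1 / habitual_tendency, so a high
    habitual tendency means a weak initial prior that repetition quickly reshapes.
    """
    counts = np.full(n_policies, 1.0 / habitual_tendency)
    counts[policy] += n_repeats  # each repetition adds one pseudo-count
    return counts


# Two hypothetical agents repeat policy 0 ten times.
strong = train_habit(habitual_tendency=1.0, n_repeats=10)   # counts [11, 1]
weak = train_habit(habitual_tendency=0.01, n_repeats=10)    # counts [110, 100]

# After training, the outcome contingencies change so that policy 1 is now
# better supported by goal-directed evidence (hypothetical likelihood values).
likelihood_after_change = [0.2, 0.8]

print("strong habit learner:", policy_posterior(strong, likelihood_after_change))
print("weak habit learner:  ", policy_posterior(weak, likelihood_after_change))
```

Under these assumptions, the agent with the high habitual tendency still favors the trained policy after the change (posterior of about 0.73 for policy 0), while the agent with the low habitual tendency follows the goal-directed evidence (about 0.22 for policy 0). The balance between the two modes of control falls out of the normalization in Bayes' rule, with no separate arbitration mechanism.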

