What is the simplest model that can account for high-fidelity imitation? | Behavioral and Brain Sciences | Cambridge Core

Abstract

What inductive biases must be incorporated into multi-agent artificial intelligence models to get them to capture high-fidelity imitation? We think very little is needed. In the right environments, both instrumental- and ritual-stance imitation can emerge from generic learning mechanisms operating on non-deliberative decision architectures. In this view, imitation emerges from trial-and-error learning and does not require explicit deliberation.
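The claim that copying can arise from trial-and-error learning alone can be illustrated with a toy sketch. This is not the authors' model. It assumes a single-step task, a perfectly reliable demonstrator, and a tabular value learner with no planning or lookahead; the function names (`train_copier`, `greedy_policy`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def train_copier(n_actions=3, episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Model-free learner whose only observation is the demonstrator's action.

    Assumption: the demonstrator always takes the currently rewarded action.
    The learner is never told this; it only receives reward feedback.
    """
    rng = random.Random(seed)
    # q[o][a]: estimated reward for taking action a after observing action o.
    q = [[0.0] * n_actions for _ in range(n_actions)]
    for _ in range(episodes):
        rewarded = rng.randrange(n_actions)  # hidden from the learner
        observed = rewarded                  # what the demonstrator does
        if rng.random() < epsilon:
            # Exploration: try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploitation: take the action with the highest learned value.
            action = max(range(n_actions), key=lambda a: q[observed][a])
        reward = 1.0 if action == rewarded else 0.0
        # Incremental value update; no deliberation or model of the other agent.
        q[observed][action] += alpha * (reward - q[observed][action])
    return q


def greedy_policy(q):
    """Return, for each observed action, the learner's preferred action."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    print(greedy_policy(train_copier()))
```

Under these assumptions, the learned greedy policy maps each observed action to the same action: the learner ends up copying the demonstrator even though nothing in the update rule refers to imitation. If the demonstrator were unreliable or the reward did not depend on its choice, the same learner would have no reason to copy, which is the sense in which the environment, rather than a built-in bias, does the work.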
