What is the simplest model that can account for high-fidelity imitation? | Behavioral and Brain Sciences | Cambridge Core

Abstract

What inductive biases must be incorporated into multi-agent artificial intelligence models to get them to capture high-fidelity imitation? We think very little is needed. In the right environments, both instrumental- and ritual-stance imitation can emerge from generic learning mechanisms operating on non-deliberative decision architectures. In this view, imitation emerges from trial-and-error learning and does not require explicit deliberation.
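As a rough illustration of the claim that imitation can arise from trial-and-error learning alone, the toy sketch below trains a tabular, model-free learner that observes a demonstrator's choice and is rewarded only when its own choice matches a hidden goal. This is not the authors' model: the environment, the function names `train_copier` and `greedy_policy`, and all parameter values are assumptions made for illustration. The learner has no planning step and no built-in tendency to copy; copying appears only because it happens to be rewarded in this environment.

```python
import random


def train_copier(n_arms=4, episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Train a tabular value learner that sees a demonstrator's action.

    Hypothetical toy setup: each episode has a hidden goal arm. The
    demonstrator knows the goal and picks it; the learner sees only the
    demonstrator's pick and is rewarded if its own pick equals the goal.
    """
    rng = random.Random(seed)
    # q[observed][action]: estimated reward for taking `action` after
    # seeing the demonstrator take `observed`.
    q = [[0.0] * n_arms for _ in range(n_arms)]
    for _ in range(episodes):
        goal = rng.randrange(n_arms)  # hidden from the learner
        observed = goal               # the demonstrator's (correct) choice
        if rng.random() < epsilon:
            # Occasional random exploration.
            action = rng.randrange(n_arms)
        else:
            # Greedy choice with random tie-breaking; no lookahead or model.
            best = max(q[observed])
            action = rng.choice(
                [a for a in range(n_arms) if q[observed][a] == best]
            )
        reward = 1.0 if action == goal else 0.0
        # Incremental trial-and-error update toward the received reward.
        q[observed][action] += alpha * (reward - q[observed][action])
    return q


def greedy_policy(q):
    """Return the highest-valued action for each observed demonstrator action."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    print(greedy_policy(train_copier()))
```

In this sketch the learned greedy policy maps each observed demonstrator action to the same action, so the learner ends up copying without any explicit imitation mechanism. Whether the same holds in richer multi-agent environments is the empirical question the abstract raises.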
