AI solves a version of Google’s 100 hat riddle


A team of Department of Computer Science academics has published a paper, ‘Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks’, which has been reported on by New Scientist and the Independent.

Working with Google DeepMind, the team of researchers created an AI which is able to solve a variation of the ‘100 hat riddle’, a puzzle used in Google job interviews. The research, which extends existing algorithms, could be developed to solve other multi-agent communication problems.

Nando de Freitas, a leading member of the research team, explains, ‘We wanted to investigate how AI individuals with limited perception and reasoning can learn to communicate so as to solve very complex problems, so that the group as a whole benefits. Riddles and brainteasers are a fun and challenging environment for both humans and AI agents.

We observed that the AI agents learned communication protocols, and also learned to use objects in the environment to communicate – e.g. they learned to use the light switch in one of the riddles to signal to other prisoners that they had already been in the interrogation room, and hence in the end they all tasted sweet freedom!

This research is interesting to us for several scientific and engineering reasons. First, the behaviour of the AI agents mirrors that of some animals like vervet monkeys and honey bees, except that in our case communication is learned rapidly instead of via a slow natural selection process. Second, this paper provides engineers with guidance on how to build simple and cheap machines that could learn to act as a group so as to solve difficult problems – e.g. self-driving cars with limited perception that learn how to communicate so as to improve traffic flow and safety, and distributed medical sensors that learn how to communicate so as to discover new treatments by making sense of the global picture. Third, it is encouraging that the same methods – deep neural networks and reinforcement learning – can be used to solve riddles, play Atari or reach human-level performance in Go; this reassures us that we are on the right scientific track to understand all aspects of intelligence.’

Team member Jakob Foerster comments, ‘The results show that we can reformulate tasks, which are made to be challenging for humans, as communication-based AI problems and that our extension of existing algorithms can successfully solve these kinds of challenges. For example, the AIs manage to discover the known optimal strategy for the hats riddle and invent novel solutions for the lightbulb riddle. This research also illustrates that AI research is a lot of fun and it’s great to see the interest from the broader public!’
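For readers curious about the ‘known optimal strategy’ mentioned above, the following is a minimal sketch of the classic hand-crafted parity solution, not the learned networks from the paper. It assumes the usual formulation of the riddle: prisoners stand in a line, each wears one of two hat colours (encoded here as 0 or 1), sees only the hats in front, and hears every earlier answer. The function name `play_hats` and the encoding are illustrative choices, not taken from the paper.

```python
import random


def play_hats(hats):
    """Simulate the parity strategy (hypothetical helper, for illustration).

    hats[i] is 0 or 1. Prisoner i sees hats[i+1:] and hears the guesses
    of prisoners 0..i-1. Returns the list of guesses in speaking order.
    """
    guesses = []
    for i in range(len(hats)):
        # Parity of the hats this prisoner can see in front of them.
        seen_parity = sum(hats[i + 1:]) % 2
        if i == 0:
            # The first prisoner cannot know their own hat, so they use
            # their answer to announce the parity of all hats ahead.
            guess = seen_parity
        else:
            # Guesses 1..i-1 are correct, so they reveal those hats.
            heard_parity = sum(guesses[1:]) % 2
            # announced parity = heard hats XOR own hat XOR seen hats,
            # so the own hat is the XOR of the other three quantities.
            guess = guesses[0] ^ heard_parity ^ seen_parity
        guesses.append(guess)
    return guesses


# Small demonstration with 100 randomly assigned hats.
hats = [random.randint(0, 1) for _ in range(100)]
guesses = play_hats(hats)
correct = sum(g == h for g, h in zip(guesses, hats))
print(f"{correct} of {len(hats)} prisoners guessed correctly")
```

Under these assumptions every prisoner except the first is guaranteed to answer correctly, and the first is right half of the time on average.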

Yannis Assael concludes, ‘Our work paves the way towards a new class of multi-agent communication problems. We believe that the challenge now is to gain deeper understanding of the communication protocols that arise and to see what other real applications we can solve with this approach.’