These AI agents played millions of games of hide-and-seek to learn the best strategies for using different tools (ramps and barricades).
From OpenAI:
We’ve observed agents discovering progressively more complex tool suse while playing a simple game of hide-and-seek. Through training in our new simulated hide-and-seek environment, agents build a series of six distinct strategies and counterstrategies, some of which we did not know our environment supported. The self-supervised emergent complexity in this simple environment further suggests that multi-agent co-adaptation may one day produce extremely complex and intelligent behavior.