In 1975, Noam Chomsky and Jean Piaget held a historic debate about the nature of human cognition. Chomsky held that babies are born with a set of in-built rules and instincts that help them build up the knowledge they need to navigate the world; Piaget argued that babies are effectively blank slates that acquire knowledge by experiencing the world (including the knowledge that there is such a thing as “experience” and “the world”).
For most of AI’s history, Chomsky’s approach prevailed: computer scientists painstakingly tried to equip computers with a baseline of knowledge about the relationships between things in the world, hoping that computers would someday build up from this base to construct complex, powerful reasoning systems.
The current machine learning revolution can be traced to a jettisoning of this approach in favor of a Piaget-style blank slate, where layers of neural nets are trained on massive corpuses of data (sometimes labeled by hand, but often completely unlabeled) and use equally massive computation to make sense of the data, creating their own understanding of the world.
Piaget-style deep learning has taken AI a long way in a short time, but it’s hitting a wall. It’s not just the weird and vastly entertaining local optima that these systems get stuck in: it’s the huge corpuses of data needed to train them, and the inability of machine learning models to generalize, so that one trained model can’t be used to bootstrap another and another.
The fall-off in the rate of progress in machine learning, combined with the excitement that ML’s recent gains provoked, has breathed new life into the Chomskyian approach to ML, and computer scientists all over the world are trying to create “common sense” corpuses of knowledge that they can imbue machine learning systems with before they are exposed to training data.
This approach seems to be clearing some of the walls that stopped Piaget-style ML: some Chomskyian ML models have attained a high degree of efficiency with much smaller training data sets.
Frequent Boing Boing contributor Clive Thompson’s long piece on the state of the Chomsky/Piaget debate in ML is an excellent read, and it comes to the (retrospectively) obvious conclusion: it doesn’t really matter whether Chomsky or Piaget is right about how kids learn, because each of them is right about how computers learn: a little from Column A, a little from Column B.
But a bit of hand coding could be how you replicate some of the built-in knowledge that, according to the Chomskyite view, human brains possess. That’s what Dileep George and the Vicarious researchers did with Breakout. To create an AI that wouldn’t get stumped by changes to the layout of the game, they abandoned deep learning and built a system that included hard-coded basic assumptions. Without too much trouble, George tells me, their AI learned “that there are objects, and there are interactions between objects, and that the motion of one object can be causally explained between the object and something else.” As it played Breakout, the system developed the ability to weigh different courses of action and their likely outcomes. This worked in reverse too. If the AI wanted to break a block in the far left corner of the screen, it reasoned to put the paddle in the far right corner. Crucially, this meant that when Vicarious changed the layout of the game—adding new bricks or raising the paddle—the system compensated. It appeared to have extracted some general understanding about Breakout itself.
Granted, there are trade-offs in this type of AI engineering. It’s arguably more painstaking to craft and takes careful planning to figure out precisely what foreordained logic to feed into the system. It’s also hard to strike the right balance of speed and accuracy when designing a new system. George says he looks for the minimum set of data “to put into the model so it can learn quickly.” The fewer assumptions you need, the more efficiently the machine will make decisions. Once you’ve trained a deep-learning model to recognize cats, you can show it a Russian blue it has never seen and it renders the verdict—it’s a cat!—almost instantaneously. Having processed millions of photos, it knows not only what makes a cat a cat but also the fastest way to identify one. In contrast, Vicarious’ style of AI is slower, because it’s actively making logical inferences as it goes.
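To make the reverse-reasoning idea in that passage concrete, here is a minimal toy sketch in Python. It is emphatically not Vicarious’s actual system: every function name and number below is an illustrative assumption. The point is only to show the shape of the approach Thompson describes, where the agent is handed a tiny hard-coded causal model of a Breakout-like game (a ball is an object, it deflects off the paddle, the deflection depends on where it hits) and plans backward from a goal brick to a paddle placement by trying candidates inside that model.

```python
# Toy illustration (not Vicarious's code) of planning with a hard-coded causal model.

def simulate_bounce(ball_x, ball_vx, paddle_x):
    """Hard-coded 'common sense' physics: the ball's outgoing horizontal
    velocity depends on where it strikes the paddle (offset from center)."""
    offset = ball_x - paddle_x          # where on the paddle the ball lands
    return ball_vx + 2.0 * offset       # crude deflection rule (assumed)

def landing_x_after_bounce(ball_x, ball_vx, paddle_x, flight_time=1.0):
    """Predict where the ball ends up one 'flight' after the bounce."""
    new_vx = simulate_bounce(ball_x, ball_vx, paddle_x)
    x = ball_x + new_vx * flight_time
    return min(max(x, 0.0), 1.0)        # clamp to a playfield spanning [0, 1]

def plan_paddle_position(ball_x, ball_vx, target_brick_x, candidates=None):
    """Reason in reverse: try candidate paddle placements in the internal
    model and keep the one predicted to send the ball toward the target."""
    if candidates is None:
        candidates = [i / 20 for i in range(21)]   # paddle positions 0.0 .. 1.0
    return min(
        candidates,
        key=lambda p: abs(landing_x_after_bounce(ball_x, ball_vx, p) - target_brick_x),
    )

if __name__ == "__main__":
    # Ball falling near the middle, goal brick in the far left corner:
    # the planner picks a paddle placement to the right of center, echoing
    # the "far left brick, far right paddle" example in the article.
    best = plan_paddle_position(ball_x=0.5, ball_vx=0.0, target_brick_x=0.0)
    print(f"place the paddle near x = {best:.2f}")
```

Run as-is under these made-up physics, it prints a paddle position to the right of center for a goal brick on the far left, which is the “reason in reverse” behavior the quote describes. The trade-off Thompson notes also shows up here: the planner has to simulate its candidates at decision time instead of firing off a single learned forward pass.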
How to Teach Artificial Intelligence Some Common Sense [Clive Thompson/Wired]