
Turing Award Winners Pioneered Reinforcement Learning Method
Teaching machines the way animal trainers mold the behavior of dogs or horses has been an important method for developing artificial intelligence, and it has now been recognized with the top computer science award, the A.M. Turing Award.
The winners of this year's prestigious prize are Andrew Barto and Richard Sutton, pioneers in the field of reinforcement learning.
A Groundbreaking Approach to AI Development
Research that Barto, 76, and Sutton, 67, began in the late 1970s paved the way for some of the past decade's AI breakthroughs.
Their work centered on creating so-called “hedonistic” machines that could continuously adapt their behavior in response to positive signals.
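As a rough illustration of that idea, and not the laureates' actual algorithm, the short Python sketch below shows a toy agent that nudges its estimate of each action's value toward the reward it just received and then favors the action that has paid off best. Everything named here (`RewardSeekingAgent`, `toy_reward`, the step size and the exploration probability) is a hypothetical choice made for this example.

```python
import random


class RewardSeekingAgent:
    """Toy agent that keeps a running value estimate for each action."""

    def __init__(self, n_actions, step_size=0.1, explore_prob=0.1, seed=0):
        self.values = [0.0] * n_actions   # estimated payoff per action
        self.step_size = step_size        # how strongly one reward shifts an estimate
        self.explore_prob = explore_prob  # chance of trying a random action
        self.rng = random.Random(seed)    # seeded so runs are repeatable

    def best_action(self):
        # Index of the action with the highest current estimate.
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def choose(self):
        # Mostly exploit the best-known action, occasionally explore.
        if self.rng.random() < self.explore_prob:
            return self.rng.randrange(len(self.values))
        return self.best_action()

    def learn(self, action, reward):
        # Move the estimate a small step toward the reward just received.
        self.values[action] += self.step_size * (reward - self.values[action])


def toy_reward(action):
    # Hypothetical environment: action 1 is rewarded, action 0 is penalized.
    return 1.0 if action == 1 else -1.0


def train(steps=500):
    agent = RewardSeekingAgent(n_actions=2)
    for _ in range(steps):
        action = agent.choose()
        agent.learn(action, toy_reward(action))
    return agent


if __name__ == "__main__":
    trained = train()
    print("preferred action:", trained.best_action())
    print("value estimates:", trained.values)
```

In this toy setting the agent ends up preferring the rewarded action. Real reinforcement learning systems face far harder problems, such as rewards that arrive long after the actions that earned them.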
A Key Technique in AI Breakthroughs
Reinforcement learning led a Google computer program to beat the world's best human players of the ancient Chinese board game Go in 2016 and 2017.
This technique has also been key in improving popular AI tools like ChatGPT, optimizing financial trading, and helping a robotic hand solve a Rubik's Cube.
A Challenging Beginning
Barto said the field was “not fashionable” when he and Sutton began crafting their theories and algorithms at the University of Massachusetts, Amherst.
“We were kind of in the wilderness,” Barto said in an interview with The Associated Press. “Which is why it’s so gratifying to receive this award, to see this becoming more recognized as something relevant and interesting.”
A Pioneering Achievement
The A.M. Turing Award is sponsored by Google and comes with a $1 million prize.
Barto and Sutton aren't the first AI pioneers to win the award named after British mathematician, codebreaker, and early AI thinker Alan Turing.
Legacy of Reinforcement Learning
Their research has directly sought to answer Turing's 1947 call for a machine that “can learn from experience” — which Sutton describes as “arguably the essential idea of reinforcement learning.”
In particular, they borrowed from ideas in psychology and neuroscience about the way that pleasure-seeking neurons respond to rewards or punishment.
A Landmark Paper and Textbook
In a landmark paper, Barto and Sutton applied their new approach to a specific task in a simulated world: balancing a pole on a moving cart to keep it from falling.
They later co-authored a widely used textbook on reinforcement learning, a technique that remains a central pillar of the AI boom.
A Recognition of Their Work
Google's chief scientist, Jeff Dean, praised their work, saying that the tools they developed have produced major advances and driven billions of dollars in investments.
Divergent Views on AI Risks
In a joint interview with the AP, Barto and Sutton didn't always agree on how to evaluate the risks of AI agents that are constantly seeking to improve themselves.
Barto expressed concerns about potential unexpected consequences, while Sutton dismissed what he described as overblown concerns about AI's threat to humanity.
Future of Human Intelligence
Sutton expects a future with beings of greater intelligence than current humans — an idea sometimes known as posthumanism.
Barto describes himself as a Luddite, while Sutton embraces this future and sees people as amazing, wonderful machines that could work better.