AI pioneers who channeled 'hedonistic' machines win computer science's top prize

Teaching machines in the way that animal trainers mold the behavior of dogs or horses has been an important method for developing artificial intelligence and one that was recognized Wednesday with the top computer science award.

Two pioneers in the field of reinforcement learning, Andrew Barto and Richard Sutton, are the winners of this year's A.M. Turing Award, the tech world's equivalent of the Nobel Prize.

Research that Barto, 76, and Sutton, 67, began in the late 1970s paved the way for some of the past decade's AI breakthroughs. At the heart of their work was channeling so-called “hedonistic” machines that could continuously adapt their behavior in response to positive signals.

Reinforcement learning is what led a Google computer program to beat the world's best human players of the ancient Chinese board game Go in 2016 and 2017. It's also been a key technique in improving popular AI tools like ChatGPT, optimizing financial trading and helping a robotic hand solve a Rubik's Cube.

But Barto said the field was “not fashionable” when he and his doctoral student, Sutton, began crafting their theories and algorithms at the University of Massachusetts, Amherst.

“We were kind of in the wilderness,” Barto said in an interview with The Associated Press. “Which is why it’s so gratifying to receive this award, to see this becoming more recognized as something relevant and interesting. In the early days, it was not.”

Google sponsors the annual $1 million prize, which was announced Wednesday by the Association for Computing Machinery.

Barto, now retired from the University of Massachusetts, and Sutton, a longtime professor at Canada's University of Alberta, aren't the first AI pioneers to win the award named after British mathematician, codebreaker and early AI thinker Alan Turing. But their research has directly sought to answer Turing's 1947 call for a machine that “can learn from experience” — which Sutton describes as “arguably the essential idea of reinforcement learning.”

In particular, they borrowed from ideas in psychology and neuroscience about the way that pleasure-seeking neurons respond to rewards or punishment. In one landmark paper published in the early 1980s, Barto and Sutton applied their new approach to a specific task in a simulated world: balancing a pole on a moving cart to keep it from falling. The two computer scientists later co-authored a widely used textbook on reinforcement learning.

“The tools they developed remain a central pillar of the AI boom and have rendered major advances, attracted legions of young researchers, and driven billions of dollars in investments,” said Google’s chief scientist Jeff Dean in a written statement.

In a joint interview with the AP, Barto and Sutton didn't always agree on how to evaluate the risks of AI agents that are constantly seeking to improve themselves. They also distinguished their work from the branch of generative AI technology that is currently in fashion — the large language models behind chatbots made by OpenAI, Google and other tech giants that mimic human writing and other media.

“The big choice is, do you try to learn from people’s data, or do you try to learn from an (AI) agent’s own life and its own experience?” Sutton said.

Sutton has dismissed what he describes as overblown concerns about AI's threat to humanity, while Barto disagreed, saying, “You have to be cognizant of potential unexpected consequences.”

Barto, retired for 14 years, describes himself as a Luddite, while Sutton is embracing a future he expects to have beings of greater intelligence than current humans — an idea sometimes known as posthumanism.

“People are machines. They’re amazing, wonderful machines,” but they are also not the “end product” and could work better, Sutton said.

“It’s intrinsically a part of the AI enterprise,” Sutton said. “We’re trying to understand ourselves and, of course, to make things that can work even better. Maybe to become such things.”

03/05/2025 07:12 -0500
