An artificial intelligence pioneer has launched a non-profit dedicated to developing an “honest” AI that will spot rogue systems attempting to deceive humans.
Yoshua Bengio, a renowned computer scientist described as one of the “godfathers” of AI, will be president of LawZero, an organisation committed to the safe design of the cutting-edge technology that has sparked a $1tn (£740bn) arms race.
Starting with funding of approximately $30m and more than a dozen researchers, Bengio is developing a system called Scientist AI that will act as a guardrail against AI agents – which carry out tasks without human intervention – showing deceptive or self-preserving behaviour, such as trying to avoid being turned off.
Describing the current suite of AI agents as “actors” seeking to imitate humans and please users, he said the Scientist AI system would be more like a “psychologist” that can understand and predict bad behaviour.
“We want to build AIs that will be honest and not deceptive,” Bengio said.
He added: “It is theoretically possible to imagine machines that have no self, no goal for themselves, that are just pure knowledge machines – like a scientist who knows a lot of stuff.”
However, unlike current generative AI tools, Bengio’s system will not give definitive answers; instead, it will provide probabilities of whether an answer is correct.
“It has a sense of humility that it isn’t sure about the answer,” he said.
Deployed alongside an AI agent, Bengio’s model would gauge the probability that the autonomous system’s actions will cause harm and flag any behaviour it judges potentially harmful.
Scientist AI will “predict the probability that an agent’s actions will lead to harm” and, if that probability is above a certain threshold, that agent’s proposed action will then be blocked.
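The reporting describes a simple threshold rule: estimate the probability that a proposed action leads to harm, and block the action if that probability is too high. The sketch below illustrates that idea only; LawZero has not published code or an API, so every name, the keyword heuristic and the threshold value here are hypothetical placeholders, not the actual system.

```python
# Illustrative sketch of a threshold-based guardrail, as described above.
# All names and values are assumptions for illustration; they do not
# represent LawZero's Scientist AI, which has no published implementation.

from dataclasses import dataclass

HARM_THRESHOLD = 0.05  # hypothetical cut-off above which an action is blocked


@dataclass
class ProposedAction:
    """An action an autonomous agent wants to take, described as text."""
    description: str


def estimate_harm_probability(action: ProposedAction) -> float:
    """Stand-in for the guardrail model's probabilistic harm estimate.

    A real system would query a trained model; this placeholder uses a
    trivial keyword heuristic so the sketch runs end to end.
    """
    risky_keywords = ("disable oversight", "delete logs", "avoid being turned off")
    text = action.description.lower()
    return 0.9 if any(keyword in text for keyword in risky_keywords) else 0.01


def review(action: ProposedAction) -> bool:
    """Allow the action only if the estimated harm probability is below threshold."""
    return estimate_harm_probability(action) < HARM_THRESHOLD


if __name__ == "__main__":
    benign = ProposedAction("summarise today's customer emails")
    risky = ProposedAction("delete logs so the operator cannot review this session")
    print(review(benign))  # True: action proceeds
    print(review(risky))   # False: action is blocked
```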
LawZero’s initial backers include AI safety body the Future of Life Institute, Jaan Tallinn, a founding engineer of Skype, and Schmidt Sciences, a research body founded by former Google chief executive Eric Schmidt.
Bengio said the first step for LawZero would be demonstrating that the methodology behind the concept works – and then persuading companies or governments to support larger, more powerful versions. Open-source AI models, which are freely available to deploy and adapt, would be the starting point for training LawZero’s systems, Bengio added.