
Almost 2,000 years before ChatGPT was invented, two men had a debate that can teach us a lot about AI’s future. Their names were Eliezer and Yoshua.
No, I’m not talking about Eliezer Yudkowsky, who recently published a bestselling book claiming that AI is going to kill everyone, or Yoshua Bengio, the “godfather of AI” and most cited living scientist in the world — though I did discuss the 2,000-year-old debate with both of them. I’m talking about Rabbi Eliezer and Rabbi Yoshua, two ancient sages from the first century.
According to a famous story in the Talmud, the central text of Jewish law, Rabbi Eliezer was adamant that he was right about a certain legal question, but the other sages disagreed. So Rabbi Eliezer performed a bunch of miraculous feats intended to prove that God was on his side. He made a carob tree uproot itself and scurry away. He made a stream run backward. He made the walls of the study hall begin to cave in. Finally, he declared: If I’m right, a voice from the heavens will prove it!
What do you know? A heavenly voice came booming down to announce that Rabbi Eliezer was right. Still, the sages were unimpressed. Rabbi Yoshua insisted: “The Torah is not in heaven!” In other words, when it comes to the law, it doesn’t matter what any divine voice says — only what humans decide. Since a majority of sages disagreed with Rabbi Eliezer, he was overruled.
Key takeaways
- Experts talk about aligning AI with human values. But “solving alignment” doesn’t mean much if it yields AI that leads to the loss of human agency.
- True alignment would require grappling not just with technical problems, but with a major philosophical problem: Having the agency to make choices is a big part of how we create meaning, so building an AI that decides everything for us may rob us of the meaning of life.
- Philosopher of religion John Hick spoke about “epistemic distance,” the idea that God intentionally stays out of human affairs to a degree, so that we can be free to develop our own agency. Perhaps the same should hold true for an AI.
Fast-forward 2,000 years and we’re having essentially the same debate — just replace “divine voice” with “AI god.”
Today, the AI industry’s biggest players aren’t just trying to build a helpful chatbot, but a “superintelligence” that is vastly smarter than humans and unimaginably powerful. This shifts the goalposts from building a handy tool to building a god. When OpenAI CEO Sam Altman says he’s making “magic intelligence in the sky,” he doesn’t just have in mind ChatGPT as we know it today; he envisions “nearly-limitless intelligence” that can achieve “the discovery of all of physics” and then some. Some AI researchers hypothesize that superintelligence would end up making major decisions for humans — either acting autonomously or through humans who feel compelled to defer to its superior judgment.
As we work toward superintelligence, AI companies acknowledge, we’ll need to solve the “alignment problem” — how to get AI systems to reliably do what humans really want them to do, or align them with human values. But their commitment to solving that problem obscures a bigger issue.
Yes, we want companies to stop AIs from acting in harmful, biased, or deceitful ways. But treating alignment as a technical problem isn’t enough, especially as the industry’s ambition shifts to building a god. That ambition requires us to ask: Even if we can somehow build an all-knowing, supremely powerful machine, and even if we can somehow align it with moral values so that it’s also deeply good… should we? Or is it just a bad idea to build an AI god — no matter how perfectly aligned it is on the technical level — because it would squeeze out space for human choice and thus render human life meaningless?
I asked Eliezer Yudkowsky and Yoshua Bengio whether they agree with their ancient namesakes. But before I tell you whether they think an AI god is desirable, we need to talk about a more basic question: Is it even possible?
Can you align superintelligent AI with human values?
God is supposed to be good — everyone knows that. But how do we make an AI good? That, nobody knows.
Early attempts at solving the alignment problem have been painfully simplistic. Companies like OpenAI and Anthropic tried to make their chatbots helpful and harmless, but didn’t flesh out exactly what that’s supposed to look like. Is it “helpful” or “harmful” for a chatbot to, say, engage in endless hours of romantic roleplay with a user? To facilitate cheating on schoolwork? To offer free, but dubious, therapy and ethical advice?
Most AI engineers are not trained in moral philosophy, and many didn’t realize how little of it they understood. So they gave their chatbots only the most superficial sense of ethics — and soon, problems abounded, from bias and discrimination to tragic suicides.
But the truth is, there’s no one clear understanding of the good, even among experts in ethics. Morality is notoriously contested: Philosophers have come up with many different moral theories, and despite arguing over them for millennia, there’s still no consensus about which (if any) is the “right” one.
Even if all of humanity magically agreed on the same moral theory, we’d still be stuck with a problem, because our view of what’s moral shifts over time, and sometimes it’s actually good to break the rules. For example, we generally think it’s right to follow society’s laws, but when Rosa Parks illegally refused to give up her bus seat to a white passenger in 1955, it helped galvanize the civil rights movement — and we consider her action admirable. Context matters.
Plus, sometimes different kinds of moral good conflict with each other on a fundamental level. Think of a woman who faces a trade-off: She wants to become a nun but also wants to become a mother. What’s the better decision? We can’t say, because the options are incommensurable. There’s no single yardstick by which to measure them, so we can’t compare them to find out which is greater.
“Probably we are creating an AI that will systematically fall silent. But that’s what we want.”
Thankfully, some AI researchers are realizing that they have to give AIs a more complex, pluralistic picture of ethics — one that recognizes that humans have many values and our values are often in tension with each other.
Some of the most sophisticated work on this is coming out of the Meaning Alignment Institute, which researches how to align AI with what people value. When I asked co-lead Joe Edelman if he thinks aligning superintelligent AI with human values is possible, he didn’t hesitate.
“Yes,” he answered. But he added that an important part of that is training the AI to say “I don’t know” in certain cases.
“If you’re allowed to train the AI to do that, things get much easier, because in contentious situations, or situations of real moral confusion, you don’t have to have an answer,” Edelman said.
He cited the contemporary philosopher Ruth Chang, who has written about “hard choices” — choices that are genuinely hard because no best option exists, like the case of the woman who wants to become a nun but also wants to become a mother. When you face competing, incomparable goods like these, you can’t “discover” which one is objectively best — you just have to choose which one you want to put your human agency behind.
“If you get [the AI] to know which are the hard choices, then you’ve taught it something about morality,” Edelman said. “So, that counts as alignment, right?”
Well, to a degree. It’s definitely better than an AI that doesn’t understand there are choices where no best option exists. But so many of the most important moral choices involve values that are on a par. If we create a carve-out for those choices, are we really solving alignment in any meaningful sense? Or are we just creating an AI that will systematically fall silent on all the important stuff?
“Probably we are creating an AI that will systematically fall silent,” Chang said when I put the question to her directly. “It’ll say ‘Red flag, red flag, it’s a hard choice — humans, you’ve got to have input!’ But that’s what we want.” The other possibility — empowering an AI to do a lot of our most important decision-making for us — strikes her as “a terrible idea.”
Contrast that with Yudkowsky. He’s the arch-doomer of the AI world, and he has probably never been accused of being too optimistic. Yet he’s actually surprisingly optimistic about alignment: He believes that aligning a superintelligence is possible in principle. He thinks it’s an engineering problem we currently have no idea how to solve — but he still thinks that, at bottom, it’s just an engineering problem. And once we solve it, we should put the superintelligence to broad use.
In his book, co-written with Nate Soares, he argues that we should be “augmenting humans to make them smarter” so they can figure out a better paradigm for building AI, one that would allow for true alignment. I asked him what he thinks would happen if we got enough super-smart and super-good people in a room and tasked them with building an aligned superintelligence.
“Probably we all live happily ever after,” Yudkowsky said.
In his ideal world, we would ask the people with augmented intelligence not to program their own values into an AI, but to build what Yudkowsky calls “coherent extrapolated volition” — an AI that can peer into every living human’s mind and extrapolate what we would want done if we knew everything the AI knew. (How would this work? Yudkowsky writes that the superintelligence could have “a complete readout of your brain-state” — which sounds an awful lot like hand-wavy magic.) It would then use this knowledge to basically run society for us.
I asked him if he’d be comfortable with this superintelligence making decisions with major moral consequences, like whether to drop a bomb. “I think I’m broadly okay with it,” Yudkowsky said, “if 80 percent of humanity would be 80 percent coherent with respect to what they would want if they knew everything the superintelligence knew.” In other words, if most of us are in favor of some action and we’re in favor of it fairly strongly and consistently, then the AI should do that action.
A major problem with that, however, is that it could lead to a “tyranny of the majority,” where perfectly legitimate minority views get squeezed out. That’s already a concern in modern democracies (though we’ve developed mechanisms that partially address it, like embedding fundamental rights in constitutions that majorities can’t easily override).
But an AI god would crank up the “tyranny of the majority” concern to the max, because it would potentially be making decisions for the entire global population, forevermore.
That’s the picture of the future presented by influential philosopher Nick Bostrom, who was himself drawing on a larger set of ideas from the transhumanist tradition. In his bestselling 2014 book, Superintelligence, he imagined “a machine superintelligence that will shape all of humanity’s future.” It could do everything from managing the economy to reshaping global politics to initiating an ongoing process of space colonization. Bostrom argued there would be advantages and disadvantages to that setup, but one glaring issue is that the superintelligence could determine the shape of all human lives everywhere, and could enjoy a permanent concentration of power. If you didn’t like its decisions, you would have no recourse, no escape. There would be nowhere left to run.
Obviously, if we build a system that’s practically omniscient and omnipotent and it runs our civilization, that would pose an unprecedented threat to human autonomy. Which forces us to ask…
Is an AI god desirable?
Yudkowsky grew up in the Orthodox Jewish world, so I figured he might know the Talmud story about Rabbi Eliezer and Rabbi Yoshua. And, sure enough, he remembered it perfectly as soon as I brought it up.
I noted that the point of the story is that even if you’ve got the most “aligned” superintelligent adviser ever — a literal voice from God! — you shouldn’t do whatever it tells you.
But Yudkowsky, true to his ancient namesake, made it clear that he wants a superintelligent AI. Once we figure out how to build it safely, he thinks we should absolutely build it, because it can help humanity resettle in another solar system before our sun dies and destroys our planet.
“There’s literally nothing else our species can bet on in terms of how we eventually end up colonizing the galaxies,” he told me.
Did he not worry about the point of the story — that preserving space for human agency is a crucial value, one we shouldn’t be willing to sacrifice? He did, a bit. But he suggested that if a superintelligent AI could determine, using coherent extrapolated volition, that a majority of us would want a certain lab in North Korea blown up, then it should go ahead and destroy the lab — perhaps without informing us at all. “Maybe the moral and ethical thing for a superintelligence to do is…to be the silent divine intervention so that none of us are faced with the choice of whether or not to listen to the whispers of this voice that knows better than us,” he said.
But not everyone wants an AI deciding for us how to manage our world. In fact, over 130,000 leading researchers and public figures recently signed a petition calling for a prohibition on the development of superintelligent AI. The American public is broadly against it, too. According to polling from the Future of Life Institute (FLI), 64 percent of Americans feel that it should not be developed until it is proven safe and controllable, or that it should never be developed at all. Previous polling has shown that a majority of voters want regulation to actively prevent superintelligent AI.
“Imagining an AI that figures everything out for us is like robbing us of the meaning of life.”
They worry about what could happen if the AI is misaligned (worst-case scenario: human extinction), but they also worry about what could happen even if the technical alignment problem is solved: militaries creating unprecedented surveillance and autonomous weapons; mass concentration of wealth and power in the hands of a few companies; mass unemployment; and the gradual replacement of human decision-making in all important areas.
As FLI’s executive director Anthony Aguirre put it to me, even if you’re not worried about AI presenting an existential risk, “there’s still an existentialist risk.” In other words, there’s still a risk to our identity as meaning-makers.
Chang, the philosopher who says it’s precisely through making hard choices that we become who we are, told me she’d never want to outsource the bulk of decision-making to AI, even if it is aligned. “All our skills and our sensitivity to values about what’s important will atrophy, because you’ve just got these machines doing it all,” she said. “We definitely don’t want that.”
Beyond the risk of atrophy, Edelman also sees a broader risk. “I feel like we’re all on Earth to kind of figure things out,” he said. “So imagining an AI that figures everything out for us is like robbing us of the meaning of life.”
It turned out this is an overriding concern for Yoshua Bengio, too. When I told him the Talmud story and asked him if he agreed with his namesake, he said, “Yeah, pretty much! Even if we had a god-like intelligence, it should not be the one deciding for us what we want.”
He added, “Human choices, human preferences, human values are not the result of just reason. It’s the result of our emotions, empathy, compassion. It is not an external truth. It is our truth. And so, even if there was a god-like intelligence, it could not decide for us what we want.”
I asked: What if we could build Yudkowsky’s “coherent extrapolated volition” into the AI?
Bengio shook his head. “I’m not willing to let go of that sovereignty,” he insisted. “It’s my human free will.”
His words reminded me of the English philosopher of religion John Hick, who developed the notion of “epistemic distance.” The idea is that God intentionally stays out of human affairs to a certain degree, because otherwise we humans wouldn’t be able to develop our own agency and moral character.
It’s an idea that sits well with the end of the Talmud story. Years after the big debate between Rabbi Eliezer and Rabbi Yoshua, we’re told, someone asked the Prophet Elijah how God reacted in that moment when Rabbi Yoshua refused to listen to the divine voice. Was God furious?
Just the opposite, the prophet explained: “The Holy One smiled and said: My children have triumphed over me; my children have triumphed over me.”
