Another Paradox for AI
The decades-long quest for AI keeps surprising us... and proving the critics right.
Hi everyone,
I have been ruminating on transformers, self-attention, and deep neural networks for the last year and a half, and I’m clearly not alone. AI has gone mainstream, and it has insinuated itself everywhere in a strangely boring way rather than vindicating its supporters’ more aggressive claims about a coming machine intelligence.
We’re in an era of confusion about AI—not nearing its culmination. I still hear from readers worried over concerns like “alignment”—the fear that AI will, in some hard-to-define sense, drift autonomously from our programming and best-laid plans and nefariously turn on us. It’s not an irrational concern. But most of the harm envisioned from alignment mishaps is of an existential variety: AI decides it’s time for humans to go, and pushes the nuclear button or releases some gain-of-function virus, in the name of a higher good it somehow divined that saw no future role for humans. Real misalignment is less sexy: generative (statistical) systems making up facts and dates and refund policies, and confidently botching logic and math problems. We have misalignment today, but it’s annoying and embarrassing, not terrifying. The machines don’t wake up, rub their hands together, and decide to take over the world. They just keep spitting out whatever the highest-probability answer happens to be. They’re misaligned because they make mistakes, mindlessly. That, we now know, is the sense of “misaligned” closer to the reality of AI.
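To make that point concrete, here’s a minimal sketch in Python (mine, not pulled from any real system) of greedy decoding: the “model” simply returns whichever next token has the highest probability, true or not. The toy model, its contexts, and its probabilities are all invented for illustration.

```python
# A toy stand-in for a language model: context in, next-token probabilities out.
# The contexts and probabilities below are invented purely for illustration.
def toy_lm(context: str) -> dict[str, float]:
    if context.endswith("refund policy is"):
        # "30-day" continuations dominated the (imagined) training data, so that
        # answer is most probable, regardless of what the policy actually says.
        return {"a 30-day refund": 0.55, "a 14-day refund": 0.35, "no refund": 0.10}
    return {"<eos>": 1.0}

def greedy_next(context: str) -> str:
    """Greedy decoding: emit the single most probable next token."""
    probs = toy_lm(context)
    # No deliberation, no intent: just the argmax of a probability table.
    return max(probs, key=probs.get)

print(greedy_next("Our refund policy is"))  # -> "a 30-day refund", true or not
```

There is no malice anywhere in that loop, and no understanding either; it answers confidently because the argmax always returns something.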
Other worries have a different valence today as well. In the salad days of Gibson or Stephenson or Philip K. Dick or Vernor Vinge, jobless futures were dystopian visions of super-smart machines dominating and transforming human culture. Dark visions. Robot overlords. Humans left behind, playing renegade roles against increasingly insurmountable odds. These types of worries about AI are part and parcel of the field, and I would hate to see this genre and these discussions disappear. But let’s be honest: the arrival of powerful AI isn’t a gee-whiz sci-fi discussion or novel. Microsoft built “Copilot” on advanced AI, and now we can use Bing with a conversational front-end. Alert the media (they did). Customer service chat is slightly more realistic, but equally if not more annoying. It now takes longer to reach a person, and you can have more lifelike discussions with a machine while still getting the runaround. Economists like Tyler Cowen believe AI will supercharge our productivity—revolutionary stuff—but there’s no evidence of it yet. We seem to be filling our world with a bunch of automated systems that streamline our lives in one way and screw them up in another. That’s pretty much what we know for sure.
The future has arrived. And the end result of all this discussion about AI is simply to say, “Oh, I guess machine intelligence isn’t real intelligence. What now?”
The Control (Alignment) Problem, Demystified
We already have an alignment problem with AI. Every time a foreign bad actor uses a language model to mindlessly generate dis- and misinformation, “AI” is misaligned with our interests (though not with the foreign actor’s). It’s not—and never was—a scientific question about the mystical powers of “AI.” It’s a question about how mechanized systems can be put to use by humans, who either wish for nefarious outcomes or don’t see them coming. When customer service LLMs invent refund policies, the human stakeholders fine-tune those models, or change the context and rules within which such systems function. We need to take ownership of our technological systems, “AI” or not. Want to avoid a transformer-based deep neural network starting World War III? Circumscribe and oversee its use, as we do with all technologies. Sci-fi novelists can still write sci-fi, and techno-futurists can still give TED Talks about getting outsmarted by alien intelligence. But the systems we have are the systems that matter.
What LLMs and other foundation models—I call this Modern AI—teach us is that we were, in a sense, indulging sci-fi ideas about future AI because “real” AI wasn’t a reality, and we had lots of wiggle room to conjure up existential risks and parlor-trick results. We could envision AI with motivations and desires and even consciousness, and from there we could dream up all sorts of hellish outcomes and nightmares. As Sam Harris once put it, birds don’t understand what we’re up to when we build high-rise buildings, and so too might we stupidly fail to understand an advanced intelligent artifact like AI, or AGI. But “AGI” is always somewhere off in the future, even though any AI researcher in the 1970s—or 1990s, for that matter—would call Modern AI “AGI” (or close enough). AI today does get misaligned. Ask the lawyers for Air Canada, which had to pay a customer a refund under a fictitious refund policy invented out of whole cloth by its LLM customer service agent. Customer service LLMs are constantly misaligned with human intent: they spit out, in addition to refund policies that don’t exist, fabricated facts and figures, and can even be prodded to concoct recipes for Molotov cocktails and bigger threats like dirty bombs, and to plan out cyber attacks. Where are all the doomsday prophets?
In other words, the world is full of misalignment, and that should have summoned the usual suspects: Nick Bostrom (of Superintelligence fame) and other techno-futurists like Eliezer Yudkowsky, the guru of “friendly AI” at the Machine Intelligence Research Institute (MIRI) in Berkeley. What I’m noticing instead is a decided ho-hum. I can tell you as an “AI” scientist (a catch-all phrase for the media until very recently) working in the field for two decades that pretty much every task we worked on in natural language processing saw huge accuracy improvements with the transformer architecture applied to deep neural networks. The results were dramatic, when viewing AI as a subfield of computer science and reviewing the problems we’ve all been trying to solve for decades.
In the realm of the imagination, though, where “AI” really lives, it’s become—oddly—business as usual. AI is now mainstream and boring. So, I’m now declaring what I’ll eponymously call “Larson’s Paradox”: the very success of AI tends to dispel the sci-fi visions about it.
Sidebar: In the Annals of AI
Moravec’s Paradox
Moravec’s Paradox is the observation that tasks that are easy for humans, such as facial recognition and other perceptual skills, are difficult for machines, while tasks that are hard for humans, such as calculation, are comparatively easy for machines.
The Frame Problem
The Frame Problem arises from the difficulty an AI system has in determining what is relevant in a given situation. When an AI receives new information, it struggles to update its beliefs without considering an impractically large number of irrelevant possibilities.
The Qualification Problem
The Qualification Problem refers to the difficulty in specifying all the preconditions necessary for an AI to successfully perform an action. It is challenging to account for all possible exceptions and contingencies in a rule-based system.
The Ramification Problem
The Ramification Problem occurs when an action taken by an AI has unintended consequences that must be accounted for. These side effects are not directly specified in the initial conditions or actions but follow naturally from the world model. (Read: why we don’t have Level 5 self-driving cars, and won’t, for a very long time.)
Larson’s Paradox and the AI Effect
True, some version of Larson’s Paradox has been haunting discussions about AI for decades. The late John McCarthy, an original member of the famed 1956 Dartmouth Conference that inaugurated the field of AI, once described what’s now known as the “AI Effect”: as soon as AI achieves something, it is no longer considered AI, because it becomes routine technology. Luminaries like Ray Kurzweil have been lamenting the AI Effect for decades. Let them lament. Larson’s Paradox is the other side of the coin: success unravels the hype, and reveals “mechanical intelligence” to be, in effect, an oxymoron, an observation that starts deflating techno-futurist prognostications and drum-beating about existential risk. Now that we can actually converse with an AI roughly in a manner that would have dazzled pioneers like Alan Turing, we can see that the mechanical brains behind all the success are fugazi. AI systems are still mindless, quite clearly, and it’s a mindlessness we’ve engineered with massive amounts of data and compute power. Human brains learn similar tasks with ten thousand times less data. Clearly, “mechanical intelligence” is eating the world—and it’s not even, truly, intelligent.
Alas, we really do have an alignment problem (@samharris and co. can keep worrying), but we haven’t created a devilish creature that misreads our tea leaves and plots our demise. We’ve simply created systems smart enough to mindlessly generate hogwash; with a “jailbreak prompt,” we might get one to jabber on (possibly effectively) about how to cyber attack the Pentagon. That’s an alignment problem! But the paradox I’m referring to is simply that progress in AI proves the critics right, not the futurists: critics insisted all along that machinery can’t really be intelligent, and they were right. It seems “intelligence” as we understand it requires some sort of organic mind that doesn’t reduce to arithmetical calculations or logic gates. That’s the implication of Larson’s Paradox: if we mechanize intelligence, then when we succeed, we’ll see it’s not really intelligence after all.
We can, endlessly, throw computing power at problems and simulate intelligent solutions to them. But we can’t get a mind from a mechanism. This was true back when we knew virtually nothing about AI, and it’s true today. It’s even more evident today, because we now have tangible examples of mindless intelligence, and it’s starting to dawn on us that human culture is what really counts, and we might want to invest in it—before it’s too late.
So. I declare this phenomenon Larson’s Paradox. The question we should all be asking is: what do we do now, given what we now know?
Thanks to everyone for making Colligo a success.
Erik J. Larson