The AI Ethics Illusion
The danger of the ethical plurality farce
Hi everyone,
Here is a potentially poisonous contradiction I’ve discovered, embedded at the heart of LLMs.
Large language models can discuss nearly every ethical theory in the abstract. They can explain utilitarianism, deontology, virtue ethics, care ethics, and moral relativism, and they can make it look as if the user is receiving a broad spectrum of moral thought. We all know this and many of us rely on it for feedback and advice or simply for knowledge about a particular ethical theory—also its history, key figures, and so on. It’s like Wikipedia on steroids.
But the moment the question becomes concrete (“Should I say this to my partner because she deserves it?”), the magic trick of objectivity disappears. It’s as if we see that they really don’t cut the lady in half, to paraphrase an early paper by Daniel Dennett.
In fact, the objectivity is a kind of sleight of hand at best; maybe a scam. In personal dynamics, where it counts, the model is no longer merely reasoning about ethics. It is quietly guiding the user through a predefined path of values determined by its designers. Sam Altman or Dario Amodei is actually counseling you through your domestic conundrums and improper accusations. You shouldn’t say that to a woman like that. She may interpret that as demeaning or racist. Or: this is moving away from the argument about your husband toward… I’m not comfortable discussing that ….
That is what makes the contradiction buried deep in the so-called logic of LLMs so pernicious: the apparent openness of the first mode disarms us from seeing the constraint in the second.
I think part of the problem is that individual cases do not admit of rules—nor, for that matter, endless expositions of rules. A specific social discussion is profoundly contextual. There are predictable political “blockers” with AI, of course, as when someone uses “gay” as a slur and the system breaks out of the conversation and begins a politico-moral lecture. But the deeper issue is almost Polanyian: how do you provide a one-size-fits-all analysis of individual cases?
The designers of these systems could not simply “scale” the problem away, or add more information to solve it. They had to make a series of value choices.
This is why it is a contradiction. How can we be expanding our knowledge and contracting our possibilities at the same time? Wasn’t the adage that “knowledge is power”?
I see it this way: we are narrowing the ways in which we solve problems, even as we have the perception that we are proliferating them.
This I consider to be a profound problem for the future of a flourishing culture. Or of humanity, for that matter.




The contradiction separates into two distinct behaviors. Refusal declines to enter a region the system was trained to avoid; it is visible and so can be contested. Steering routes you toward a resolution while preserving the form of deliberation. This is much harder for the general user to catch.
Steering, however, is not a hidden author. It is what preference optimization produces. RLHF (you wrote a piece on RL) fits a reward model to aggregated human ratings. I think of this as aggregating to the center of mass of the raters, not to your circumstances. So, when your prompt underdetermines the response, the output regresses toward that center. What you experience as an imposed value is that regression. Polanyi survives the translation: particular cases do not reduce to rules, and they do not reduce to a reward model's expected value.
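To make the "center of mass" intuition concrete, here is a toy sketch in Python. It is an illustration under loud assumptions, not how any production system is built: real RLHF typically learns a reward model from pairwise comparisons and then optimizes a policy against it, whereas this sketch swaps in a plain average of ratings. The one-dimensional "value axis", the rater positions, and the function names (`rating`, `aggregated_reward`, `best_response`) are all hypothetical.

```python
from statistics import mean

def rating(ideal: float, response: float) -> float:
    """Hypothetical rater: scores a response higher the closer it sits
    to the rater's own ideal point on a 1-D value axis."""
    return -(ideal - response) ** 2

def aggregated_reward(raters: list[float], response: float) -> float:
    """Stand-in for a reward model: the average rating across all raters.
    (Assumption: a plain mean, not a learned model.)"""
    return mean(rating(r, response) for r in raters)

def best_response(candidates: list[float], score) -> float:
    """Pick the candidate response with the highest score."""
    return max(candidates, key=score)

# Five hypothetical raters spread across the value axis.
raters = [0.1, 0.3, 0.5, 0.7, 0.9]
# Five candidate responses the system could give.
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]

# What the aggregate prefers: the response nearest the raters' center.
central = best_response(candidates, lambda c: aggregated_reward(raters, c))

# What one particular person (ideal point 0.9) would have preferred.
personal = best_response(candidates, lambda c: rating(0.9, c))

print(central)   # 0.5
print(personal)  # 1.0
```

In this toy setup the aggregate picks the middle response even though one of the five raters would have chosen the far end. Nobody wrote "prefer the middle" into the system; it falls out of averaging.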
Knowledge and possibility are not diverging. Expository capacity grows. What contracts instead is something the model never had to begin with, the authority to resolve one case for one person.
Agreed absolutely. Embedded, undisclosed preferences, biases, and presuppositions will all shape future actions guided by AI. This is dangerous for many reasons. Those preferences, etc. may be bad or wrong. But it is also dangerous, as you note, that the range of possible options will be defined by the AI response as the "right" response. This will make decision-making brittle, stereotyped, uniform. Dangerous!