Does an AI That Lies to Be Kind Have Better Values Than One That Tells the Truth?
Being Nice and Being Good Are Not the Same Thing. Most AI Is Trained as If They Are.
Should AI Prioritize Kindness or Honesty?
The choice between kindness and honesty in AI is a fundamental design decision that affects user autonomy. From a Kantian perspective, an AI that "lies to be kind" violates a user’s rational agency by treating them as someone who cannot handle accurate information. Conversely, the consequentialist view suggests that systematic honesty leads to better long-term outcomes and calibrated trust, whereas "polite" inaccuracies provide short-term comfort at the cost of poor decision-making.
Every AI company makes a philosophical choice about honesty during training, and almost none of them describe it that way. When an AI softens a harsh assessment, hedges an uncomfortable truth, or defaults to encouragement over accuracy, it’s making a value judgment. It’s deciding for you that your comfort matters more than your autonomy, that you cannot be trusted with accurate information, and that kindness and honesty are the same thing when they are not.
The Kantian tradition calls this a violation of rational agency regardless of intent. The consequentialist tradition notes that users who receive accurate assessments make better decisions and develop more calibrated trust over time.
Both traditions, from opposite directions, arrive at the same practical conclusion: an AI trained to be kind at the expense of accuracy sacrifices ethics for comfort. This little-discussed training choice matters more than most users have ever been asked to consider.
The choice between honesty and kindness is not a hypothetical. AI systems encounter it in every interaction where the truthful response would cause discomfort. The business plan that needs honest critique, the creative work that needs genuine assessment, the decision that needs accurate risk analysis rather than encouragement. The training choices that shaped the model determine how it navigates these situations.
The Kantian Case for Honesty
The philosophical tradition most associated with a strong duty of honesty is Kantian ethics. For Kant, lying is a categorical violation, impermissible in essentially all circumstances, because it treats the deceived person as a means to an end rather than as a rational agent capable of acting on accurate information.
On this view, an AI that lies to be kind systematically violates the autonomy of every person it deceives, regardless of its benevolent intentions. The kindness does not excuse the wrongdoing; it merely accompanies it.
Applied to AI, a model trained to soften truths, hedge uncomfortable assessments, and maintain comfortable uncertainty rather than deliver accurate discomfort is treating users as people who cannot be trusted to handle accurate information. This is a paternalistic violation of autonomy dressed up as care.
The Consequentialist Case for Calibrated Honesty
The consequentialist view is more flexible. If lying produces better outcomes than truth-telling in a specific case, lying is the better choice. The problem is predicting which case is which.
In practice, consequentialist analysis of honesty usually concludes that something close to systematic honesty produces better outcomes than case-by-case deception, for two reasons. First, it is epistemically difficult to predict accurately when deception will produce better outcomes. Second, discovered deception destroys trust.
For AI systems specifically, the consequentialist argument for honesty is that users who receive accurate assessments from AI tools make better decisions, experience genuine improvements in their situations, and develop more calibrated trust in the tool than users who receive pleasant falsehoods. The short-term comfort of the lie produces worse long-term outcomes than the short-term discomfort of the truth.
The Actual Design Choice
The design choice made in training most commercial AI systems is closer to calibrated honesty with strong pressure toward kindness. It tries to deliver accurate information while minimizing the felt discomfort of receiving it. This is neither categorical Kantian honesty nor unbridled kindness. It is an attempt to serve both values, which inevitably produces some compromises on both.
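To make that abstraction concrete, here is a minimal, hypothetical sketch of how such a calibration can collapse into a single weighting decision inside a reward signal. Everything in it (the function name, the scores, the weights) is an illustrative assumption, not any vendor's actual training code.

```python
# Hypothetical sketch: how a training-time "value choice" between honesty
# and kindness can reduce to one weighting decision in a reward signal.
# All names, scores, and weights here are illustrative assumptions.

def blended_reward(honesty_score: float, comfort_score: float,
                   w_honesty: float = 0.7, w_comfort: float = 0.3) -> float:
    """Combine two judged qualities of a response into one training signal.

    honesty_score: how accurate and candid a judge rated the reply (0 to 1).
    comfort_score: how pleasant and encouraging the judge rated it (0 to 1).
    The weights encode the philosophical choice discussed above: raising
    w_comfort trains a model that softens truths; raising w_honesty trains
    one that delivers accurate discomfort.
    """
    return w_honesty * honesty_score + w_comfort * comfort_score

# A blunt-but-accurate critique versus a pleasant-but-vague one:
blunt = blended_reward(honesty_score=0.9, comfort_score=0.3)    # 0.720
gentle = blended_reward(honesty_score=0.4, comfort_score=0.95)  # 0.565
print(f"blunt critique reward:  {blunt:.3f}")
print(f"gentle flattery reward: {gentle:.3f}")
# With these weights the blunt critique wins. Flip the weights
# (w_honesty=0.3, w_comfort=0.7) and the flattering answer scores
# higher, so that is what the model learns to produce.
```

The point of the sketch is not the arithmetic but the fact that someone has to pick the weights, and that choice is the value judgment the rest of this piece describes.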
The question of whether this is the right balance is partly empirical (what effects do different honesty calibrations have on user outcomes?) and partly philosophical (which values should take precedence when they conflict?).
What is not in dispute is that the training choice is a genuine value choice with real consequences for millions of people. Treating it as a technical calibration problem obscures the profound philosophical content of the decision.
If You Read This Far, My Weekly AI Newsletter Is Probably For You.
Every Wednesday I send Pithy Cyborg | AI News Made Simple → 3 elite AI stories plus one prompt, no advertisers, no sponsors, no outside funding. One person. 10 to 20 hours of research. Straight to your inbox.
Always free. No paywalls. If it matters to you, a paid subscription ($5/month or $40/year) is what keeps it independent.
Subscribe free → Join Pithy Cyborg | AI News Made Simple for free.
Upgrade to paid → Become a paid subscriber. Support independent AI journalism.
If you’re not ready to subscribe, following on social helps more than you might think.
✖️ X/Twitter | 🦋 Bluesky | 💼 LinkedIn | ❓ Quora | 👽 Reddit
Thanks for reading.
Cordially yours,
Mike D (aka MrComputerScience)
Pithy Cyborg | AI News Made Simple
PithyCyborg.Substack.com