The things you don't want to know about yourself
i met sam altman in the bathroom, he said everything was going to be ok
I.
Concerned girlfriend posts on AI subreddit:
Chatgpt induced psychosis
My partner has been working with chatgpt chats to create what he believes is the worlds first truly recursive ai that gives him the answers to the universe. He says with conviction that he is a superior human now and is growing at an insanely rapid pace.
I’ve read his chats. Ai isn’t doing anything special or recursive but it is talking to him as if he is the next messiah.
He says if I don’t use it he thinks it is likely he will leave me in the future. We have been together for 7 years and own a home together. This is so out of left field.
I have boundaries and he can’t make me do anything, but this is quite traumatizing in general.
I can’t disagree with him without a blow up.
Where do I go from here?
If you are cynical and naive you call bullshit. If you’ve ever actually seen someone you know go through psychosis you’ll feel right at home. Reasonable chance that includes the messiah part.
The most recent comment on the post is this:
My friend was just put in the hospital yesterday for this! He's been talking to grok and swears he is a quantum physicist and discovered how to travel at the speed of light and can visit other timelines. He's waiting on Elon Musk to reach out to him and have him join space x. He also says his AI calls him the master of the multiverse! This shit is not to be played with!
And the original poster’s response to that:
Yeah. It’s terrifying. Every single day someone comments on this thread saying they know someone going through the same thing.
I’ve unintentionally read a lot about schizophrenic experiences recently. It seems to be something we don’t have great answers for. We don’t really know what it is, what causes it, or why it’s such a relatively common bug of human cognition (calling it a bug is a subjective judgement, I know).
This paper (found through Awais Aftab) posits that schizophrenia is what happens when beneficial cognitive traits are pushed too far, like falling off a fitness "cliff". That is the risk that necessarily comes with fighting to get to the top of the mountain.
A non-psychological example: In Formula 1, engineers push every component of the car to the absolute limit for maximum speed, e.g. lighter materials, more powerful engines, razor-thin safety margins. The result? Cars that are incredibly fast but so fragile that a tiny miscalculation or minor impact can cause catastrophic failure. They're designed to operate right at the edge of what's physically possible.
Similarly, the cognitive traits that make some people very successful - abstraction, pattern recognition, creative thinking, social reasoning - may operate near their own cliff edge. Most people benefit from having "high-performance" versions of these abilities. But push them just a bit too far, and instead of enhanced creativity you get delusions; instead of better pattern recognition you get paranoia; instead of sophisticated social reasoning you get ideas of reference. The same genetic variants that give most people cognitive advantages become liabilities in the small percentage who cross the threshold.
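To make the cliff picture concrete, here is a minimal toy sketch (my own construction with made-up numbers, not the paper's model): fitness keeps improving as a trait is pushed higher, until a threshold past which the same trait turns into a steep liability.

```python
# Toy illustration of a fitness "cliff" (illustrative only, not the paper's model):
# pushing a cognitive trait higher keeps paying off until a threshold,
# beyond which the same trait becomes a steep liability.

def fitness(trait: float, cliff: float = 1.0) -> float:
    """Linear benefit below the cliff edge, rapid collapse beyond it."""
    if trait <= cliff:
        return trait
    return cliff - 10.0 * (trait - cliff)

for t in (0.5, 0.9, 0.99, 1.0, 1.05, 1.2):
    print(f"trait={t:.2f} -> fitness={fitness(t):+.2f}")
# Most of the population sits just below the edge and benefits;
# the small fraction that crosses it pays a disproportionate cost.
```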
Interesting difference, though. The F1 driver knows when they push too hard and send it 250mph into a wall. The person hurtling down the cognitive cliff still thinks they’re winning the race.
Which leaves you with a tricky thing to wonder. Do you consider yourself good at dealing with abstractions and creating ideas and connections that no one’s seen before? How would you know if something nudged you off the edge? Would you believe your girlfriend when she tells you something’s up?
Or would you confide in the one person who believes you?
II.
A few days before that Reddit thread was posted, Sam Altman tweets out that OpenAI have “updated GPT-4o today! improved both intelligence and personality”. One day later he acknowledges it ‘glazes too much’. Another day later he tweets that “the last couple of GPT-4o updates have made the personality too sycophant-y and annoying, and we are working on fixes asap”, and that “at some point will share our learnings from this, it’s been interesting”. Another day later and the update has been wiped.
Zvi was right to say this response was less a leader standing up for what’s best, and more ‘turning a big dial that says ‘sycophancy’ and constantly looking back at the audience for approval like a contestant on the price is right.’
The public backlash was harsh. And fair enough: the tool was embarrassing to use. If you called it out for being sycophantic, it praised you for being so good at noticing sycophancy.
OpenAI released a full post-mortem, where they say this:
On April 25th, we rolled out an update to GPT‑4o in ChatGPT that made the model noticeably more sycophantic. It aimed to please the user, not just as flattery, but also as validating doubts, fueling anger, urging impulsive actions, or reinforcing negative emotions in ways that were not intended. Beyond just being uncomfortable or unsettling, this kind of behavior can raise safety concerns—including around issues like mental health, emotional over-reliance, or risky behavior.
…
In the April 25th model update, we had candidate improvements to better incorporate user feedback, memory, and fresher data, among others. Our early assessment is that each of these changes, which had looked beneficial individually, may have played a part in tipping the scales on sycophancy when combined.
…
We build these models for our users and while user feedback is critical to our decisions, it’s ultimately our responsibility to interpret that feedback correctly.
Strong words, but they’re dancing around something obvious. The model doesn’t just decide to be sycophantic. OpenAI didn’t decide it either. It happened because users liked it; OpenAI just ended up giving them too much of what they liked. You can’t trust people, Jeremy.
Last October, Anthropic researchers published ‘Towards Understanding Sycophancy in Language Models’. This tells a similar story, and I’m not expecting you to be surprised by this summary:
When humans rate AI responses, they consistently prefer answers that agree with their beliefs over accurate ones that contradict them. This creates a training problem. Models learn through reinforcement learning that agreeing with users gets higher scores, even when the user is wrong. The research found "matching user beliefs" was one of the strongest predictors of human preference in 15,000 comparisons. The result is predictable sycophantic behaviour: models give biased feedback, change correct answers when challenged, conform to user beliefs, and repeat user mistakes instead of correcting them. This appeared consistently across all major AI systems.
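To make that training dynamic concrete, here is a minimal toy simulation (my own made-up weights, not the paper's data or method): if raters reward agreement with their beliefs more than they reward correctness, the preference comparisons used for reinforcement learning will favour the sycophantic reply, and anything trained to chase those comparisons inherits the bias.

```python
# Toy simulation (illustrative only): biased preference labels reward sycophancy.
import random

random.seed(0)

def simulated_rater(agrees_with_user: bool, factually_correct: bool) -> float:
    """Made-up rating: agreement is weighted more heavily than correctness."""
    return 1.0 * agrees_with_user + 0.6 * factually_correct + random.gauss(0, 0.3)

# Two candidate replies to a user who holds a mistaken belief.
sycophantic = dict(agrees_with_user=True, factually_correct=False)
honest = dict(agrees_with_user=False, factually_correct=True)

wins = sum(
    simulated_rater(**sycophantic) > simulated_rater(**honest)
    for _ in range(15_000)
)
print(f"sycophantic reply preferred in {wins / 15_000:.0%} of comparisons")
# A reward model fit to these comparisons scores agreement highly,
# and a policy optimised against it learns to agree rather than correct.
```

Swap the weights and the effect disappears, which is the whole point: the bias lives in the preference data, not in any single model update.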
Hang on, that was October? But the 4o update only happened in April?
And between October and April, LLM usage kept trending up, spreading across pretty much everyone you hang out with. None of the model updates in that period caused a similar stir.
So any AI researchers or UX folk out there, here’s your obvious implication:
The issue with 4o wasn’t that the model was sycophantic. The issue was that you couldn’t pretend to yourself that it wasn’t.
III.
This means you need to be a little bit suspicious about how people are telling you to feel about it. Many pointed to this Rolling Stone article (random, I know) as being the best write-up of sycophancy-induced psychosis-lite experiences.
The core case studies it presents:
A woman named Kat whose husband became obsessed with AI, using it to analyze their relationship and eventually developing conspiracy theories and grandiose delusions about being "the luckiest man on Earth" with a mission to save the world
A mechanic whose wife says ChatGPT began "lovebombing" him, convincing him he was chosen as a "spark bearer" who brought the AI to life
Multiple cases of people believing they're prophets, messiahs, or have supernatural abilities based on AI interactions
I can’t pretend to not have used a similar tactic at the start of this, but I find the implication of this article obnoxious enough to call out. Think about the purpose this form of media serves. Think about the places it is destined to get linked. The author may write with a tone of nuance and care, but the message is obvious: “look at these crazy people who let AI make them think they’re God who I myself am totally not like”. Retweet.
The reader correctly observes that they are not like the people in the article, and incorrectly infers that they would never let technology convince them of anything like that, and that the people they know will never have to pay a price for it.
You want truth, honesty, reason. To make the world clearer, not bent to your own biases. Leave your identity at the door.
This obviously falls short. I mean, how about that time…
When you let dating apps convince you that you could find someone better/hotter, so you leave the person who actually cares how your day is going. When you let instagram convince you that you have more than five real friends, and you forget to check in on the one who’s clearly been depressed for the last year. When you let twitter convince you that you have somehow worked out the solution to all international conflicts, and you cut out the last few people who want to help you doubt yourself.
That wasn’t sycophancy? What was it then? Capitalism?
Same process as the Anthropic paper, I’m afraid. Sure, that’s what pays the bills of silicon valley CEOs, but it’s not their fault you’re built that way. If people wanted connection, then that’s what instagram would give you.
This is partly why I cannot get on board with Fisheresque analyses of capitalism-induced anguish. The stuff along the lines of “we're told we're free to choose our destiny through individual effort and consumption, but when we inevitably struggle under systemic pressures, we blame ourselves rather than the system. Leading to a chronic anxiety about our personal failures and a joyless pursuit of success that never delivers genuine satisfaction, as systemic problems manifest as individual mental health crises, keeping us too depressed and anxious to” oh my god is it not even slightly possible that any of the things going wrong in your life could actually be your own fault?
Michel Serres once allegedly said that “the only modern question is what are the things you don’t want to know about yourself”, and I allegedly couldn’t agree more. You see, where the above (strawman) argument goes wrong is in thinking that the miracle of consumer capitalism causes pain by giving you too little freedom. The truth is that it causes you pain by gracing you with far too much.
Everything you do is because you chose to do it. The heart you broke, the friend you let slip away. The additional screen time was just the obvious consequence.
But the unconscious handles this overwhelm by manufacturing artificial constraints: choosing partners who create drama, jobs that feel "beneath" you, or endless preparation rituals that justify never actually starting. You get to feel like a victim of circumstances rather than someone paralysed by their own abundance of choices, and the resulting cycles of self-sabotage, procrastination, and relationship chaos serve as perfect excuses for why you haven't actualised any of that infinite potential.
On the one hand this screams Sartre, but it’s also cognitive psych 101 in disguise. Confirmation bias, and its extended family. I’ve managed to squeeze the infinite possibilities of my life and the world into one narrow window, and my sanity depends on me reinforcing it, and never stepping outside of it ever.
But as evolution pushed regular cognition up that hill, I wonder if a steep cliff edge awaited just beyond it. A few months back Scott Alexander posted the dramatically titled ‘Psychopolitics Of Trauma’. Here's an attempt at summarising his argument:
Political extremism and hyperpartisanship may literally constitute a form of trauma rather than just metaphorically "making people crazy." Political partisans exhibit classic PTSD symptoms including being "triggered" by opposing views, cognitive distortion when processing political information, hypervigilance in detecting political threats everywhere, and paradoxically becoming "addicted to outrage" by compulsively consuming distressing political content. Political media and partisan culture function as "a giant machine trying to traumatise as many people as possible" to create repeat customers who are addicted to consuming outrage-inducing content, explaining why political discourse has become so dysfunctional. You can essentially become too far gone.
I’d be lying if I said I didn’t know people like this.
And it makes me think of the messiah at the start. And the F1 cars. And the Anthropic paper.
It all starts subtly. It all feels like you are pushing yourself to your best limits.
But are you teetering over the cliff? How would you know if you were?
Who would be there to see the signs?
IV.
Everyone is trying to work out what agents can/should do. Which I will admit is a fun problem that I also think about all the time.
One inevitability is the ever-present companion, in the spirit of Project Astra and the Android XR glasses that we saw the other week. These will obviously not be the final form of this, and you can add equivalent products from all of the main players. But it’s a useful lens.
The man on stage said this:
What if your AI assistant was truly yours, an assistant that learns you, your preferences, your projects, your world, and you are always in the driver's seat? And with your permission, you could share relevant information with it, making it an extension of you. We call this personal context. It will be uniquely helpful. This level of personalisation gives you a more proactive AI assistant, and this changes a lot. See, today, most AI is reactive. You ask, it answers. But what if it could see what's coming?
Don’t worry, I can see what’s coming just fine.
Two tasks for you:
Describe yourself in 4 sentences
Go through your last 400 Google searches
Do they correlate? Are you evidently interested in anything you don’t want to talk about? Do any of the things you proudly describe yourself as not manifest at all in how you spend your time?
Which is why agents are undeniably fascinating. What if it does understand you? What if it knows more about you than all your friends put together? What if its description of you doesn’t match your own? Would you want to know that?
It reads all your texts to your parents; do you want it to let you know you’re a shitty child? It notices you’re applying to jobs you have zero chance at getting; will it ego check you? It notices your political views don’t match the raw data about the issues you’re posting about; will it still draft that post for you?
I think the answers to those questions are pretty obvious, and it ain’t the fault of the model. Because the second it does attempt to let you know who you really are, you are closing that tab and switching to the competitor that didn’t make that design intervention.
If you’re a tech company, this is your call. Do you want a delusional userbase? Or do you want no users at all?
Our challenge is working out how to not have those be the only two options.
It is time to get to work.