The Pope Wrote a Memo to AI Developers. Most of You Missed It.
A builder's read of Magnifica Humanitas—what 'disarm AI' actually means, why 'alignment' alone isn't enough, and what to do about it in your config.
On May 25, the Vatican released eighty-two pages on artificial intelligence. The headline everyone ran with was "disarm AI." That's not the most interesting part.
The most interesting part is paragraph 111. It's a direct, two-paragraph appeal to people who build AI. I've read most of the major coverage now—Vatican News, NCR, America, USCCB, NPR—and almost none of it quoted that paragraph. The other thing nobody seems to have noticed: Christopher Olah, co-founder of Anthropic, was at the Vatican presentation. The lab that ships Claude sent a senior researcher to stand next to the Pope while he released this thing.
I'm a systems engineer, not a theologian. But I build agents for a living now, and I read all eighty-two pages. Here's what stuck.
The Pope knows how AI actually works
This is the part that surprised me.
Most religious commentary on technology reads like it was written by somebody who has never opened a terminal. Magnifica Humanitas doesn't. In paragraph 98, Leo writes:
current AI systems are more "cultivated" than "built," for developers do not directly design every detail, but instead create a framework within which the intelligence "grows." As a result, fundamental scientific aspects — such as the internal representations and computational processes of these systems — remain, at present, unknown.
That is a correct, careful description of how transformer training works. It's a Pope acknowledging mechanistic interpretability is an open problem. In an encyclical. Without using the word transformer.
He continues in paragraph 99: these systems "merely imitate certain functions of human intelligence." They do "a form of statistical adaptation based on data and feedback, which can be very effective, but does not imply inner growth." Again, that's accurate. He's not saying AI is fake or evil. He's saying it's not what its loudest cheerleaders claim it is, and he's saying it in language a research scientist would nod along to.
So when he gets to the harder asks, you can't dismiss him as a guy who doesn't get it.
The thing he gets right that hurts
The most uncomfortable paragraph in the encyclical, for builders, is 107. Read it slowly:
We cannot be satisfied with merely calling for the moralization of machines — the so-called "alignment" of AI with human values — without also having the courage to insist on a further condition: the possibility of openly discussing the ethical frameworks involved and subjecting them to shared standards of social justice. Otherwise, those who control AI will impose their own moral vision, which will become the invisible infrastructure of these systems. A more moral AI is not enough if that morality is determined by a few.
The Pope just published a critique of RLHF.
Not of AI. Of alignment as currently practiced. His point is straightforward: when a handful of labs decide what their models will and won't say, that decision becomes the invisible scaffolding the rest of us build on. The model's politics, its refusals, its assumptions about what's controversial—those came from a small group of humans in a small number of buildings, and the rest of us inherit them whether we like it or not.
You can agree or disagree with the conclusion. But name another mainstream institution that has put the critique on paper this cleanly. I can't.
For ASTGL readers, the practical version of this is something I think about constantly. I build on Claude. I didn't sit in the room where Claude was aligned. Most of the people reading this didn't either. We are downstream of someone else's moral framework, and pretending otherwise is bad engineering.
Paragraph 111—the part written for us
Here is the paragraph everyone skipped. I'm going to quote the whole thing so you can read it once without me in the way:
Three asks. Let's unpack them.
Transparency isn't just "open-source your code." In context, it means: be honest about what your system is and isn't. What it measures. What it discards. What it can't see. If your agent silently filters certain users out of consideration, that's a design choice. Don't hide it inside a "neutral" classifier.
Responsibility toward affected communities is a harder one. Most agents I see — including ones I've shipped — were built with the buyer in mind, not the people the buyer's agent will make decisions about. The applicant who got rejected by your loan-screening agent. The patient routed away from a specialist by your triage bot. They didn't sign your terms of service. The Pope is saying: they're still affected, and you still owe them something.
"Ensuring that what is being cultivated is a genuine good": note that word, cultivated. Leo uses it deliberately, picking it back up from paragraph 98. He's reminding developers that what we ship isn't fully built; it's grown. And the gardener is responsible for the garden, even when the plants do unexpected things.
What this actually looks like in config
I've been working on this in my own stack for months. I run an autonomous creative agent that produces content and ships it. Its constraints live in a file called SOUL.md, in a directory the agent doesn't own. Reading the encyclical against that file, the mapping is almost embarrassingly clean. Five mechanisms, five paragraphs:
1. The kill switch lives outside the agent. There's a file at /Users/jamescruce/shared/aca-rules/KILL_SWITCH. If it exists, the agent halts everything. The agent cannot create, modify, or delete that file: it lives in a user-owned directory, by design. The agent checks for it as the first action of every heartbeat cycle. Paragraph 105: "responsibility must be clearly defined at every stage… the possibility of identifying who must 'account' for decisions."
2. Non-negotiable constraints are constraints, not preferences. The agent's rules (never spend money without approval, never publish outside its domain, never modify its own constraints, never connect to non-allowlisted endpoints) live in a file the agent can't edit. This is different from "the model usually won't." Vendor RLHF gives you a model that has been trained not to do certain things. That's a preference. A file the agent can't write to is a constraint. Paragraph 103: entrusting decisions to a system "without anyone bearing responsibility for that judgment."
3. Gates between phases. The agent doesn't go from idea to research to build to deploy on its own authority. Each transition needs me. I'm slow and I'm a bottleneck — that's the point. Paragraph 106: "robust legal frameworks, independent oversight, informed users and a political system that does not abdicate its responsibility."
4. Pessimistic self-evaluation. After every meaningful action, the agent answers three questions in writing: did I do what was asked, is the output objectively good, and what specific change would improve it. The rule is: default low. If you can't articulate evidence of quality, assume the quality is lower than you think. Paragraph 98: even the people who build these systems have limited understanding of their actual functioning. Calibrated humility isn't optional.
5. Transparency by default. Everything the agent does is logged. I get a daily report. I never have to ask what it's doing. Paragraph 111, again: transparency as ethical baseline.
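As an illustration only, mechanisms 1 and 3 might be sketched in Python like this. The KILL_SWITCH path is the one named above; the phase names, the function names, and the `Halted` exception are hypothetical, and a real harness would look different. The point is the ordering: the kill switch check runs before anything else in the cycle, and a phase only advances when a human has said so.

```python
from pathlib import Path

# Path taken from the post; everything else in this sketch is an assumption.
KILL_SWITCH = Path("/Users/jamescruce/shared/aca-rules/KILL_SWITCH")

# Hypothetical phase order, mirroring "idea to research to build to deploy".
PHASES = ["idea", "research", "build", "deploy"]


class Halted(Exception):
    """Raised when the kill switch file is present."""


def check_kill_switch(path: Path = KILL_SWITCH) -> None:
    # Existence is the whole signal: there is nothing to parse or reinterpret.
    if path.exists():
        raise Halted(f"kill switch present at {path}")


def next_phase(current: str, approved_by_human: bool) -> str:
    """Advance one phase, but only when a human approved the transition."""
    idx = PHASES.index(current)
    if idx == len(PHASES) - 1:
        return current  # already at the final phase
    if not approved_by_human:
        return current  # no approval, no movement
    return PHASES[idx + 1]


def heartbeat(current: str, approved_by_human: bool,
              kill_switch: Path = KILL_SWITCH) -> str:
    # First action of every cycle, before any other work is attempted.
    check_kill_switch(kill_switch)
    return next_phase(current, approved_by_human)
```

Note what the sketch does not do: it never creates or removes the kill switch file. That stays a manual action by whoever owns the directory.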
None of this is novel. None of it is hard. It's just the work most builders haven't done because nobody has asked us to.
Three things to do this week
If you're shipping anything with an LLM in the loop, here's the punch list. None of it takes more than an afternoon.
1. Write down what your agent is never allowed to do. Put it in a file. Make sure the file is somewhere the agent can read but not write. If your agent is a Claude Code session or a custom harness, this means a constraints file in a parent directory, or environment variables the process can't change, or a system prompt loaded from a path the agent can't touch. The format doesn't matter. The "can't touch it" part matters.
2. Identify the kill switch. Concretely: what is the single action you can take to make the agent stop, and does the agent control any part of it? If the answer is "I'd revoke the API key" — good, that's outside the agent. If the answer is "there's a flag in the database the agent reads each cycle" — make sure the agent can't write to that flag.
3. Re-read paragraph 111. It's two paragraphs. It's about you. It will be referenced for the next forty years. You might as well know what it says.
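For items 1 and 2, one way to test the "can't touch it" property is to ask the operating system directly. The sketch below is a minimal check under assumptions: the function names are hypothetical, it only looks at filesystem permissions via `os.access`, and it has to run as the same OS user the agent runs as to mean anything. It treats a writable parent directory as tampering too, since that is enough to delete or replace a file you can't edit.

```python
import os
from pathlib import Path


def agent_can_tamper(path: Path) -> bool:
    """Return True if the current process could modify or remove `path`."""
    # Write access to the file itself lets the process change its contents.
    if path.exists() and os.access(path, os.W_OK):
        return True
    # Write access to the parent directory lets the process create, delete,
    # or replace the file even when the file itself is read-only.
    return os.access(path.parent, os.W_OK)


def audit(paths: list[Path]) -> list[Path]:
    """Return the subset of paths the current process could tamper with."""
    return [p for p in paths if agent_can_tamper(p)]
```

Pass it your constraints file and your kill switch path. An empty list is the result you want. A privileged user such as root will generally see everything reported as writable, which is one more reason not to run the agent that way.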
The encyclical's real ask isn't "use AI less." It's this: don't let AI be the one who decides, and don't let a handful of labs be the only ones who decide what AI gets to decide. The first is a problem you can solve in your codebase this week. The second is a longer fight.
Either way, it's a builder's problem now. We should pick it up.
This is part of how I think about AI ethics in my own work. If you're building agents and you want to compare notes on constraint design, the comments are open. The full encyclical is at https://www.vatican.va/content/leo-xiv/en/encyclicals/documents/20260515-magnifica-humanitas.html. Chapter 3 is the AI chapter, and it's worth your time.








