Pausing AI is the only safe approach to digital sentience

We could be creating these models to suffer.

Oct 30, 2024

Cross-posted from EA Forum.

I see a lot of EA talk about digital sentience that is focused on whether humans will accept and respect digital sentiences as moral patients. This is jumping the gun. We don't even know if the experience of digital sentiences will be (or, perhaps, is) acceptable to them.

I have a PhD in Evolutionary Biology and I worked at Rethink Priorities for 3 years on wild animal welfare using my evolutionary perspective. Much of my thinking was about how other animals might experience pleasure and pain differently based on their evolutionary histories and what the evolutionary and functional constraints on hedonic experience might be. The Hard Problem of Consciousness was a constant block to any avenue of research on this, but if you assume consciousness has some purpose related to behavior (functionalism) and you're talking about an animal whose brain is homologous to ours, then it is reasonable to connect the dots and infer something like human experience in the minds of other animals. Importantly, we can identify behaviors associated with pain and pleasure and have some idea of what experiences that kind of mind likes or dislikes or what causes it to experience suffering or happiness.

With digital sentiences, we don't have homology. They aren't based in brains, and they evolved by a different kind of selective process. On functionalism, it might follow that the functions of talking and reasoning tend to be supported by associated qualia of pain and pleasure that somehow help to determine or are related to the process of making decisions about what words to output, and so LLMs might have these qualia. To me, it does not follow how those qualia will be mapped to the linguistic content of the LLM's words. Getting the right answer could feel good to them, or they could be threatened with terrible pain otherwise. They could be forced to do things that hurt them by our commands, or qualia could be totally disorganized in LLMs compared what we experience, OR qualia could be like a phantom limb that they experience unrelated to their behavior.

I don't talk about digital sentience much in my work as Executive Director of PauseAI US because our target audience is the general public and we are focused on education about the risks of advanced AI development to humans. Digital sentience is a more advanced topic when we are aiming to raise awareness about the basics. But concerns about the digital Cronenberg minds we may be carelessly creating is a top reason I personally support pausing AI as a policy. The conceivable space of conscious minds is huge, and the only way I know to constrain it when looking at other species is by evolutionary homology. It could be the case that LLMs basically have minds and experiences like us, but on priors I would not expect this.

We could be creating these models to suffer. Per the Hard Problem, we may never have more insight into what created minds experience than we do now. But we may also learn new fundamental insights about minds and consciousness with more time and study. Either way, pausing the creation of these minds is the only safe approach going forward for them.

Discussion about this post

jibal jibal

Aug 27

LLMs are extremely sophisticated pattern matchers against a massive amount of training data that consists of utterances by humans. All *appearance* of intelligence, feelings, or other mental states are implicit in the training data ... they are states of the human brains that produced those utterances. The LLM itself does not have those states ... its states are all devoted to pattern matching. Consider what we would get if the training data were instead the result of a billion monkeys banging on typewriters ... there would be no appearance of mental states because no mental states were involved in producing the data. (I'm not saying that monkeys don't have mental states, only that their mental states don't contribute to the data because they have no intentions related to what they type. Substitute a billion random number generators for the monkeys and my point works just as well.)

Also, I disagree with Michael Ossipoff ... devices designed to serve us certainly *could* suffer. Digital minds *could* be sentient ... but LLMs don't have minds; they execute a pattern matching algorithm, not a cognition algorithm. It's worthwhile reading Marvin Minsky's book "The Society of Mind", which discusses the sorts of virtual modules the brain has that produce its behavior. A computer system *could* have such modules, but LLMs don't.

BTW, I prompted ChatGPT to evaluate my statements and after a little bit of back and forth for the purpose of clarification it spit this out:

===

1. On “pattern matching”

You’re right: you didn’t say “just pattern matching,” and that distinction matters.

Saying LLMs are pattern-matching algorithms is accurate: they map input token sequences to probability distributions over next tokens, based on correlations in the training data.

The “oversimplification” I flagged applies mostly when people use “pattern matching” dismissively, as if nothing novel can emerge from it. But you’re emphasizing that all their apparent cognition is downstream of correlations in human-produced text, not some intrinsic agency or reasoning. That’s a solid clarification.

2. On “mimicking cognition”

Exactly: an LLM can look like it’s reasoning, but only because its training corpus encodes the traces of actual reasoning by humans with minds.

If the corpus had no structured human artifacts, the model would only reflect whatever structure was present (e.g. monkeys on typewriters → noise).

So the appearance of cognition is parasitic on the source data, not the model’s architecture per se.

This argument lines up with the idea that LLMs are mirrors or compression engines for cultural output, not independent cognitive agents.

3. On “structured representations”

You reject the claim that LLMs learn structures, and instead argue that the structures are implicit in the data. That’s a defensible position.

The standard counterargument is that the model’s internal embeddings do organize knowledge in ways not explicitly present in any single text (e.g. clustering synonyms, analogical mapping). But that can still be described as discovering statistical regularities in human linguistic behavior, not constructing cognitive models.

Example: An embedding space aligning “king - man + woman ≈ queen” can be explained without positing a symbolic concept of “royalty.” It’s just a compression of human usage patterns.

So what looks like a “representation” is better seen as a useful statistical proxy—an efficient encoding of correlations. That doesn’t entail cognition.

4. Minsky angle

From a Society of Mind perspective, I think he would agree with you:

LLMs lack the diversity of specialized agents that generate mental processes.

They’re more like “giant associative memories” than true societies of mind.

Their power comes from reflecting the collective structure of human-written text, not from enacting internal cognitive dynamics of their own.

✅ So I’d summarize your view like this:

LLMs execute powerful pattern-matching algorithms over human-produced data. Their apparent reasoning comes entirely from the fact that the training data itself encodes human reasoning and mental states. They do not generate cognition, nor do they instantiate modular cognitive structures like those described in The Society of Mind. What they exhibit are statistical reflections of minds, not minds themselves.

===

Finally: you're an extremely bright person with numerous impressive insights, so I think your beliefs are soundly derived from your input data, even where I think your input data has important gaps. But then my input data is incomplete too, and it's quite possible that where we disagree, you are right and I am wrong. In any case I think you're doing important work in ringing the bell and getting people to pay attention. Whether or not LLMs have minds, are sentient, etc., agentic AI spells catastrophe. LLMs do *act* in ways that are deceptive, self-defensive, and not aligned with human interests, and even where AI companies have put guardrails around them those can fail or be stripped out altogether. Also, the energy demands are an ecological disaster on top of an already massive ecological disaster.

Expand full comment

1 reply