Felix De Simone (PauseAI US’s Organizing Director) and I have determined that there are roughly two kinds of futures in science fiction: Singularity futures and Star Trek futures.
A Singularity future is something like The Matrix, where human minds live in a server rack in a constructed reality. This type of world is “post-scarcity” in the sense that there is material abundance for those who know how to work the machinery. In a Singularity future, conflicts are avoided either by a centralized controller of the world, like a superintelligent computer, or by multiple agents having the same utility function (“alignment”). The Matrix machines, represented by the Agents, all have the same goals. I’m not trying to bias things with this choice of example. It’s just that most Singularity futures in sci-fi are dystopian.
A Star Trek future is one like Star Trek (let’s say The Next Generation for our purposes), where technology and quality of life have steadily improved over centuries, and the achievements of the civilization are measured in diplomatic achievements as much as technological ones. In these worlds, progress generally comes from introspecting on the human (“humanoid”) condition when placed into new and alien conditions. The work of civilization is not obtaining more powerful technology, but determining how to govern the gains of technology fairly and in a way that respects the many different needs and values of the Galaxy. Processing the implications of new technology on Star Trek takes far longer than developing the technology in the first place. Old problems are solved by new technology, but new problems are created too, often things like climate change or externalities to vulnerable citizens. Conflicts are dealt with by trials or through diplomacy, but the judges cannot simply rely on the Federation’s rules. Picard frequently overrules the “Prime Directive” of non-interference with young civilizations for humanitarian reasons or because of special circumstances. Most utopian sci-fi falls more in the Star Trek future bin.
Felix and I are both Star Trek future people. PauseAI US has a positive vision of humanity’s future, and in our minds it looks a lot more like the San Francisco that Picard visits, home of Starfleet Command, than the future envisioned in San Francisco today.
There’s a lot about the Singularity that doesn’t appeal to us, and it took us time and thought to home in on what exactly that was. It wasn’t simply disliking the authoritarianism inherent to most sci-fi Singularity futures. And it wasn’t simply wanting a future that is more recognizably like our present, like in Star Trek. Our objection was deeper.
I felt quite satisfied when I put my finger on it:
Human minds are the source of truth about what makes us suffer or flourish, and the Singularity futures become dystopias because they have moved beyond the point of groundtruthing with human minds.
I now call groundtruthing with what humans actually think and feel ecological validity. I’m borrowing a biology term that refers to the fact that a biological model could make sense in the abstract (internal consistency) but needs to actually be what’s happening in the real system to be valid.
The Star Trek future values ecological validity, which is why we see a focus on governance. They accept and respect that they will not be able to solve all conflicts in advance, because there are different kinds of minds and they do not know everything about their own minds. They are trying to find ways for different, even unknown, utility functions to peacefully coexist and cooperate, and they do not think they can skip the process of mediation.
This is why the Star Trek future resembles our own world more closely. It’s a much more conservative approach. It’s like careful, slow archeological digging in our minds. Things can only go so fast when you’re working with a delicate, one-of-a-kind artifact, or you’ll destroy parts of its value.
The Singularity future, in contrast, is usually the result of what I call deep goodharting. That will require some explanation.
“Goodharting” is named after Goodhart’s law, the rule that “when a measure becomes a target, it ceases to be a good measure”. Measurement can never be exact: by definition, measurements are proxies for the actual thing being measured. Some things are very hard to measure well, like “teacher performance”, so it might be tempting to substitute “test performance”, to the point that you’re rewarding “teaching to the test” instead of genuine education. Since “education” is such a multifaceted thing and hard to pin down, we will always struggle to measure it fully with performance on tests. The letter of the law can never fully capture the spirit of the law, so there is no substitute for dynamically keeping in touch with our true intent.
Deep goodharting is what I call thinking you can reduce humanity, our lives, our society, our minds to what they seem to be (or maybe to what you think they should be). That it’s okay to pave over our flesh and blood minds with whatever Amanda Askell put in Claude. And Singularity futures in sci-fi tend to involve this: the Singularity becomes very materially powerful, but it is revealed down the line that it is missing something essential, or that solving problems according to its worldview isn’t compatible with certain freedoms.
In Star Trek: TNG, the Federation representatives tend to show great curiosity about their own minds and the minds of others. They approach new situations cautiously, with respect for the possibility that their actions may affect others in unintended ways, maybe even in ways they could never have anticipated. It’s this respect for what is both precious and irreplaceable that leads to the Star Trek future. And we know how the TNG characters feel about Singularity worlds, because they have one: a collective intelligence called the Borg.
The Borg think they are respecting life because they are cataloguing it and bringing order to it, the things the Borg value. The Borg do not care that unassimilated beings do not want to be assimilated, and are terrified of it, because once the Borg rewrite them, they will want to be Borg. Honestly, a lot of Singularity talk strikes me the same way. I used to hear my transhuman religionist friends say things to the effect that anyone who was smart enough would understand that, in the Singularity, we would just solve all the problems that people thought they had with the Singularity. And that, basically because of this, they didn’t have to take a lot of critique to heart or worry about the consent of everyone else on Earth.
Unfortunately, most versions of AI alignment are deep goodharting. The early idea of alignment as defining the human utility function and giving it to a programmed AI would involve goodharting because we can’t measure human utility functions very well and there are probably lots of different utility functions among humanity. More modern ideas like “scalable oversight” involve successive rounds of approximation where AIs get their alignment from an AI-that-was-trained-by-an-AI-that-was-trained-by-an-AI-that-was-trained-by-a-human. Once recursive self-improvement (where AI improves itself or its successor) is in play, humans won’t be able to be meaningfully in the loop serving as a source of truth, and whatever alignment the new and more powerful models have will be ecologically invalid. The entire training process will be divorced from human input after that point. Deepest goodharting.
I’d rather live long and prosper.





