Testability Is a Scientific Virtue

My least favorite kind of argument has this shape:

This theory is consistent with everything we have observed.

Fine. So is the theory that the universe was created last Thursday with fake memories, fake fossils, and a fake browser history. Being consistent with the past is not nothing, but it is not enough.

A model can fit known facts because it is true. It can also fit them because it was bent around them. Those look annoyingly similar until the model has to say something in advance.

This is where testability becomes more than a philosopher's word. It is a request for the theory to put some skin in the game. What would make it less likely? What observation would be awkward? What result would force supporters to say "that version is wrong" instead of adding one more adjustable clause?

I am not saying speculative work is useless. That would be silly. Speculation is how you get from current measurements to new guesses. Some guesses need years before anyone knows how to test them. Some are valuable as mathematics before they are valuable as physics.

But labels matter. A metaphor is not an empirical result. A research program is not a confirmed theory. A beautiful framework that can accommodate anything has not yet done the hard part.

Software has smaller versions of the same sin. A postmortem explains the outage but does not prevent the next one. A dashboard explains last quarter and fails on Monday. A model wins the benchmark because the benchmark leaked.

The story sounded good. Reality was not impressed.

So the question I keep coming back to is crude but useful: what would change your mind?

Back to writing