The Convenient Myth of Experimental Rigor in Science

May 05, 2024

I.

What’s the relationship between “Science” and “The Scientific Method”?

There is some disagreement on the definition of “science”. This is partly political. The label of “science” carries some prestige, and there are domains of study (e.g. astrology, homeopathy) that I believe the general consensus regards as “not science,” but that some of their practitioners want reclassified as science.

But there is also a value-neutral epistemological component. Yes, I regard “science” fairly highly, but I also regard “mathematics” highly, and I do not consider mathematics to be a form of science.

There is less disagreement on the definition of “The Scientific Method.” The method is typically broken down into a sequence of steps, and while pedagogical sources differ on how many steps there are and what to call them, there is near-universal agreement that at least one of these steps involves performing an experiment to try to falsify your hypothesis.

There is a sense in which experimentation is paramount in science: if the experiment falsifies the theory, then it does not matter how beautiful or elegant the theory is; the theory must be rejected.

So then, is it possible to “do science” without using the scientific method? I mean this to be a question about the definition of the term “science”. What I’m asking is: are there “activities” that are, by a wide consensus, labelled “science”, but which do not adhere to our common notion of “the scientific method”?

Image: AI-generated illustration (meta.ai). Prompt: A contemplative scientist sits at a cluttered desk, surrounded by scientific equipment like beakers, microscopes, and complexity theory models.

II.

In 1915, Albert Einstein formulated the theory of general relativity. One of the predicted consequences of the theory was that the path of light would be curved by gravitational fields. Specifically, the apparent positions of distant stars should shift when their light passes close to our sun. This prediction was only tested and confirmed four years later, in 1919, by Arthur Eddington, who observed the stars during a total solar eclipse.

According to a biography by Abraham Pais, when the news of Eddington’s experimental result arrived, a student asked Einstein what he would have said if Eddington’s result had contradicted Einstein’s theory. Einstein replied “Da könnt’ mir halt der Liebe Gott leid tun. Die Theorie stimmt doch.” (“I would feel sorry for the good Lord. The theory is right.”)

Was Einstein a scientist? Was he “doing science”? On the one hand, we could argue that by the 20th century, physics had become complex enough that people needed to specialize in different aspects of the scientific method: some people (like Einstein) came up with the theories and other people (like Eddington) tested them [1]. Under this argument, even though Einstein did not perform a “full cycle” of the scientific method, we would still say that he was performing one step within it and simply relied on others to perform the other steps.

I have two issues with this.

First, under that conception of what counts as science or not, we could argue that astrology and homeopathy etc. are also sciences. Their practitioners are simply doing the “hypothesizing” part (hypothesis: “maybe being born under the Virgo sign means you’re unusually generous”), and leaving the experimental confirmation part to future scientists.

Second, Einstein’s attitude implies that he did not think the experimental verification was necessary at all. Would Einstein really have continued believing in his theory despite experimental evidence to the contrary?

III.

It’s a little tricky to determine the “start date” of quantum physics, and specifically of the idea that it is fundamentally probabilistic (no hidden variables, etc.). Perhaps I would go with 1926, which is when Max Born formulated the “Born rule.” The Born rule states that the probability that a measurement yields a given result is equal to the squared magnitude of the corresponding amplitude of the wavefunction [2].
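Stated in symbols, the rule is short. This is only a sketch in standard textbook notation; the symbols are my own labels and do not come from Born's paper. Here ψ is the normalized state of the system and a_i is one possible measurement result with corresponding eigenstate |a_i⟩:

```latex
P(a_i) = \left| \langle a_i \mid \psi \rangle \right|^2 ,
\qquad
\sum_i P(a_i) = 1 .
```

The amplitude ⟨a_i|ψ⟩ is in general a complex number, which is why the rule takes its squared magnitude rather than a plain square.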

At any rate, suffice it to say that theories about this “fundamentally probabilistic” aspect first developed sometime during Einstein’s lifetime (born 1879), and continued on after he died (in 1955). Some experimental results were available to Einstein, and some were not (e.g. because they occurred after he died).

It’s difficult for me to determine precisely which experimental results Einstein was aware of. In 1949 (while Einstein was alive), physicist 吳健雄 (Chien-Shiung Wu) experimentally confirmed (failed to falsify?) photon entanglement, something Einstein believed to be impossible. In a 1947 letter to Max Born, Einstein wrote that “the theory cannot be reconciled with the idea that physics should represent a reality in time and space, free from spooky action at a distance.” Was Einstein aware of 吳健雄’s results? I could not find evidence in either direction.

However, it is clear that progressively stronger experimental evidence in favor of the fundamentally probabilistic nature of quantum physics became available to Einstein. And yet Einstein seems to have rejected the experimental evidence. Abraham Pais’s biography again provides relevant quotes of Einstein’s writings:

In 1949: "I am convinced that the ... statistical [quantum] theory ... is superficial and that one must be backed by the principle of general relativity." And in 1954: "I must seem like an ostrich who forever buries his head in the relativistic sand in order not to face the evil quanta."

And in a 1952 letter to Max Born, Einstein provides his thoughts on how the general population seemed to be very interested in Freundlich’s experimental results that deviated from what Einstein’s theory would predict:

Freundlich doesn’t affect me one little bit. […] It is really quite strange that humans are usually deaf towards the strongest arguments, while they are constantly inclined to overestimate the accuracy of measurement.

IV.

So what exactly am I griping about? Is my complaint that Einstein’s attitude towards experimentation is far too common?

Actually, I want to highlight a somewhat subtler problem. I suspect that most people (including most scientists) believe that science is more experimentally based than it actually is.

I’m not referring to the replication crisis. That’s a whole other can of worms and I don’t want to get into that in this post.

I’m referring to something closer to the Mpemba effect, or rather the historical folklore associated with it. Let’s say you have two cups of water, one moderately hot at 35°C (95°F) and one boiling hot at 100°C (212°F), and you place them both in your home freezer. Which one will freeze first? Most people will say that the colder one will freeze first. Intuitively, it seems like the colder one has less “temperature distance” to travel until it reaches freezing, so it “gets there first.”

The Mpemba effect claims that the opposite is true: The hotter cup of water will freeze first.

Now there’s actually some controversy over whether the Mpemba effect is real (that is, whether the experiment is reproducible). I don’t want to get into that. I want to discuss the folklore behind how this “effect” was discovered.

The effect is named after Erasto Mpemba. According to the legend, when he was 13 years old, he and his classmates would place their drinks in the school’s freezer. He noticed that the hot drinks would freeze faster than the cold drinks. When he asked his physics teacher why this happened, the teacher said the result was impossible and he must have made a mistake. The class laughed at his stupid question. Then Mpemba wrote to a university science professor asking the same question. The professor did not believe the results that Mpemba reported, but saw a pedagogical opportunity to teach about the scientific process, and so conducted the experiment. Much to his surprise, he was able to replicate the Mpemba effect. And so the professor and Mpemba published a paper together describing this effect, vindicating Mpemba in front of his classmates.

I don’t know how much of this legend is literally true and how much of it is embellishment to present certain people in a certain way for entertainment value or narrative cohesiveness. But I have anecdotally observed a tendency for people to hold the following pair of beliefs:

You’re a fool if you don’t experimentally test your beliefs.
My beliefs are part of the scientific consensus, so there’s no need for me to experimentally verify them.

This would be “fine” if every belief in the scientific consensus had been experimentally tested. If that isn’t true, though, the two beliefs are in tension.

V.

The Mpemba story may or may not be true, but here’s a story that you can independently verify is true: Geoffrey West is a theoretical physicist and former president and distinguished professor of the Santa Fe Institute. He’s been honoured as one of Time magazine's “Time 100.” He authored the book “Scale: The Universal Laws of Life, Growth, and Death in Organisms, Cities, and Companies,” which is an Amazon best seller in its category. I’m giving you all this background on him to give you Bayesian evidence that he’s not some rando, but rather might be an exemplary member of the physics community.

In chapter two of his book, he writes of Galileo:

He is perhaps best known for his mythical experiments dropping objects of different sizes and composition from the top of the Leaning Tower of Pisa to show that they all reached the ground at the same time. This nonintuitive observation contradicted the accepted Aristotelian dogma that heavy objects fall faster than lighter ones in direct proportion to their weight, a fundamental misconception that was universally believed for almost two thousand years before Galileo actually tested it. It is amazing in retrospect that until Galileo’s investigations no one seems to have thought of, let alone bothered, testing this apparently “self-evident fact.”

It turns out Galileo almost certainly never actually performed this experiment. As a matter of historical fact, we can’t directly determine whether or not Galileo performed the experiment without the use of a time machine, but we have a couple of strong pieces of evidence implying that he never performed it.

First of all, from a pragmatic and common sense perspective, he probably would not have been allowed to perform the experiment. The Leaning Tower of Pisa is the bell tower of Pisa Cathedral, a prestigious and sacred place, and the local authorities at the time would likely have prevented him from conducting the experiment.

Second, Galileo never actually claimed to have done any such experiment. Instead, Galileo provided a theoretical argument for why two objects with different masses would likely fall at the same speed: Imagine you have two objects, a heavier one and a lighter one, and suppose the heavier one fell faster than the lighter one. If you attached these two objects together, you would form a yet-even-heavier object. Galileo argues [3]:

If two bodies of different weights and different rates of fall are tied by a string, does the combined system fall faster because it is now more massive, or does the lighter body in its slower fall hold back the heavier body? The only convincing answer is neither: all the systems fall at the same rate.

Galileo’s arguments are indeed persuasive (at least to me). Perhaps he felt they were so persuasive that he saw no need to actually go out and do the experiment.
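The argument is a proof by contradiction, and it can be laid out in a few lines. This is only my sketch of the reasoning as quoted above: the notation v(m) for the fall speed of a body of mass m, and v_tied for the speed of the tied pair, are labels I am introducing, not Galileo's.

```latex
\begin{align*}
\text{Assume (Aristotle):}\quad
  & m_1 > m_2 \implies v(m_1) > v(m_2) \\
\text{Tied pair, slower body holds back the faster:}\quad
  & v_{\text{tied}} < v(m_1) \\
\text{Tied pair, viewed as one heavier body:}\quad
  & v_{\text{tied}} = v(m_1 + m_2) > v(m_1) \\
\text{Contradiction, hence:}\quad
  & v(m_1) = v(m_2)
\end{align*}
```

Strictly speaking, the contradiction only rules out "heavier falls faster"; the mirror-image argument rules out "lighter falls faster," leaving equal speeds as the only option. Note that the argument silently ignores air resistance.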

The final piece of evidence I have that Galileo likely never performed the experiment is that the experiment would likely have falsified the theory. In a 2014 paper [4], Bo Jacoby actually performs the experiment and observes the heavier object falling faster than the lighter object:

We find a difference in impact time of 0.109 s, for otherwise nearly identical objects of masses differing by about a factor 3: m-light = 57.52 g and m-heavy = 173.62 g, dropped from a height of 23.192 m, about half of that of the Leaning Tower of Pisa.
[…]
The difference between the time of the two balls hitting the ground can clearly be heard and even seen with the unaided eye when performing the experiment.

Apparently, at the given height and masses, air drag becomes a significant factor.
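To get a feel for why drag matters here, below is a rough back-of-the-envelope sketch using the standard quadratic-drag model for a sphere falling from rest. The masses and height come from the quoted passage. Everything else is my assumption and not from the paper: the ball diameter (I guessed tennis-ball size, the same for both balls), the drag coefficient, the air density, and the value of g.

```python
import math

# Values taken from the quoted passage.
HEIGHT = 23.192      # drop height in metres
M_LIGHT = 0.05752    # light ball mass in kg
M_HEAVY = 0.17362    # heavy ball mass in kg

# ASSUMED values, not from the paper.
G = 9.81             # gravitational acceleration, m/s^2
RHO_AIR = 1.2        # air density, kg/m^3
DRAG_COEFF = 0.5     # rough drag coefficient for a sphere
DIAMETER = 0.067     # ball diameter in metres (tennis-ball-sized guess)


def vacuum_fall_time(height):
    """Fall time from rest with no air: independent of mass."""
    return math.sqrt(2 * height / G)


def drag_fall_time(mass, height):
    """Fall time from rest with quadratic drag, m*dv/dt = m*g - k*v^2.

    The closed-form solution is y(t) = (v_t^2 / g) * ln(cosh(g*t / v_t)),
    where v_t = sqrt(m*g / k) is the terminal velocity. Solving for t:
    t = (v_t / g) * acosh(exp(g*y / v_t^2)).
    """
    area = math.pi * (DIAMETER / 2) ** 2
    k = 0.5 * RHO_AIR * DRAG_COEFF * area
    v_terminal = math.sqrt(mass * G / k)
    return (v_terminal / G) * math.acosh(math.exp(G * height / v_terminal ** 2))


t_light = drag_fall_time(M_LIGHT, HEIGHT)
t_heavy = drag_fall_time(M_HEAVY, HEIGHT)

print(f"vacuum (either ball): {vacuum_fall_time(HEIGHT):.3f} s")
print(f"light ball with drag: {t_light:.3f} s")
print(f"heavy ball with drag: {t_heavy:.3f} s")
print(f"gap (light - heavy):  {t_light - t_heavy:.3f} s")
```

With these assumed values, the heavy ball lands first and the gap comes out on the order of a tenth of a second, the same order of magnitude as the reported 0.109 s. A different guess for the diameter or drag coefficient would shift the number, so treat this as a plausibility check and not a reproduction of the paper's result.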

VI.

I don’t intend to specifically pick on Geoffrey West; I believe his attitude is widespread. He expresses mock surprise (“it is amazing [emphasis added] […] no one seems to have thought of, let alone bothered, testing this”) that those who held beliefs outside of the current scientific consensus didn’t think to or bother to test their hypotheses, while remaining silent on the instances of those promoting beliefs within the scientific consensus (Einstein, Galileo) not testing their hypotheses.

So what can we do about this?

I don’t think the answer is to demand that everybody must test their beliefs before accepting them.

I hold plenty of “scientific beliefs” without having tested them: I believe the earth is round, despite not having attempted to reproduce any of the experiments demonstrating that; I believe the earth goes around the sun; I believe that two objects with different masses would fall at the same speed in the same gravitational field in a vacuum.

It’s simply impractical for each individual person to replicate every experiment ever performed instead of relying on the assumption that “someone else has probably verified this already.” It also doesn’t make much sense from a cost-effectiveness point of view: Let’s say I’m wrong, and it turns out objects with different masses actually do fall at different speeds, even in a vacuum. Okay, so what? There’s probably no decision in my life that I would make differently upon learning that piece of information (the situation might be different if I were, say, an aircraft engineer). It simply doesn’t pay off for me to experimentally verify these particular beliefs.

So instead, I think many of us should simply be humbler about our epistemic status. I don’t have any reason to believe that the Aristotelians, who thought heavier objects fall faster, were any less rigorous in their scientific thinking than the Galileans were (aside from the fact that the Aristotelians apparently came to the wrong conclusion). The Aristotelians (just like the Galileans) probably had fleeting thoughts of how they might experimentally verify their hypotheses, but then discarded them as too inconvenient to actually carry out (how to get past those guards at the Leaning Tower of Pisa?). And besides, their hypothesis seemed so self-evident that there really was no need to test it anyway.


1. Einstein said: “It’s just not that easy today to experiment properly. Even I myself … have not performed experiments any more for decades now – and it can be added, experimenters hardly ever construct theories today; it is almost impossible to completely master the methods on both sides of the field, as Heinrich Hertz, for example, could still do.” Source: Klaus Hentschel, “Einstein's attitude towards experiments: Testing relativity theory 1907–1927,” 1992. https://www.academia.edu/20607093/Einsteins_attitude_towards_experiments_Testing_relativity_theory_1907_1927

2. In response to the Born rule, Einstein famously replied “Quantum mechanics… delivers much, but does not really bring us any closer to the secret of the Old One. I, at any rate, am convinced that He does not play dice.”

3. Galileo’s original writing was, of course, in Italian. I can find many secondary sources providing an English translation similar to what I provided, about attaching two masses together to form a heavier mass etc. However, I was unable to find a primary source of Galileo’s original writing in Italian. Sorry.

4. Bo Jacoby, “Laboratory test of the Galilean universality of the free fall experiment,” 2014. https://www.academia.edu/25184940/Laboratory_test_of_the_Galilean_universality_of_the_free_fall_experiment