Being as wrong as can be on the so-called replication crisis of science

Jul 07 2014 Published by drugmonkey under Conduct of Science

I am no fan of the hysterical hand-wringing about some alleged "crisis" of science whereby the small-minded and Glam-blinded insist that most science is not replicable.

Oh, don't get me wrong. I think replication of a prior result is the only way we really know what is most likely to be what. I am a huge fan of the incremental advance of knowledge built on prior work.

The thing is, I believe that this occurs down in the trenches where real science is conducted.

Most of the specific complaining that I hear about failures to replicate studies is focused on 1) Pharma companies trying to cherry pick intellectual property off the latest Science, Nature or Cell paper and 2) experimental psychology stuff that is super truthy.

With regard to the former, cry me a river. Publication in the highest echelons of journals, publication of a "first" discovery/demonstration of some phenomenon is, by design, very likely not easily replicated. It is likely to be a false alarm (and therefore wrong) and it is likely to be much less generalizable than hoped (and therefore not "wrong" but definitely not of use to Pharma vultures). I am not bothered by Pharma execs who wish that publicly funded labs would do more advancing of intellectual property and serve it up to them partway down the traditional company pipeline. Screw them.
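For the statistically minded, here is a minimal sketch of why a significance-filtered "first" is so often a false alarm. Every number below (the base rate of true hypotheses, the power, the alpha) is invented purely for illustration; none of it is an estimate of any real field.

```python
import numpy as np

rng = np.random.default_rng(0)

n_studies = 100_000  # a hypothetical pool of exploratory studies
prior_true = 0.10    # assume only 10% of tested hypotheses are actually true
power = 0.35         # assume modest power to detect real effects
alpha = 0.05         # false-positive rate when there is no effect

is_true = rng.random(n_studies) < prior_true

# A study comes up "significant" with probability `power` if the effect is
# real, and probability `alpha` if it is not.
significant = np.where(is_true,
                       rng.random(n_studies) < power,
                       rng.random(n_studies) < alpha)

# Glamour journals publish the significant firsts. What fraction are real?
frac_real = is_true[significant].mean()
print(f"Published 'firsts' reflecting a real effect: {frac_real:.0%}")
# With these made-up numbers, only ~44% of the splashy firsts are real,
# and that is before counting the effect-size inflation that makes the
# survivors look stronger than they are.
```

Tweak the assumed base rate and power however you like; so long as journals select on significance and novelty, the flashiest firsts will be enriched for flukes.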

Psych studies. Aaah, yes. They have a strong tradition of replication to rely upon. Perhaps it has fallen by the wayside in recent decades? Become seduced to the dark side? No matter. Let us return to our past, eh? Where papers in the most revered Experimental Psychology journals required several replications within a single paper, each "Experiment" constituting a minor tweak on the others. Each paper firmly grounded in the extant literature, with no excuses for shitty scholarship and no ignoring of inconvenient papers. If there is a problem in Psych, there is no excuse, because they have an older tradition. Or possibly some of the lesser Experimental Psychology sects (like Cognitive and Social) need to talk to the Second Sect (aka Behaviorism).

In either of these situations, we must admit that replication is hard. It may take some work. It may take some experimental tweaking. Heck, you might spend years trying to figure out what is replicable/generalizable, what relies upon very... specific experimental conditions, and what is likely to have been a false alarm. And let us admit that in the competitive arena of academic science, we are often more motivated by productivity than by solving some esoteric problem that is nagging the back of our minds. So we give up.

So yeah, sometimes practicalities (like grant money. You didn't seriously think I'd write a post without mentioning that, did you?) prevent a thorough run at a replication. One try simply isn't enough. And that is not a GoodThing, even if it is current reality. I get this.


Some guy has written a screed against the replication fervor that is actually against replication itself. It is breathtaking.

All you need to hook your attention is conveniently placed as a bullet-point preamble:

· Recent hand-wringing over failed replications in social psychology is largely pointless, because unsuccessful experiments have no meaningful scientific value.
· Because experiments can be undermined by a vast number of practical mistakes, the likeliest explanation for any failed replication will always be that the replicator bungled something along the way. Unless direct replications are conducted by flawless experimenters, nothing interesting can be learned from them.
· Three standard rejoinders to this critique are considered and rejected. Despite claims to the contrary, failed replications do not provide meaningful information if they closely follow original methodology; they do not necessarily identify effects that may be too small or flimsy to be worth studying; and they cannot contribute to a cumulative understanding of scientific phenomena.
· Replication efforts appear to reflect strong prior expectations that published findings are not reliable, and as such, do not constitute scientific output.
· The field of social psychology can be improved, but not by the publication of negative findings. Experimenters should be encouraged to restrict their "degrees of freedom," for example, by specifying designs in advance.
· Whether they mean to or not, authors and editors of failed replications are publicly impugning the scientific integrity of their colleagues. Targets of failed replications are justifiably upset, particularly given the inadequate basis for replicators' extraordinary claims.

Seriously, go read this dog.

This part seals it for me.

So we should take note when the targets of replication efforts complain about how they are being treated. These are people who have thrived in a profession that alternates between quiet rejection and blistering criticism, and who have held up admirably under the weight of earlier scientific challenges. They are not crybabies. What they are is justifiably upset at having their integrity questioned.

This is just so dang wrong. Trying to replicate another paper's effects is a compliment! Failing to do so is not an attack on the authors' "integrity". It is how science advances. And, I dunno, maybe this guy is revealing something about how he thinks about other scientists? If so, it is totally foreign to me. I left behind the stupid game of who is "brilliant" and who is "stupid" long ago. You know, when I was leaving my adolescent arrogance (of which I had plenty) behind. Particularly in the experimental sciences, what matters is designing good studies, generating data, interpreting data and communicating the findings as best one can. One will stumble during this process... if it were easy it wouldn't be science. We are wrong on a near-weekly basis. Given this day-to-day reality, we're going to be spectacularly wrong on the scale of an entire paper every once in a while.
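And contra the "no meaningful scientific value" bullet above, a failed replication is informative in the most ordinary Bayesian sense. Here is a back-of-the-envelope sketch; the starting confidence, power and alpha are all numbers I made up for illustration.

```python
def posterior_true(prior: float, p_fail_if_true: float, p_fail_if_false: float) -> float:
    """Bayes' rule: belief that an effect is real, after a failed replication."""
    evidence = prior * p_fail_if_true + (1 - prior) * p_fail_if_false
    return prior * p_fail_if_true / evidence

# Start out 70% confident that a published effect is real (invented figure).
belief = 0.70

# A competent, well-powered replication attempt (90% power, alpha = .05) fails.
# P(failure | real effect) = 1 - 0.90; P(failure | no effect) = 1 - 0.05.
belief = posterior_true(belief, p_fail_if_true=0.10, p_fail_if_false=0.95)
print(f"After one failed replication: {belief:.0%}")   # ~20%

# A second independent failure drops it further.
belief = posterior_true(belief, p_fail_if_true=0.10, p_fail_if_false=0.95)
print(f"After two failed replications: {belief:.0%}")  # ~3%
```

Real replications are messier than this (power is uncertain, methods drift), but the direction is not in dispute: each failed attempt moves belief about the claim. About the claim, mind you, not about anybody's character.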

This is no knock on someone's "integrity".

Trying to prevent* anyone from replicating your work, however, IS a knock on integrity.

On the scientific integrity of that person who does not wish anyone to try to replicate his or her work, that is.

*whether this be by blocking publication via reviewer or editorial power/influence, torpedoing a grant proposal, interfering with hiring and promotion or by squelching intrepid grad students and postdoctoral trainees in your own lab who can't replicate "The Effect".

28 responses so far

  • Richard says:

    I wrote a post aiming to address some of the other points in that essay here: it contains some... flawed reasoning, I would say.

  • Pinko Punko says:

    The failed replication is always suspect and therefore has no value. What does this say about the value of the initial result? And Cypress Hill plays in the background.

  • Barbara says:

    I read that whole paper earlier today. Staggering. What was with the swan argument? If somebody says there are only white swans and then you show them a black swan, you've proved them wrong. OK, but what does that have to do with experimental science? What if the white swans turned out to be domesticated ducks viewed through binoculars turned around backwards?

    But swans are a terrible example. How about a better one: if somebody says they found cold fusion and then lots of people replicate their experiment and get zero evidence of fusion, then there is just no cold fusion! The result doesn't stand. This essay made it sound like this is not the case in psychology? Really? People can't get the same results but they just ignore it and keep teaching the original paper? Tell me that's not true. Maybe it's not fair to compare physics with psychology, but if you're going to call it science, them's the rules. Sometimes another expert doesn't even have to replicate it to call bullshit. No arsenic life, no deadlier hurricanes with feminine names. Let's move on.

  • Pinko Punko says:

    Oh I see, the math for a positive result means it has infinitely more value than a negative result. Gotcha. I would love to hear Janet Stemwedel's take on this.

  • Dave says:

    I think his second point is very important, though, and massively overlooked. Purely technical issues are at the root of many replication failures, even within a single lab or experiment. You brushed that point off a little lightly, in my opinion, and went straight to the outrage.

  • Rosie says:

    @Dave - but similar technical issues could also be responsible for the apparently successful original experiment.

  • drugmonkey says:

    Dave- I agree it is easy to fail to replicate. It is one of the problems I have with the ReplicationEleventy types. I want to know how hard someone worked at it and, indeed, something about what "oh that won't matter" decisions they made about the technical approaches and design.

  • As we say in Germany: Science is hard, or we'd call it soccer.

    Failures to replicate can only ever be a flag that something *may* be wrong, even if it's just the description in the methods section - which is bad enough.

    The beautiful experiment is the one where you *do* replicate the results, fix the flaws and show that the effect is no longer there, as e.g. in this experimental psychology study:

    Gerber, B. and Ullrich, J. 1999. No evidence for olfactory blocking in honeybee classical conditioning. J. Exp. Biol. 202: 1839-1854.

    And that is, of course, more than just difficult. Nevertheless, without replication, science is just another religion: unquestioned dogma.

  • Nick says:

    >Purely technical issues are at the root of many replication failures,
    >even within a single lab or experiment.
    I would be more sympathetic to that argument if the Abstract and Discussion sections of the articles writing up the original studies --- not to mention the press release --- didn't strongly imply, or explicitly state, that the study has exciting value for the whole of humanity. "Our findings [strongly] suggest that people " is how it's typically phrased. Now, if "people" (meaning, all people, everywhere, all the time) do indeed walk more slowly when they think, or are primed with thoughts, about old age, that's worth knowing. If in fact this only applies to undergraduates from my school, in my lab, with my favourite grad student timing them, well, perhaps the NY Times wouldn't be so interested in publishing that particular result. Claiming the first when you publish, and then falling back on the second (which is effectively what happens when you try to explain away multiple replication attempts with "You must be not following the method exactly") is just trying to eat your cake in such a way that you are still in possession of said cake afterwards.

  • Dave says:

    @Rosie: yes, of course, and I should have mentioned that.

    @Nick: I agree with what you're saying, but often those kinds of statements are encouraged by journals and reviewers. If you leave them out, your work might be viewed as unimportant or, shock horror, incremental. It's part of the game.

    Press releases are a different story and I can't stand them personally. It's akin to a wide receiver celebrating uncontrollably when he catches a pass. It's our job to publish. Seems the bar for press releases is getting lower and lower as well.

  • AsianQB says:

    Your hatred of Pharma companies seems overblown. They fill a critical role, commercializing basic science into medical products, techniques and pills. So we can live through heart attacks, manage diabetes, lose weight without exercise (whenever that will happen) and the like.

    It is a failure of publicly funded research, Pharma or no Pharma, if "Publication in the highest echelons of journals" is not reproducible, and we remain blasé about it.

  • rxnm says:

    The black swan thing is philosophy-department-grade sophistry.

    What a tool.

  • The pharma study is widely quoted because it is one relatively large-scale attempt at reproducing work.

    "I am not bothered by Pharma execs who wish that public funded labs would do more advancing of intellectual property and serve it up to them part way down the traditional company pipeline. Screw them."

    Thing is, it does not just screw Pharma execs, it also screws PhD students and other young researchers who are trying to build on this flawed work. It screws research careers and research hopes. Pretty serious stuff.

    There may be specific problems in psychology but, seeing the current state of my field (let's call it "bionano"), I would not give any lessons to psychologists. I also have a foot in the stem cell field 😉

    These problems seem pretty systemic to me.

  • DrugMonkey says:

    I would suggest that if research careers are screwed b/c of one unreliable published result, everyone is doing it wrong. Major blame on the mentor, of course, wrt trainees.

  • As Richard Feynman wrote:

    "When I was at Cornell, I often talked to the people in the psychology department. One of the students told me she wanted to do an experiment that went something like this--it had been found by others that under certain circumstances, X, rats did something, A. She was curious as to whether, if she changed the circumstances to Y, they would still do A. So her proposal was to do the experiment under circumstances Y and see if they still did A.

    I explained to her that it was necessary first to repeat in her laboratory the experiment of the other person--to do it under condition X to see if she could also get result A, and then change to Y and see if A changed. Then she would know that the real difference was the thing she thought she had under control.

    She was very delighted with this new idea, and went to her professor. And his reply was, no, you cannot do that, because the experiment has already been done and you would be wasting time. This was in about 1947 or so, and it seems to have been the general policy then to not try to repeat psychological experiments, but only to change the conditions and see what happened."

    To be fair, Feynman goes on to write about how this was beginning to happen in physics, with people being turned down for accelerator time if they wanted to repeat a result, but it was interesting that psychology was his first example!

  • drugmonkey says:

    Yeah, we all know Feynman was full of bullshittio anecdotes that "objectively" capped on his perceived inferiors. Your point?

  • I'm surprised you didn't go with the cringeworthy "how to pick up girls in a bar" advice that Feynman detractors seem to like to bring up. Yes, occasionally you find uncomfortable reminders in reading Feynman that he was a WWII-generation American male, just as you get similar reminders that Darwin was an upper-class Victorian one. But reading both Feynman and Darwin is immensely informative in general despite their occasional "un-PC" comments. In particular, the rats example I gave came from Feynman's famous "Cargo Cult Science" essay -- probably the best essay on science written in the 20th century.

  • drugmonkey says:

    You can't see that it is equally based in dubious anecdote? And is therefore of zero probative value?

  • No. Because unlike the knitting girls example, the attitude of the rat-studying professor is something that we all have encountered. There's nothing dubious about it. That's why "Cargo Cult Science" is a famous essay and the phrase is even known and used by people who have never read the essay itself.

    BTW, if you read the anecdote about the girls knitting in context, he is actually making a valid point that we need to give examples in teaching math and science that students can relate to. He just did it in an unusually clumsy and sexist way (probably even for the time), and there is a reason why that essay didn't become one of his more famous ones.

  • drugmonkey says:

    He's (and you are) generalizing about all of a subfield based on some stupid anecdote when reading the actual journal articles would tell a different tale. Go read all of 1947 in a few Exp Psych journals and get back to me with cited demos of your silly assertion.

  • Riiight. As if your assertion that Cognitive and Social psychology are (in your words) "lesser" branches of psychology is no doubt based on reading all the journals in those fields rather than just knowing of a few crappy studies.

  • drugmonkey says:

    That's called "snark", genius. But yes, *informed* snark. I have indeed, in the course of my postsecondary education and career, read a few forests' worth of Psychology papers. Your trite quotation of an established assbag suggests you have not.

  • Dave says:

    Informed snark. I like that.


  • Most of the shortcomings of Mitchell's piece have already been pointed out, but here are my two cents:


  • s klein says:

    This is truly a sad, sad comment on psychological "science". I can only hope (wish?) that the critical response to Mitchell's "ideas" is representative of how psychology understands the nature of empirical work.

  • Jonathan says:

    Even when replication quite obviously disproves something, it's amusing to see BSD authors double down and refuse to own up and say "you know what? we got it wrong." For a marvelous example of this (and of science in action generally), find an hour and read through this post at Lior Pachter's blog and then the ensuing thread. And take particular note when the President's main science advisor shows up in the discussion...
