That study of peer review perplexes me

Apr 24 2015 Published by under Grant Review, NIH, NIH Careerism, Peer Review

I just can't understand what is valuable about showing that a 1%ile difference in voted score leads to a 2% difference in total citations of papers attributed to that grant award. All discussions of whether NIH peer review is working or broken center on the supposed failure to fund meritorious grants and the alleged funding of non-meritorious ones.

Please show me one PI who is upset that her funded 4%ile grant really deserved a 2%ile, and who takes that as evidence that peer review is horribly broken.

The real issue, how a grant overlooked by the system would fare *were it to be funded*, is actually addressed to some extent by the graph of citations to clearly outlying grants funded by exception.

This is cast as Program rescuing those rare, exceptionally brilliant proposals. But again, how do we know the ones that Program fails to rescue wouldn't have performed just as well?

23 responses so far

  • Grumble says:

    I take issue with one of the study authors' conclusions:

    "Our analysis focuses on the relationship between scores and outcomes among funded grants; for that reason, we cannot directly assess whether the NIH systematically rejects high potential applications. Our results, however, suggest that this is unlikely to be the case, because we observe a positive relationship between better scores and higher-impact research among the set of funded applications."

    Just because there is a relationship (which, as JB points out, is a SHALLOW one that accounts for only a small part of the total variability) does not mean that there aren't plenty of "high potential applications" (ones that would have produced a lot of papers and citations) that didn't get funded.

    The relationship between priority score and number of papers is completely unsurprising: grants that generate "buzz" (because they use a new technique or test a popular idea) will fund experiments that also generate buzz among paper reviewers. But if you believe that science often advances farthest when someone proposes an unexpected idea, often with very little evidence to support it, and then tests it in a clever way... well, those kinds of grants have a hard time at the NIH. The data don't capture the agonizing submit/reject/rinse/repeat cycle that begins with a great, unusual idea and only ends in funding (and publication) after years of grantwriting.

  • drugmonkey says:

    And this is exactly the way NIH is going to use this study, Grumble: to claim that the relationship for funded grants proves something about the unfunded grants, which were, for the most part, not considered. The only part that touches on this issue, the far-outlying exception-funded grants, actually questions their assumption/conclusion.

  • qaz says:

    What this study shows is that there is a correlation between score and productivity. But if you look at the actual distributions they are *incredibly* noisy. So this is confirming what we've known all along - reviewers do a reasonably good job but there's lots of noise around it. I don't think it changes anything that we've discussed on this blog for years.

    The real issue is whether NIH funding decisions should simply be "the best science available" on the table in front of them (*) or if NIH funding decisions should reflect other factors as well, such as longer term returns or diversity in the scientific enterprise or other factors.

    * Assuming the noise is irreducible, which I suspect it is not!
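    A toy simulation makes this point concrete (all numbers here are invented for illustration, not taken from the study): a real, recoverable slope of roughly 2% fewer citations per percentile point can coexist with grant-to-grant noise that leaves the score explaining only a sliver of the variance.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    n = 100_000                          # hypothetical number of funded grants
    percentile = rng.uniform(1, 20, n)   # illustrative funded-score range

    # Assume each 1-point worse percentile lowers expected log-citations
    # by ~2%, on top of large per-grant noise (both numbers are made up).
    log_cites = 4.0 - 0.02 * percentile + rng.normal(0.0, 1.0, n)

    slope, intercept = np.polyfit(percentile, log_cites, 1)
    r = np.corrcoef(percentile, log_cites)[0, 1]

    print(f"fitted slope ~ {slope:.3f} (true parameter: -0.02)")
    print(f"r^2 ~ {r**2:.4f}")  # score explains only ~1% of the variance here
    ```

    With a big enough sample the slope is estimated precisely, yet r^2 stays tiny, which is consistent with "reviewers do a reasonably good job but there's lots of noise around it."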

  • Vanya says:

    Have you seen this one?
    http://www.sciencedirect.com/science/article/pii/S0048733315000396

    -Vanya (lurker turned commenter)

  • drugmonkey says:

    Interesting, thanks for linking that one.

  • Pinko Punko says:

    What does it mean that numbers are only through 2008? Not that much relative to the current climate?

  • DJMH says:

    I agree with all your points about this study not helping us understand non-funded grants, but isn't it at least a bit of a damper to the "just randomly fund applicants" crowd?

  • drugmonkey says:

    I think most of the "just randomly fund" advocates couple that to some peer review rank cutoff, no? They just want random allocation within a subset, not across all applications. So the same problem applies to this argument.

  • drugmonkey says:

    And I really don't understand how the number of publications per grant in the new study is so different from what Berg reported at NIGMS so long ago https://loop.nigms.nih.gov/2010/09/measuring-the-scientific-output-and-impact-of-nigms-grants/

  • lurker says:

    In 2007, Mario Capecchi's Nobel Prize speech:
    In 1980, I submitted a grant application to the National Institute of Health proposing to test the feasibility of such gene targeting in mammalian cells. These experiments were emphatically discouraged by the reviewers..... Despite the rejection, I decided to put all of our effort into continuing this line of research. This was a big gamble on our part..... By 1984, we had good evidence that gene targeting in cultured mammalian cells indeed occurs (19). At this time, I submitted another grant application to the same National Institute of Health study section that had rejected our earlier proposed gene targeting experiments. Their response was "We are glad that you didn't follow our advice."

    In 2009: NYtimes: Grant System Leads Cancer Researchers to Play It Safe
    http://www.nytimes.com/2009/06/28/health/research/28cancer.html?pagewanted=all

    In 2012 Nature Commentary: Research grants: Conform and be funded
    http://www.nature.com/nature/journal/v492/n7427/full/492034a.html

    Now 2015: This Drivel: Peer-review panels select the best science proposals?
    Rockey-inspired Conspiracy?

  • datahound says:

    I think this is an important analysis largely due to the general principle that it is always wise to characterize your assay. It is important both for what it does show (that percentiles do predict to some extent subsequent productivity metrics and that this effect does not go away when you correct for a number of factors such as previous publication history, institutional affiliation, etc.) and for what it does not show (e.g. a sharp decline in productivity with increasing percentile).

    The title of the paper is quite inappropriate. The fact that the relationship persists even correcting for PI characteristics does not mean that these factors do not play a role in scoring.

    The big question regarding unfunded applications is, of course, unanswered and likely largely unanswerable without a directed experiment.

    PP: The reason for the cut-off at 2008 is the need to have time for publications and citations to accrue. In principle, one could look at narrower time windows to look for trends, but it is unclear whether that would provide any insights (but it is probably still worth trying).

    DM: The number of publications per grant is hard to judge from the paper itself. I eyeballed the curve in the paper and overestimated one of the parameters. There is more information in the supplementary information suggesting that the curves look much more like the ones from my earlier analysis. I will update the Datahound post once I get this sorted out.

  • Fastball says:

    I feel like there are gaps in this kind of analysis, even at a larger 'population level'. For example, I think the idea that "money begets money" is overlooked. If I'm well funded, can't we make the argument that (a) my pilot data set can be much more extensive and sophisticated, (b) I have a stronger cadre of staff, and (c) I am publishing more; all of which I have to assume contribute to higher percentile scores.

    If we assume that there is a pretty solid correlation between the amount of funding and the rate of publishing, and that existing funding associates with improved scores on subsequent grants, then existing funding might be an important caveat in assessing the number of publications linked to a particular grant (this assumes that grant dollars are fluid across projects).

    Does this factor into the conclusion that percentile scores are predictive of a grant's impact?

  • datahound says:

    Information about publications per grant is in the supplementary information. I have updated the Datahound post (http://datahound.scientopia.org/2015/04/23/science-article-with-an-analysis-of-nih-peer-review/ )


  • Philapodia says:

    McKnight's at it again. Apparently he doesn't read Datahound.

    http://www.asbmb.org/asbmbtoday/201505/PresidentsMessage/

    He suggests that there should be two types of NIH grants: one PI-blinded, a nuts-and-bolts type project to fulfill IC needs (read: riff-raff), and an investigator-oriented grant mechanism for the vertically ascending crowd.

    He also says that because his department has consistently chosen "winners" it shouldn't be that hard for NIH to do so as well.

  • drugmonkey says:

    That guy. At least he references Berg's analysis, but I like how he is at pains to point out the NIGMS dataset was only 360 grants. Somehow he failed to notice the followup study from NIH that covered some 55 thousand (33K scored) applications.
    https://nexus.od.nih.gov/all/2011/03/08/overall-impact-and-criterion-scores/

  • Philapodia says:

    McKnight has probably been getting triaged recently and doesn't know why, so he is catching up on what's been going on to the rest of us for years. Just like the Jedi council having no idea what's been happening under their noses. When you're flying so high it's hard to see the crap you're flying over, so we should cut the guy some slack when he seems clueless. Just kidding!

  • drugmonkey says:

    Maybe his Circular Prestige Reasoning "winners" have been struggling as well? A lot of his local survey respondents seem to be getting scores above 20, for example.....

  • Philapodia says:

    I think it could be worse than it appears. There is probably a good deal of self selection going on with these data. If someone really got a dog turd of a score, would they want to send the results to McKnight for "analysis" and look bad? I have a feeling that there is an over-representation of scored grants in this cohort. If McKnight used all of the data from CSR about his institution (I doubt he tried really hard), I have a suspicion that his institution wouldn't look as good.

  • jmz4gtu says:

    "The NIH wants the research plan to be sound but is largely unconcerned by the qualifications of the scientist. "
    -Hahaha. The NIH-apocalypse deniers at Harvard are probably saying it's because working at UT Dallas means you ARE riff-raff, which is why, predictably, he doesn't want to play up the "environment" angle in his grants.

    "If someone really got a dog turd of a score, would they want to send the results to McKnight for "analysis" and look bad?"
    -He could have just created a Google spreadsheet and let people edit it anonymously.

    So do we think his repeated attempts to cut out a special little slice of funding for "superior" researchers is hubris, self-interest, or possibly misguided idealism?

  • drugmonkey says:

    Having run across many older researchers not too dissimilar in attitude I would say that McKnight really believes he's deserving, people he likes are deserving and nobody else is deserving. Since the deserving scientists are now, finally, having a teensy bit of bother keeping their funding topped up as they like, clearly there is now a Problem to be Identified.

    He believes it.

  • Philapodia says:

    Evidently we struck a nerve with some BSDs on McKnight's post. One Marius Clore (474 pubs, h-index 112) states that he fully supports McKnight's approach. However, Clore is an NIH intramural investigator and isn't in the normal NIH review chum bucket, so should we give a rat's fart what he says?

  • jmz4gtu says:

    I think it's been mentioned on this blog before, but it is really hard to tell whether the intramural program scientists are any better than the extramural ones.
    My anecdotal observations in my field (neuronal aging) would suggest that they are not, but it could be because they don't have to promote themselves as much.

    It's not that McKnight's suggestion is inherently bad, but the way he couches it in this assumption of his own superiority makes me instantly want to reject his proposal.
