In a footnote to a prior post I noted that a single grant reviewer was unlikely to have a very large impact on the fate of a specific NIH grant proposal. I've been thinking about this in terms of one of the more technical aspects of grant review as conducted by the NIH study section: voting outside of the range.
As a very brief overview, the NIH grant application is assigned to three reviewers who assess the merit, write a critique and submit a suggested score prior to the study section meeting. Permissible scores range from 1.0 (the most meritorious) to 5.0 (the least meritorious). Only the applications for which the three assigned scores average in the top X% of the entire panel's allocation for a given round are discussed at the actual meeting*. (At present the X is running about 40% in my section and, from what I hear, in many other normal CSR sections.) If an application is to be discussed, the three reviewers start by declaring these preliminary scores, discuss the application and then finish by declaring their post-discussion scores.
Following the discussion and declaration of post-discussion scores from the three reviewers, the entire panel votes a score for the application. The panel is expected to vote scores within the post-discussion range. In some cases, however, other members of the panel may choose to vote a score that is either better or worse than the range. Any member of the panel is free to do so. The CSR instructs reviewers (through the SRO) that they must declare their intentions if they are going to vote outside of the range.
I've seen some variability on this within one panel. Sometimes it seems the expectation is any deviation requires a declaration. Then we were told 0.2 outside was the criterion. The point of this declaration seems to be that the CSR wants to make sure that any points that are relevant for the decision of scientific merit have been discussed openly. Makes a certain sense that you don't want part of the panel to be voting on one issue that only they have thought about. There also appears to be an error-correction role in the event that a reviewer puts an unintended score down for a given application.
In practical terms, I have seen outside-the-range declarations occur maybe 5 times per meeting max and in (to my recollection) all cases the reviewers in question felt that the issues had been raised during the discussion; they were just coming to a different conclusion regarding the merit score. I should note that I see this happening in both the more-meritorious and less-meritorious directions. Now I am hearing that there may be a new directive coming down from on high insisting on additional formality and back-checking each outside-the-range score to make sure it was actually intended and a reason supplied.
This kind of thing gets me wondering why. What is the magnitude of the problem and what are the sources of complaint? Are there really that many unintended scores gumming up the works and having a categorical effect on the score of an application? Is this part of answering suspicions that covert reviewer behavior is torpedoing (or saving) grants?
I scared up a very brief invented score set to give me a toehold. No doubt there are fancy dummy analyses that can be run but don't look at me for that. I started with a 25-member panel in which the post-discussion range was 1.3 - 1.5, which resulted in ten 1.3 votes, six 1.4 votes and nine 1.5 votes. This would average to 1.4 (1.396 before rounding), resulting in a 140 priority score on the summary statement.
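For anyone who wants to check the baseline arithmetic, here is a minimal Python sketch. It assumes the priority score is simply the mean panel vote times 100, rounded to the nearest integer, which is how the 1.4-to-140 conversion reads here. Votes are stored in priority-score units (130 rather than 1.3) to keep the sums in integers, and the function name `priority_score` is mine, not anything official.

```python
# Baseline panel from the example: 25 votes inside the 1.3 - 1.5 range.
# Votes are stored in priority-score units (1.3 -> 130) to avoid float noise.
baseline_votes = [130] * 10 + [140] * 6 + [150] * 9


def priority_score(votes):
    """Mean vote, rounded to the nearest integer (assumed rounding rule)."""
    return round(sum(votes) / len(votes))


print(len(baseline_votes))                         # 25 panel members
print(sum(baseline_votes) / len(baseline_votes))   # unrounded mean
print(priority_score(baseline_votes))              # 140
```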
Suppose one person assigns an out-of-range 180 (that is, a 1.8 vote): priority goes to 141.
Okay, how about if four people were feeling a little more negative at 180? 144.
Sure, this score difference can be critical, but it really isn't that big of an effect compared with some other grant review / grant preparation issues/mistakes. Furthermore, I must note that while it is not uncommon for multiple people to say they are going outside the range, I can't recall hearing more than four for a single application.
Now, how about two real jerks who take it to the triage line of 250? 148.
Much more likely to be a qualitative effect on outcome. Now I have little insight beyond myself as to how far outside reviewers will go in a situation like this but based on a feel for the panel members, I would expect this to be rare. They are not generally mad and punitive in these situations. Just feeling that the discussion and critiques did not match the eventual scores or were out of whack with other discussed applications in that round or something. This results in modest changes in assigned scores, I would think. Still, it is possible that I am too optimistic in this regard.
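The worse-than-range scenarios can be reproduced with a short Python sketch. One assumption is needed to match the quoted numbers: the panel stays at 25 members and each outlier replaces one of the 1.5 votes (the voters already at the pessimistic end of the range). The helper names are hypothetical, and the rounding rule (mean vote in priority-score units, nearest integer) is likewise assumed.

```python
def priority_score(votes):
    """Mean vote in priority-score units, rounded to the nearest integer."""
    return round(sum(votes) / len(votes))


def with_outliers(votes, replaced_vote, outlier_vote, count):
    """Return a copy of votes with `count` copies of replaced_vote swapped out."""
    swapped = list(votes)
    for _ in range(count):
        swapped.remove(replaced_vote)   # drop one in-range vote
        swapped.append(outlier_vote)    # cast the out-of-range vote instead
    return swapped


# Votes in priority-score units (1.3 -> 130): ten 1.3, six 1.4, nine 1.5.
baseline = [130] * 10 + [140] * 6 + [150] * 9

# Assumption: each pessimistic outlier was one of the 1.5 voters.
one_at_180 = priority_score(with_outliers(baseline, 150, 180, 1))
four_at_180 = priority_score(with_outliers(baseline, 150, 180, 4))
two_at_250 = priority_score(with_outliers(baseline, 150, 250, 2))

print(one_at_180, four_at_180, two_at_250)  # 141 144 148
```

If the outliers are instead treated as extra voters added on top of the 25, the four-at-180 case comes out at 145 rather than 144; the other two are unchanged.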
Harder to move in the good direction, of course. Two 100 (perfect) scores move the average to 137, four of them to 135. And I think it would be vanishingly rare for four 100 votes to result from a post-discussion low end of 130!
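The better-than-range numbers work out the same way in a quick Python sketch, this time assuming each 100 vote replaces one of the 1.3 votes (the voters already at the optimistic end). The last two lines are my own illustration of why the good direction is harder: with a floor of 100 and a ceiling of 500 on the scale, a single vote simply has less room to pull the 25-member mean down than to push it up. All names are hypothetical.

```python
def priority_score(votes):
    """Mean vote in priority-score units, rounded to the nearest integer."""
    return round(sum(votes) / len(votes))


def swap_in_perfect(votes, count):
    """Replace the `count` best in-range votes with a perfect 100 (assumed swap)."""
    swapped = sorted(votes)             # the 130 votes sort to the front
    swapped[:count] = [100] * count
    return swapped


# Votes in priority-score units (1.3 -> 130): ten 1.3, six 1.4, nine 1.5.
baseline = [130] * 10 + [140] * 6 + [150] * 9

two_perfect = priority_score(swap_in_perfect(baseline, 2))
four_perfect = priority_score(swap_in_perfect(baseline, 4))
print(two_perfect, four_perfect)  # 137 135

# Largest shift a single vote can cause in the unrounded mean:
panel_size = len(baseline)
best_case_pull = (130 - 100) / panel_size    # a 1.3 voter going to 1.0
worst_case_push = (500 - 150) / panel_size   # a 1.5 voter going to 5.0
print(best_case_pull, worst_case_push)
```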
Okay, I realize the number of permutations of score ranges and hypothesized out-of-range votes is enormous. And you are free to conduct some similar analyses with your own "plausible" scenarios. But at least for me I am satisfied that the qualitative impact of outlier votes on eventual grant disposition (funded/not funded) is not huge.
This reiterates for me that when one suspects that one voter (not an assigned reviewer, that's another question) is ruining one application's chances by a covert out-of-range vote, well, this seems unlikely to happen. Among other issues, it should be clear that CSR pays attention to out-of-range voting. This analysis does little, however, to comfort me as to why there needs to be an increased focus on this issue. Now, it may be the case that there are some unusual shenanigans going on, because my analysis suggests to me that it takes something pretty egregious, either a mini-conspiracy of like-minded voters or a very divergent score, to have an effect. If so, there are some serious problems with the probity of the people reviewing. Alternatively, it may be that this effort is resulting more from heat (senior PI complaints) than from light (CSR reviewing data). In which case, as always, I would like to see the data before we do anything too drastic**.
*Any panel member can insist on discussion of any proposal, no matter how dismal the score. There are also some additional niceties having to do with nominating or de-nominating an application for discussion or triage/streamlining. Also, in some cases there may be a different number of reviewers assigning scores.
**I don't like the idea of increasing the social burden for voting outside the range, myself. It occurs rarely enough that I would think that there would already be a hurdle, especially for the less-experienced reviewer. I mean, think about it, your first time on a panel and nobody has voted outside the range on the first dozen or two applications? Are you going to be the one? After receiving a set of severe-sounding instructions from the SRO? Ha!