I've been meaning to pick up on a comment made by a reader over at writedit's epic thread on NIH paylines, scores and whatall. (If you want to swap war stories and score/IC payline grumbling, that is the hot place in town.) The guy was ticked off about a recent review he received and had a question:
I am an established investigator. I submitted a competing renewal ... I got a score of 40 (37 percentile). I was very shocked and disappointed to find out that my application had a preliminary score of 2.7 (which would have been fundable) but it seems one negative reviewer carried the day, and convinced others to pull down the score. I have not yet seen the comments, but if the comments have factual errors, especially from the negative reviewer, can I appeal the review and request a re-review?
Recently, as luck would have it, a loyal reader of the blog submitted the following scores, received on the review of her R01 grant proposal. Under the new scoring procedures in place since last June, these are the scores each reviewer suggests for the criteria of Significance, Investigator, Innovation, Approach and Environment. I may have slightly re-ordered specific scores for concealment purposes but this is essentially the flavor.
It really is always Reviewer #3, isn't it?
Although we have more detail in the second case, let us credit the first person's description of events as pointing to a more-or-less equivalent scenario: the appearance that two of three reviewers loved the proposal a whole lot and the third managed to torpedo it.
My first response to the specific scores would be "Congratulations! You must have written a pretty good proposal!" If you managed to get someone throwing down the 1 scores, with a 2 tossed out so they don't look like a total homer- that's good stuff. With an advocate like that pulling for your app, it is hard to make the case you got a raw deal. The way I'm looking at scores these days, the 2s and 3s of the next-fondest reviewer are pretty schweet too. This is my point about the strong advocate- it could be that this next-fondest person is just looking for a reason to improve the scores. After all, he/she could be sitting on an app that, for random reasons, he/she felt was better, and was trying to spread the scores out. Absent a strong advocate, you are stuck in the 2-3 range. With another reviewer pressing for better, you might just get two pulling toward the 1-2 range. That, in my current understanding, is the entry card for a shot at a fundable score.
But...dum..da..dum..dum what's up with that third reviewer?
4th, 3rd, tomato tomahto

Calm down, calm down. Those scores aren't all that bad, really. The allowable range goes up to 9 and there are only 1-2 point gaps across the ordered reviewers, so the score disparity isn't huge. Functionally, of course, these scores can mean all the difference in the world. Unless the odd reviewer finds that s/he just completely misread the app in some particular, there is no way s/he is going to be talked down below a 3 on those scores.
You will note at this point that I am assuming that the preliminary overall scores are somewhat related to the individual criterion scores. There is not supposed to be any specific relationship, which is a topic of rant-inducing proportion for another day. Suffice it to say that to a first approximation I am comfortable making the leap. So the initial preliminary overall scores identified by these reviewers were probably a 1 or 2 (2 unlikely), a 2 or 3 (about equi-likely) and a 3 or 4. (Here I am making the further assumption that the scores were not substantially edited after the meeting.) What are the possible post-discussion scenarios and voting outcomes?
Well, it could have stayed similar to the initial spread. Perhaps the advocate talked the middle one down, so you ended up with 1, 2, 4 or even 1, 1, 4. Maybe the middle or the bad one talked the advocate up, so it was 2, 3, 4 or worse. And then we have to make assumptions about which reviewer was most convincing to the rest of the panel. Some 20 more reviewers would be voting, typically within the post-discussion range. Did they lean to the good side, minimizing the contribution of the detested Reviewer #3? Or did they lean towards spiking the app? Were they split?
Getting back to the comment waaaay up at the top of the post, it is generally ridiculous to claim that one reviewer ruined your chances of funding in a way that is unfair or shows that the system is broken. After all, I don't ever (and I mean ever) hear anyone claiming the system is broken or screaming about appeal because of receiving an outlying score in the favorable direction.
Since you've been so patient: the reader was kind enough to relate that the app ended up with a 26 priority score (i.e., the vote averaged 2.6). So it looks like the panel voted smack dab in the middle of the range identified in the reviewers' original critiques. But so what? If it had ended up toward a 3.5 or so, we would only conclude that the "bad" reviewer was convincing to a whole panel of people. That is no flaw in the system. And if it had trended more toward a 2.1 or so? Everyone would be jumping around high-fiving each other. Except that one outlying reviewer, perhaps. But s/he shouldn't be grousing about the system either.
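For those keeping score at home, the arithmetic behind that 26 is simple: each voting panel member assigns an overall impact score, and the priority score is the panel mean multiplied by 10, rounded to a whole number. Here's a minimal sketch of that calculation; the particular vote tally below is made up purely for illustration:

```python
def priority_score(votes):
    """Priority score: mean of panel members' impact scores, times 10,
    rounded to the nearest whole number."""
    return round(10 * sum(votes) / len(votes))

# A hypothetical panel of ten, split across the 2-3 range:
votes = [2, 2, 2, 3, 3, 3, 3, 3, 3, 2]

print(priority_score(votes))  # mean is 2.6, so this prints 26
```

The point the sketch makes concrete: a single outlying vote on a panel of twenty-odd members moves the mean only a tick or two, which is why one "bad" reviewer can only sink an app by convincing everyone else.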
Final thought on reading the tea leaves. There will be additional clues in the "resume" section of the summary statement, if your SRO is any good. Best case scenario, a single issue, maybe two, identifies the core problem. Worst case, several things are mentioned without a lot of clarity; that probably means the panel did not coalesce around a single viewpoint. Sometimes this difference can help you decide what to spend the most time on during your revision process.