Much more important than the change in application format (and implemented with absolutely no community input that I am aware of) is the new study section policy under which applications are discussed in the order of the preliminary scores, starting with the best and stopping when about 40% of the apps have been discussed or time runs out. This gives even more power to the assigned reviewers, as there is no longer even lip service given to the decision to triage, and no opportunity for a non-assigned reviewer to rescue an application from triage.
Prior to this new initiative, applications were reviewed by the study section in an order that did not depend on the initial priority score. This always seemed like a good thing to me. My thinking was based on the generic idea that randomizing conditions would prevent any consistent biases related to review order. The underlying hypothesis being, on reflection, that the discussion of a given application would be influenced by the discussion of the prior application(s) and by the timing within the two days allocated for discussion. (Would you request that your application be reviewed at the end of the first long day?)
The new procedure is to review grants (grouped by mechanism or type) in the order of the initial priority score. CPP apparently thinks this is a bad thing.
As a reminder, research grant applications to the NIH are generally assigned to 3 reviewers who supply detailed commentary and analysis and a priority score one week before the committee meeting. The average of these three scores becomes the initial priority score. In the course of discussion (led by the assigned reviewers but with input from the rest of the ~20-30 member panel), the three reviewers may adjust their views in supplying the post-discussion scores, which then define the post-discussion range. The entire panel then votes priority scores, generally within the post-discussion range but with variation permitted*. The average of the entire panel's votes becomes the final priority score for the application.
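For the arithmetically inclined, the flow above can be sketched in a few lines. This is just a toy illustration of the averaging steps I described; all the numbers are invented, and the 1.0 (best) to 5.0 (worst) scale is an assumption on my part, not something specified in this post.

```python
# Toy sketch of the scoring flow: three assigned reviewers set the initial
# priority score, discussion sets the post-discussion range, and the full
# panel's votes set the final score. All scores here are made up.

def mean(scores):
    return sum(scores) / len(scores)

# Three assigned reviewers submit independent scores a week before the meeting.
initial_reviewer_scores = [1.5, 2.0, 2.5]
initial_priority = mean(initial_reviewer_scores)  # this average drives the new discussion order

# After discussion, the assigned reviewers may adjust their scores;
# the adjusted scores define the post-discussion range.
post_discussion_scores = [1.4, 1.8, 2.2]
post_range = (min(post_discussion_scores), max(post_discussion_scores))

# The whole panel then votes, generally within that range (voting outside
# it must be declared); the final priority score is the panel average.
panel_votes = [1.5, 1.6, 1.8, 2.0, 2.1, 1.7, 1.9, 2.0]
final_priority = mean(panel_votes)

print(initial_priority)   # average of the three assigned reviewers
print(post_range)         # bounds for the panel vote
print(final_priority)     # what the application actually receives
```

Note that nothing in this arithmetic forces the final score to stay near the initial one; the whole point of the post is the routes by which those two numbers come apart.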
Obviously, the opinion of the three assigned reviewers has a very influential bearing on the ultimate score of the application. There are essentially three ways in which the outcome can be moved away from the approximate middle of the range set by the three assigned reviewers' initial (and independent) evaluations.
First, the panel can side strongly with one reviewer over another, or out-and-out revolt against all three reviewers when it comes time to vote. I have seen the latter happen, btw. Infrequently, yes, but sometimes a large number of panel members indicate they are voting outside of the post-discussion range.
Second, reviewers can be significantly swayed from their pre-discussion scores during the course of discussion. I would hate to put any estimate on the frequency or the magnitude of such swaying, but it does happen fairly often. Based on my experiences on panels, anyway.
Third, the reviewers can be influenced during the "read" phase in the week prior to the meeting, when all the initial scores and critiques are made available online. Reviewers may be influenced either by the other reviewers' critiques and scores for the application in question or by a general re-calibration that comes from reading the criticisms and scores for other applications assigned to the panel.
I do not have any firm way to determine how often these shifts from the very initial, independent priority scores of the three assigned reviewers occur. All I can offer is my subjective experience of "quite frequently". And as to whether modification from the initial stance made the essential qualitative difference (funded, not funded)? We're not permitted to keep notes, so I have to rely on memory, but I'd say heck yes. In both directions. (Including in a more indirect way: i.e., one application being scored a few score of points better, which likely positioned the subsequent revision for a fundable score.)
The present drive to review the applications in the order of initial priority (best to worst) has the potential to minimize score movement**, in my view. So I agree with CPP. How so?
The SRO has to finalize the review order several days in advance of the meeting so that the attached program officers can be notified of approximately when to listen in on the discussion. So any movement of the assigned reviewers during the latter part of the read phase cannot affect the ordering. Therefore, if people start adopting the mindset that the review order matches the initial priority scores, the other panel members are going to have an internal ranking that is not necessarily sensitive to the actual pre-discussion scores as (potentially) altered from the initial, independent scores.
The next issue gets at CPP's speculation, and here I am talking out of my hat. The ordering of review makes it obvious to all where a given application under discussion stands. If you are down at about the 30th percentile of discussed apps, it is unlikely that anything you say is going to move that sucker into the fundable range. This increases the "why bother" demotivation. Once you get past about 2 pm of the first day, you have this demotivator correlated with the low-blood-sugar and exhaustion demotivators. I just think that the prior discussion ordering, which did not pay attention to initial priority score, permitted these factors to be less influential. Sure, one can keep a running mental tally of scores and score ranges, but this takes some mental effort. It will not be automatically clear that the range being discussed puts the present app at the 25th percentile versus the 20th percentile... perhaps critical if you think your comments may sway the panel about 5 percentile points in the good direction. A move from the 25th to the 20th percentile is probably meaningless, but 20th to 15th may make all the difference in the world of programmatic pickups. It might make the difference for the revised application as well... even though reviewers are not supposed to anchor on the priority score of the prior version, this is hard to escape.
As CPP notes, this is one of those review things that was just put in place without much announcement or discussion. Kind of like a prior push to sharply reduce the number of assistant professors on panels. It contrasts in that way with certain other initiatives that were discussed extensively or at least announced with great fanfare and rationale. One hopes that it was based on a good reason and that the limitations, such as those I outline here, were considered.
*The intent to vote outside the range has to be declared.
**There have been some mutterings over the past couple of years about doing away with discussion altogether, essentially sticking with the initial scores of a limited set of reviewers. It is not inconceivable to me that the ordered-discussion move is actually intended to minimize movement of scores through discussion.