A question was recently posted on my prior post, which pointed out the disingenuous arguments made by the NIH in the wake of Hoppe et al. (2019). There are data in the supplement to Hoppe et al. that show clearly that applications with Black PIs fare worse across all five topic-cluster quintiles ranked on success rate. So the implication that the funding disparity for applications with Black PIs would disappear if they just applied in equal proportion within the more successful topics is falsified by their own data (see Fig S6).
aspirational skunk asks:
Fig S6 of Hoppe et al. (2019) seems to show that WH PIs experience funding rate advantages across all topics, be they “AA/B preferred” or not. But then the “re-analysis” by the Lauer et al. (2021) topic choice paper writes that, well aschully, “peer review outcomes were similar” and that presumably Fig S6 can be explained away by ICs with lower award rates.
I do not understand Lauer et al. (2021)’s argument. Doesn’t Fig S6 indicate that AA/B PIs experience funding rate disadvantages *within ICs* (if topic cluster is a good enough proxy for ICs)? So the fact that ICs with more AA/B applicants have lower funding rates is kinda beside the point? I’ve been looking, but isn’t it the case that Lauer et al. (2021) doesn’t show within-IC funding rates for AA/B and WH PIs? My generous takeaway from the Lauer et al. (2021) paper is that yes, disaggregating statistics is important for isolating the largest IC contributions to disparities. Am I missing something here?
My answer was that it is really hard for me to work through the various defenses and excuses of the NIH. As an example, they used some inferential statistics to show there was “no difference” in exception pay behavior in Hoppe et al. My calculation suggests that about half again as many applications with white PIs were funded at scores of 35th percentile or worse as were funded with Black PIs at 34th percentile or better. They say “look, exception pay decisions are not the issue” based on the lack of significance in their inferential analysis. To me it is overwhelmingly obvious that they are deploying a serious statistical red herring to arrive at this conclusion.
I agree with the comment that the S6 data show very clearly that Black PI apps are at a disadvantage no matter *which* topic-success quintile they fall into.
Lauer et al. (2021) is also a little funny because it focuses on all of the applications in “AAB preferred topics”. Most of the PIs of those applications are not AAB, of course. It’s just that, among the AAB PI apps, there is a preference for certain topics. By “preference” they mean that half of AAB PI applications map to 10% of the topic clusters. Which of course also means that half do not. The analysis also focuses on AAB “preferred” Institutes or Centers, which really means ICs that NIH prefers to assign those applications to. Remember, PIs can request an assignment, and in some cases the FOA makes it inevitable, but on the whole it is the NIH itself, as the funder, that decides which IC an application is or is not assigned to. If you look at Table 1, the AAB (%) column is telling. 4.66% of NINR’s applications have AAB PIs, and for NIMHD it is 14.84%. After that, the cutoff they use to decide which ICs count as AAB “preferred” is entirely arbitrary.
To make a point, I’ve taken the liberty of graphing the data from Table 1. The red squares are the ICs used in the Lauer analysis as “AAB preferred” ICs. The blue triangles are a selection of ICs which I’ve arbitrarily identified for reasons discussed below.
Beyond these two, the next three included in their analysis (Fogarty is excluded) are NIDCR at 2.03%, NIAID at 2.08% and NICHD at 3.08% of applications with AAB PIs. The next most-preferred are NIEHS 2.01%, NIDA 1.83%, NIMH 1.72%, NIDDK 1.57%, NHLBI 1.51%. It is clear that NIMHD and NINR are very far outliers and that the remaining included ICs group much better with a host of other ones in the middle of the distribution. Particularly NIEHS (asthma? lead paint? smog?), NIDA (substance use by race?) and NIMH (see Harnett, 2020), and possibly NIDDK (hello, metabolic disorders in Black folks) and NHLBI (heart attack, ditto).
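For anyone who wants to redraw something like that first graph, here is a minimal Python sketch. It uses only the AAB (%) values quoted in this post from Table 1 of Lauer et al. (2021); the x-axis positions, colors and labels are placeholders of my own, not the original figure.

```python
# Minimal sketch of the Table 1 scatter described above. Uses only the
# AAB (%) values quoted in this post; x positions are arbitrary placeholders.
import matplotlib.pyplot as plt

# ICs treated as "AAB preferred" in the Lauer et al. (2021) analysis (red squares)
preferred = {"NIMHD": 14.84, "NINR": 4.66, "NICHD": 3.08, "NIAID": 2.08, "NIDCR": 2.03}
# The next-most-preferred ICs quoted above, which fall outside the quartile cutoff (blue triangles)
excluded = {"NIEHS": 2.01, "NIDA": 1.83, "NIMH": 1.72, "NIDDK": 1.57, "NHLBI": 1.51}

fig, ax = plt.subplots()
ax.scatter(range(len(preferred)), list(preferred.values()),
           marker="s", color="red", label='"AAB preferred" ICs')
ax.scatter(range(len(preferred), len(preferred) + len(excluded)), list(excluded.values()),
           marker="^", color="blue", label="Next-most-preferred ICs")
for i, (name, pct) in enumerate(list(preferred.items()) + list(excluded.items())):
    ax.annotate(name, (i, pct))
ax.set_ylabel("Applications with AAB PIs (%)")
ax.set_xticks([])
ax.legend()
plt.show()
```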
So why did they draw the cutoff where they did? “Table 2 shows review and funding outcomes for applications according to whether the assignment was to an IC in the top quartile of proportion of applications with AAB PIs.” An analysis choice which is simply stated, unexplained and undefended. As a reminder, this is a catchment that accounts for more than a quarter of the applications with Black PIs, something around 28% by my rough count. Yet the topic analysis differs slightly: “We designted (sic) a topic as ’AAB Preferred’ if it was among the 15 topics that accounted for 50% of AAB applications.” Why? That comes to 10% of the topics, whereas if they’d used the most-preferred 25% of topics they would have captured 70% of the applications with AAB PIs. Why did they make these decisions? What impact would the use of other arbitrary cutoffs have on the overall conclusions? Who knows???
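To make the point about cutoff sensitivity concrete, here is a hypothetical sketch of the check the paper never reports. The topic-level counts are invented purely for illustration (you would need the actual Hoppe/Lauer topic data to do this for real); the point is simply that once topics are ranked by AAB applications, the fraction captured is a smooth function of the cutoff, so the choice of 10% versus 25% of topics is exactly the kind of knob that needs to be justified.

```python
# Hypothetical cutoff-sensitivity sketch. Topic-level counts are invented
# for illustration only; this is NOT the actual Hoppe/Lauer topic data.
import numpy as np

rng = np.random.default_rng(0)
n_topics = 150                                   # roughly the number of topic clusters
# invented, heavily skewed counts of AAB-PI applications per topic, sorted descending
aab_counts = np.sort(rng.pareto(1.5, n_topics))[::-1]

cumulative = np.cumsum(aab_counts) / aab_counts.sum()
for pct_of_topics in (0.10, 0.15, 0.20, 0.25):
    k = int(round(pct_of_topics * n_topics))
    print(f"top {pct_of_topics:.0%} of topics ({k} topics) capture "
          f"{cumulative[k - 1]:.0%} of AAB-PI applications")
```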
It sure smells like p-hacking to me.
After that, Lauer et al. (2021) report that for the total sample of apps (white and AAB PIs) in the “AAB preferred topics”, the merit rankings are not different from other topics, but the apps are less likely to be funded. From there it is another jump to show a disproportionate assignment of the total sample of apps in these topic domains to the poorly-funded ICs. Lauer, of course, talks about success rates at those ICs, chalks it up to “funding ecology” and does not address the fact that NIMHD and NINR get tiny shares of the NIH allocation, which is the real problem. For grins, I’ve also graphed the IC %AAB apps against each IC’s percent of the extramural allocation for Fiscal Year 2015 (the application sample is 2011-2015), as provided in Table 1 by Lauer et al. NIMHD gets 0.99% and NINR gets 0.52% of the budget, while NCI gets 18.3% and NIAID gets 16.1% by these numbers.
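That budget-versus-AAB-share comparison can be roughed out from the numbers quoted here alone. A minimal sketch (a printout rather than a plot), restricted to the ICs whose FY2015 budget share and AAB (%) are both quoted in this post:

```python
# Budget share vs. share of applications with AAB PIs, using only the FY2015
# percentages quoted in this post from Table 1 of Lauer et al. (2021).
# NCI's AAB (%) is not quoted above, so only its budget share is listed.
quoted = {
    # IC: (% of extramural allocation, % of applications with AAB PIs)
    "NIMHD": (0.99, 14.84),
    "NINR":  (0.52, 4.66),
    "NIAID": (16.1, 2.08),
}
for ic, (budget_pct, aab_pct) in quoted.items():
    print(f"{ic:6s} {budget_pct:5.2f}% of the extramural budget, "
          f"{aab_pct:5.2f}% of its applications with AAB PIs")
print("NCI    18.30% of the extramural budget")
```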

After that the paper gets into the weeds of PI race by referring to a change in the value of the regression coefficient in their probit model in Table 5. I mean…come on. There is nowhere that just shows us success rates for AAB and white PI apps within IC. Or scores for each category within topics. It all hides behind fancy statistics. Take Tables 3 and 4. They would be the ideal place to give aggregate statistics by PI race. They do not.
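The table being asked for would be trivial to produce if NIH released application-level data. A minimal sketch, assuming a hypothetical DataFrame with ic, pi_race and funded columns; nothing at this grain is actually public, and the column names are mine.

```python
# Hypothetical sketch of the simple descriptive table the paper never shows:
# success rates for AAB and WH PI applications within each IC. Assumes an
# application-level DataFrame with columns "ic", "pi_race", and "funded"
# (0/1), which is not publicly available.
import pandas as pd

def success_rates_by_ic(applications: pd.DataFrame) -> pd.DataFrame:
    """Per-IC award rate for each PI race, plus the within-IC gap."""
    rates = (applications
             .groupby(["ic", "pi_race"])["funded"]
             .mean()
             .unstack("pi_race"))          # one column per PI race
    rates["gap_WH_minus_AAB"] = rates["WH"] - rates["AAB"]
    return rates

# Example with made-up rows, purely to show the shape of the output:
demo = pd.DataFrame({
    "ic":      ["NIMHD", "NIMHD", "NIMHD", "NCI", "NCI", "NCI"],
    "pi_race": ["AAB",   "WH",    "AAB",   "WH",  "AAB", "WH"],
    "funded":  [0,       1,       1,       1,     0,     1],
})
print(success_rates_by_ic(demo))
```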
Thanks drugmonkey for your quick and thorough answer to my question. Really helpful.
“There is nowhere that just shows us success rates for AAB and white PI apps within IC. Or scores for each category within topics. It all hides behind fancy statistics. Take Tables 3 and 4. They would be the ideal place to give aggregate statistics by PI race. They do not.” How disappointing. Presumably, it would have been easier to report these numbers, rather than go through this tortured exercise of analyzing outcomes by topic cluster. Begs the question of why they didn’t go the easier, more transparent route here…
I’m okay with stacking up fancier and fancier analyses, in principle. But you have to start by reporting basic descriptive statistics that are highly relevant to the most pressing questions. Simple, easily understood statistics. And since the entire preamble leading up to this is an analysis of success rates of applications by PI race, well, it is weird they don’t start with that presentation here as well.
What’s it going to take for NIH to release those basic descriptive stats? The latest stuff seems to be funding rates per applicant. It’s weird that they don’t also put out data for success rates by proposal. Makes me wonder if the “funding rates per applicant” charts happen to make the gaps look smaller than the “success rates of applications.” Kinda sus if you ask me.
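For what it’s worth, the two measures really can tell different stories. A hypothetical illustration (all numbers invented): a per-applicant funding rate only asks whether a PI got at least one award, so one persistent PI with many unfunded applications drags the per-application rate down much more than the per-applicant rate.

```python
# Hypothetical illustration of how "funding rate per applicant" and
# "success rate per application" can diverge. All numbers are invented.
import pandas as pd

apps = pd.DataFrame({
    "pi":     ["A", "A", "A", "A", "B", "C"],
    "funded": [ 1,   0,   0,   0,   1,   0 ],
})

# Success rate per application: funded applications / all applications
per_application = apps["funded"].mean()                    # 2/6 ≈ 33%

# Funding rate per applicant: share of PIs with at least one award
per_applicant = apps.groupby("pi")["funded"].max().mean()  # 2/3 ≈ 67%

print(f"per application: {per_application:.0%}, per applicant: {per_applicant:.0%}")
```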
I agree that when they substitute a fancier analysis for a simpler one, or when they all of a sudden stop reporting one measure and replace it with another, it raises suspicions. Given their now very long history of trying to put the most NIH-defending spin on every single piece of data…well I think the null hypothesis has to be they are hiding something.
How about some folks with sufficient career capital team up and make a few FOIA requests for data by IC, and maybe also by specific award mechanism? There’s already precedent for this… see Rescuing Biomedical Research’s FOIA request for K99 awards: https://web.archive.org/web/20180723171128/http://rescuingbiomedicalresearch.org/blog/examining-distribution-k99r00-awards-race/
Or get an investigative journalist to do it. Writing a FOIA request is not complicated. It’s also more-or-less free.
Why is it always “folks with sufficient career capital”? Do they think there is some magic threshold where a PI can poke the bear without fear of detrimental consequence to their own chances?
Didn’t mean to suggest that there were no risks. There definitely are. The fears are valid. I wanted to emphasize “folks” as in plural. If there’s a group effort, maybe it helps offset the costs. I agree, it would be better for an outsider with little to lose to do it. But I would hardly expect a student to make the request… are you saying there is no threshold where the level of risk is acceptable? I think that’s going to be personal for each PI, and my guess is that those higher up the chain may feel more comfortable throwing their weight around. Hence “sufficient career capital.” What sufficient means is personal to each person.
I just think people should chime in with whatever level of engagement occurs to them and they are comfortable doing. There is far too much waiting around for someone else to speak up, and often it is with this slant of “well YOU [don’t risk anything / have more power to effect change / are more eloquent]”, as per your guess about those you perceive as “higher up the chain”. And we lose the most powerful effector of change, which is a LOT of people all saying more or less the same thing. Apathy is taken by the powers that be as endorsement. A lone voice of complaint is dismissed as a nutter.
I agree that everyone has their own limits and that’s okay. I agree that collective action by many is more powerful than the actions of a few, and that small numbers can be easily dismissed as being nutters, outliers, extremists, difficult people…
Where you lose me: If it has to happen from within the house, shouldn’t the people with the most career stability and clout lead the way, because they have the least to lose amongst everyone else? Help me understand.
In case it wasn’t clear, I wasn’t asking *you* to do anything. I’m posting here because everyone and their mother reads your blog. World knows you’ve done more than enough by blogging all these years. Posting here is my way of planting seeds, in case anyone feels like doing something about it. (Or getting somebody else to do something about it.)
Maybe I’m misreading your replies, but I feel like I poked something that I didn’t mean to. Sorry.
“In case it wasn’t clear, I wasn’t asking *you* to do anything.”
I occasionally field comments / criticisms that I should do some action or other, frequently with the implication that my online ranting is insufficient. I note that this is from people who have little knowledge of what I may or may not do or have done IRL. Nevertheless it is a common implication that someone else “should” be doing X, Y or Z. So this may color my response to your points.
“If it has to happen from within the house, shouldn’t the people with the most career stability and clout lead the way, because they have the least to lose amongst everyone else? Help me understand.”
I am just not behind the idea that anyone can say what others “should” do because of perceptions that they “have the least to lose”. Everyone thinks other people can afford something they cannot. Everyone thinks the other person has less risk than themselves. Everyone undervalues the risks and costs to others compared with their own.
Of course I wish that many, many more people would speak up about the things that I think are important in academics, in science, in our NIH-funded world. Of course I think that there are people who would have greater impact than I for saying the same things. Of course I think that there are people who face less career consequence for saying the things that I do. But I do not have the arrogance to assert their lack of risks for them.
And I am convinced that part of the reason people do not speak up is related to these notions. That someone else is better situated to have effect, so why bother. Someone else has no risk, so why bother.
“I occasionally field comments / criticisms that I should do some action or other, frequently with the implication that my online ranting is insufficient. I note that this is from people who have little knowledge of what I may or may not do or have done IRL. Nevertheless it is a common implication that someone else ‘should’ be doing X, Y or Z. So this may color my response to your points.”
That royally sucks that folks do that. I’m sorry. That’s not what I ever meant to say or imply.
“Of course I wish that many, many more people would speak up about the things that I think are important in academics, in science, in our NIH-funded world. Of course I think that there are people who would have greater impact than I for saying the same things. Of course I think that there are people who face less career consequence for saying the things that I do. But I do not have the arrogance to assert their lack of risks for them.”
Thanks drugmonkey. I agree with you mostly. I guess I come from the perspective of a student who feels repeatedly let down and disillusioned by faculty and other presumptive leaders of our discipline. “Wait until you get tenure” is always the advice when it comes to agitating for change. Yet I rarely see anyone with tenure actually using their status to do anything that would improve the material and working conditions of those less secure. (Obviously, you’re an exception.) If anything, hiring, promotion, and tenure processes only appear to weed out those people, while keeping around those who benefit from a culture of hypercompetition and individualism. So I’m frustrated. If it’s arrogant to feel that it shouldn’t rest on students like me and others to drive change in our institutions, then okay. I take your point that, on an individual level, comparing the relative risks for students/ECRs versus tenured faculty is not always black and white, but I’m speaking generally here.