# Datahound on productivity

This final figure from Datahound's post on K99/R00 recipients who have managed to win R01 funding is fascinating to me. This is a plot of individual investigators, matching their number of published papers against a weighted sum of publications. The weighting accounts for the number of authors on each paper, as follows: "One way to correct for the influence of an increased number of authors on the number of publications is to weight each publication by 1/(number of authors) (as was suggested by a comment on Twitter). In this scenario, a paper with two authors would be worth 1/2 while a paper with 10 authors would be worth 1/10."

Doing this adjustment to the unadjusted authors/papers relationship tightens up the relationship from a correlation coefficient of 0.47 to 0.83.
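For anyone who wants to try this on their own numbers, here is a minimal sketch of the 1/(number of authors) weighting and a Pearson correlation between raw and weighted counts. The investigators and author counts are made up for illustration, and the helper names `weighted_count` and `pearson_r` are my own; none of this is Datahound's data or code.

```python
import math


def weighted_count(author_counts):
    """Sum of 1/(number of authors) over one investigator's papers."""
    return sum(1 / n for n in author_counts)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical investigators: each list holds the author count of every paper.
investigators = {
    "A": [2, 2, 3],
    "B": [10, 10, 10, 10, 10, 10],
    "C": [4, 5, 4, 6, 5],
    "D": [3, 8, 8, 9, 12, 10, 7, 8],
}

paper_counts = [len(papers) for papers in investigators.values()]
weighted_counts = [weighted_count(papers) for papers in investigators.values()]

# Correlation between raw paper count and author-weighted paper count.
r = pearson_r(paper_counts, weighted_counts)
print(f"r = {r:.2f}")
```

With real data you would replace the `investigators` dictionary with one list of per-paper author counts for each PI.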

Ultimately this post shows pretty emphatically that when you operate in a subfield or niche or laboratory that tends to publish papers with a lot of authors, you get more author credits. This even survives the diluting effect of dividing each paper by the number of authors on it. There are undoubtedly many implications.

I think the relationship tends to argue that increasing the author number is not a reflection of the so-called courtesy or guest authorships that seem to bother a lot of people in science. If you get more papers produced, even when you divide by the number of authors on each paper, then this tends to suggest that authors are contributing additional science. The scatter plots even seem to show a fairly linear relationship so we can't argue that it tails off after some arbitrary cutoff of author numbers.

Another implication is for the purely personal. If we can generate more plots like this one across subfields or across PI characteristics (there may be something odd about the K99/R00 population of investigators for example), there may be a productivity line against which to compare ourselves. Do we (or the candidate) have more or fewer publications than would be predicted from the average number of authors? Does this suggest that you can identify slackers from larger labs (that happen to have a lot of pubs) and hard chargers from smaller labs (that have fewer total pubs, but excel against the expected value)?

• Dave says:

As someone who has a lot of co-authored publications, and has been criticized for it, I'm happy to see this. I think it is very important for young investigators to beef up their resume by getting involved in many projects, which often leads to lasting collaborations and grant support down the road that helps pay the bills. I have contributed to every single co-authored pub of mine and can talk intelligently about each of them. The trick, of course, is finding the right balance, and that definitely is not an easy thing for a young investigator.

• Established PI says:

I would say proceed with caution. There are likely to be rather different views of this practice depending upon your particular subfield, and you had best be familiar with the prevailing view before you invest too much of your time on co-authorships and co-coauthorships. When evaluating a CV, I am sometimes suspicious of too high a ratio of co-authorships, particularly when it is with many different groups (i.e. not a clear ongoing collaboration). There can be excellent reasons for many different co-authorships; the person may have a particular expertise or technique that is in high demand and beneficial to many projects, for example. But it can be a negative if it seems that the person in question either has no ideas of their own or has made a career out of piggybacking on other publications in order to boost their numbers.

• Geo says:

Get yourself some first authorships and have as few co-authors as possible. Publish in high impact journals.

In my field, all that really matters is first and last authorship.

Oh, and if you are first or last author, the number of other authors doesn't at all dilute your credit.

• Juan Lopez says:

If the courtesy authorship worked both ways, shouldn't the correction be 1/authors^2?

It is frustrating to read about people defending "only first and last count" and "have as few coauthors as possible". This is what feeds the fighting and jealousy that poison collaborations and encourages abusive PIs to "forget" to include people, often students.

• drugmonkey says:

I think CPP is saying inflate middle authors all you want because they don't affect the first and last authors' credit. No need to forget anyone, the more the merrier.

I'm not "defending" anything. I'm simply stating facts about how authorship is viewed in my field when assessing scientists.

• qaz says:

I don't know what subsubfield of bunny hopping CPP lives in, but in my observation, in neuroscience (say in hiring decisions), the value of middle authorship is clearly greater than 0.

In my observation, you'll get credit for middle authorships on your CV if you have sufficient first authorships to show that you can drive too. So, for example, if you have 5 middle author papers, you've got nothing. If you've got 5 middle author papers and 2 single author papers, you've got 7 papers.

• drugmonkey says:

Agreed with the way you put it, qaz.

"If you've got 5 middle author papers and 2 single [assume you mean first- or last-] author papers, you've got 7 papers."

If you've got 5 middle author papers and 2 first- or last-author papers, then yes, you've got 7 papers. But the 2 first- or last-author papers carry a *vastly* greater weight in assessing a scientist's contributions to the field than *any* number of middle-author papers, which may carry some weight, but only in the margins. This is true for hiring, grant review, and promotion/tenure.

My attitude towards middle author papers changed once I was sufficiently established. At this point I'm known (in my little corner of science) for certain things, so when I'm on a paper with other PIs, it's pretty clear what my contribution was regardless of where my name appears.

It's a bit different when you're a starving postdoc or a struggling junior PI and need to prove that you're not a technician.

• datahound says:

DM: My post was poorly worded in some ways. The correlation coefficient between the number of publications and the average number of authors is 0.47. The correlation coefficient between the number of publications and the number of publications weighted by 1/number of authors was 0.83. It is not surprising that the correlation between the number of publications and a weighted number of publications is reasonably high.

I do agree, however, that if the sole factor for having more publications is having more authors per publication (e.g. twice as many publications each with twice as many authors), then this plot would be flat. The fact that it has a significant slope indicates that this is not the only factor.
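A quick arithmetic check of that flat-line case, as a sketch with made-up numbers (the helper name `weighted_total` is hypothetical): doubling both the paper count and the authors per paper leaves the weighted count unchanged.

```python
def weighted_total(n_papers, authors_per_paper):
    """Weighted publication count when every paper has the same author count."""
    return n_papers * (1 / authors_per_paper)


base = weighted_total(10, 4)     # 10 papers, 4 authors each
doubled = weighted_total(20, 8)  # twice the papers, twice the authors each

print(base, doubled)
```

If more authors per paper were the only driver of more papers, every investigator would land at the same weighted value, which is the flat plot described above.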

• Anonymous says:

"Does this suggest that you can identify slackers from larger labs (that happen to have a lot of pubs) and hard chargers from smaller labs (that have fewer total pubs, but excel against the expected value)?"

Don't people kind of do this already? If you know the labs and how much they publish, don't you adjust how you view someone's record depending on whether they come from a high-publishing lab or not? Especially when there are, in general, more than 5/6 authors per paper? I know I sure do....

• rxnm says:

I agree first/last are the default and appropriate first pass at an individual's productivity, and middle author is kind of meh here.

But I'd add that, in my field, I would consider a lack of middle-author pubs to be a red flag, or at least weird enough to stand out and be discussed. You've spent ~10 years in 2-3 labs and haven't contributed (or been invited to contribute) to anyone else's projects?

• DrugMonkey is an NIH-funded researcher who blogs about careerism in science. And occasionally about the science of drug use.