citations to individual articles and reviews in Nature Neuroscience (February-December, 2005) with download statistics from our website. Downloads represented the total PDF page views for any particular manuscript within the first 90 days of being posted online (including Advance Online Publication (AOP) time).
Interesting. I've been pondering the potential value of article download stats for some time now so I'm intrigued by any investigation into such metrics. Perhaps this will be the start of a trend. (I will warn you in advance, however, not to expect an actual study as such out of this narrowly constrained slice of data.)
Noah Gray has more on his blog entry at Action Potential.
Everyone has their own pet problem with impact factors, whether it be with the calculation method, the non-reproducibility of the actual values, or the disagreement over what IFs really represent, just to name a few. Despite all of these concerns (and more), these numbers are typically used to rate the importance or prominence of a particular journal, and thus by proxy, the importance of the individual papers published within. This is a seriously flawed use of association (see a previous Nature Neuroscience editorial discussing this concept), leading scientists to often equate the total number of citations with scientific impact, which can be fraught with problems.
Indeed. He's singing my song here. Still, it IS fascinating. The stable of Nature journals can reliably be found issuing such breast-beating analyses while at the same time being active beneficiaries of, and contributors to, this "flawed use". In the case of the flagship journal... well, let's just say it takes me a while to stop laughing when I read these sorts of comments from people under the Nature umbrella. And even Noah, when pushed, admits that rather than take a highly downloaded paper in a low-Impact Factor journal:
So yes, I'd still shoot for the paper in the high impact journal and take my chances...
Moving along, I see that the editorial suggests a completely different motivation for this study, namely the propagation of incorrect citations.
One striking example is a study which suggested, on the basis of the propagation of citation errors, that about 80% of all references are transcribed from other reference lists rather than from the original source article. Such reports lead to the suspicion that most authors do not read the papers they cite, and that the papers that are the most cited are not necessarily the papers that are the most read. If true, this makes citation counting far less significant and calls into question the accuracy of referencing in the literature. Moreover, a practice of 'abstract citation' in lieu of reading the full article or citing papers based on reference lists is particularly problematic.
I dunno. I just don't think their experimental design could really address this question. They seem to be saying that the more people read the paper the more cites it gets. No duh. But wouldn't the erroneous citations develop over a lengthy period of time? And be much less likely to occur with more recent work? Even the correlation they present between PDF downloads and citations disappears as the window for analysis goes out past 90 days post-publication.
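For readers who want to see concretely what kind of analysis is being criticized here, the editorial's approach boils down to a Pearson correlation between early download counts and later citation counts, recomputed over different time windows. Here's a minimal sketch of that calculation. To be clear: the per-paper numbers below are entirely made up for illustration; they are NOT the Nature Neuroscience data.

```python
# Sketch of a windowed download-vs-citation correlation, of the sort
# the editorial reports. All figures are invented placeholders, not
# the journal's actual statistics.

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient (no external libraries)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-paper stats: PDF downloads within the first 90 days
# online, and citations accumulated by some later cutoff.
downloads_90d = [1200, 450, 3100, 800, 2200, 650]
citations = [15, 4, 40, 9, 28, 7]

r = pearson_r(downloads_90d, citations)
print(f"r = {r:.3f}")
```

Note what this does and does not tell you: a strong r in the 90-day window says popular papers get both read and cited, which is the "no duh" point above. It says nothing about whether any individual citation was made by someone who actually read the paper, which is the question the erroneous-citation argument actually needs answered.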
Is this a musty old mouldering straw man they've erected? How big is this problem anyway? If erroneous citation is really rare, no correlation analysis is going to pick up anything useful, is it?
I was thinking about the ways in which my papers get cited incorrectly and the times that I notice erroneous citation of other work. Occasionally I'll run across a just plain incorrect citation, where the cited article has nothing to do with the apparent point made in the citing article. This is, however, incredibly rare, and of next to zero impact because journal searching is so excellent and readily accessible. So a flat-out blown cite is very rare and a minor annoyance.

Another minor annoyance is along the lines of a more-or-less appropriate citation that is unscholarly, for want of a better term: one that is not the first, best or most appropriate citation for the point being made. But... this is in the eye of the beholder. Hard to call it an error, really.

The thing that really chaps me about bad citing practices (or, more properly I suppose, a failure to actually read the stuff you are citing) is when some concept based on a very weak dataset gets enshrined in the literature as canonical. The problem is that when you then try to publish a much more extensive followup, you are left arguing uphill with your better data against poorer, albeit earlier, data. That's annoying. Even worse is when grant reviewers don't agree one should be funded to, in part, re-do the investigations better, since "we already know that". No. We. Don't! (...whoops. de-rant, DM, de-rant.)
So getting back to the point, I'm just not getting a good feel for the scope of the erroneous-citation problem. Still, good ol' Noah brings it home in the end:
We realize that this analysis is enticing at best, potentially providing a piece of an alternative solution for deciphering the impact of an individual paper. In this current scientific climate where tenure and grant funding decisions are influenced by flawed metrics like impact factor, it is important to make good use of all available technology in an attempt to realize a better system of measuring the scientific impact of any particular paper. This analysis is obviously preliminary and flawed in its own ways, but perhaps metrics such as paper downloads can find a place in a compilation of aggregated stats, painting a more accurate and informative picture of manuscript influence.
"Make good use of all available technology". Yep. When grants and people's careers are on the line, this works for me.