Archive for the 'Conduct of Science' category

#icanhazpdf and related criminal behavior

I was slow to start watching "Better Call Saul" for various reasons. Partially because I still haven't finished "Breaking Bad", partially because I couldn't see *that* as being the spinoff character and partially because I just hadn't gotten around to it. Anyway, the show is about a lawyer who we know from BB becomes deeply involved with criminal law.

There's a point in Season 1 where one character has a heart-to-heart with another character about the second person's criminal act.

"You are a criminal."

He then goes on to explain that he has known good guy criminals and bad guy cops and that, at the end of the day, committing a crime makes you a criminal.

Anyway, dr24hours has some thoughts for those criminal scientists who think they are good guys for illegally sharing PDFs of published journal articles.

22 responses so far

Manuscript acceptance based on perceived capability of the laboratory

Dave asked:

I think about it primarily in the form of career stage representation, as always. I like to get reviewed by people who understand what it means to me to request multiple additional experiments, for example.

and I responded:

Are you implying that differential (perceived/assumed) capability of the laboratory to complete the additional experiments should affect paper review comments and/or acceptance at a particular journal?

I'm elevating this to a post because I think it deserves robust discussion.

I think that the assessment of whether a paper is 1) of good quality and 2) of sufficient impact/importance/pizzazz/interest/etc for the journal at hand should depend on what is in the manuscript. Acceptance should depend on the work presented, for the most part. Obviously this is where things get tricky because there is a critical difference here:

This is the Justice Potter Stewart territory, of course. What is necessary to support and where lies the threshold for "I just wanna know this other stuff"? Some people have a hard time disentangling their desire to see a whole 'nother study* from their evaluation of the work at hand. I do recognize there can be legitimate disagreement around the margin but....c'mon. We know it when we see it**.

There is a further, more tactical problem with trying to determine what is or is not possible/easy/quick/cheap/reasonable/etc for one lab versus another lab. In short, your assumptions are inevitably going to be wrong. A lot. How do you know what financial pressures are on a given lab? How do you know, by extension, what career pressures are on various participants on that paper? Why do you, as an external peer reviewer, get to navigate those issues?

Again, what bearing does your assessment of the capability of the laboratory have on the data?

__
*As it happens, my lab just enjoyed a review of this nature in which the criticism was basically "I am not interested in your [several] assays, I want to see what [primary manipulation] does in my favorite assays" without any clear rationale for why our chosen approaches did not, in fact, support the main goal of the paper which was to assess the primary manipulation.

**One possible framework to consider. There are data on how many publications result from a typical NIH R01 or equivalent. The mean is somewhere around 6 papers. Interquartile range is something like 3-11. If we submit a manuscript and get a request to add an amount of work commensurate with an entire Specific Aim that I have proposed, this would appear to conflict with expectations for overall grant productivity.

26 responses so far

On sending trainees to conferences that lack gender balance

Neuroscientist Bita Moghaddam asked a very interesting question on Twitter but it hasn't gotten much discussion yet. I thought I'd raise it up for the blog audience.

My immediate thought was that we should first talk about the R13 Support for Scientific Conferences mechanism. These are often used to provide some funding for Gordon Research Conference meetings, for the smaller society meetings and even some very small local(ish) conferences. Examples from NIDA, NIMH, NIGMS. I say first because this would seem to be the very easy case.

NIH should absolutely keep a tight eye on gender distribution of the meetings supported by such grant awards. The FOA reads, in part:

Additionally, the Conference Plan should describe strategies for:

Involving the appropriate representation of women, minorities, and persons with disabilities in the planning and implementation of, and participation in, the proposed conference.
Identifying and publicizing resources for child care and other types of family care at the conference site to allow individuals with family care responsibilities to attend.

so it is a no-brainer there, although as we know from other aspects of NIH the actual review can depart from the FOA. I don't have any experience with these mechanisms personally so I can't say how well this particular aspect is respected when it comes to awarding good (fundable) scores.

Obviously, I think any failure to address representation should be a huge demerit. Any failure to achieve representation at the same, or a similar, meeting ("The application should identify related conferences held on the subject during the past 3 years and describe how the proposed conference is similar to, and/or different from these."), should also be a huge demerit.

At least as far as this FOA for this scientific conference support mechanism goes, the NIH would appear to be firmly behind the idea that scientific meetings should be diverse.

By extension, we can move on to the actual question from Professor Moghaddam. Should we use the additional power of travel funds to address diversity?

Of course, right off, I think of the ACNP annual meeting because it is hands down the least diverse meeting I have ever attended. By some significant margin. Perhaps not in gender representation but hey, let us not stand only on our pet issue of representation, eh?

As far as trainees go, I think heck no. If my trainee wants to go to any particular meeting because it will help her or him in their careers, I can't say no just to advance my own agenda with respect to diversity. Like it or not, I can't expect any of them to pay any sort of price for my tender sensibilities.

Myself? Maybe. But probably not. See the aforementioned ACNP. When I attend that meeting it is because I think it will be advantageous for me, my lab or my understanding of science. I may carp and complain to certain ears that may matter about representation at the ACNP, but I'm not going on strike about it.

Other, smaller meetings? Like a GRC? I don't know. I really don't.

I thank Professor Moghaddam for making me think about it though. This is the start of a ponder for me and I hope it is for you as well.

16 responses so far

Second Thought on Glamour Pr33p P33ple

Feb 18 2016 Published under Conduct of Science, Science Publication

If establishing the priority of scientific observations or findings is so important, another thing these people should be doing, tomorrow, is to cite conference abstracts in their papers.

It was not so long ago in the neurosciences that citations of Society for Neuroscience Abstracts would appear in archival reports.

We can return to this and it would go a long way towards documenting the chronology (I am working up an antipathy to "priority", folks) of an area of scientific work.

13 responses so far

Treat your published papers as those of your competitors

Feb 18 2016 Published under Careerism, Conduct of Science

No scientist should *ever* be afraid to publish a finding that contradicts their prior publications.

9 responses so far

Thought on Glamour Pr33p P33ple

Feb 18 2016 Published under Conduct of Science, Science Publication

I do not know why they don't just submit their stuff to JIF 3 journals.

Everything would be "accept, no revisions and can we get you a coffee Professor?"

All their supposed problems would be solved.

26 responses so far

Amgen continues their cherry picking on "reproducibility" agenda

Feb 05 2016 Published under Conduct of Science, Replication, ReplicationCrisis

A report by Begley and Ellis, published in 2012, was hugely influential in fueling current interest and dismay about the lack of reproducibility in research. In their original report the authors claimed that the scientists of Amgen had been unable to replicate 47 of 53 studies.

Over the past decade, before pursuing a particular line of research, scientists (including C.G.B.) in the haematology and oncology department at the biotechnology firm Amgen in Thousand Oaks, California, tried to confirm published findings related to that work. Fifty-three papers were deemed 'landmark' studies (see 'Reproducibility of research findings'). It was acknowledged from the outset that some of the data might not hold up, because papers were deliberately selected that described something completely new, such as fresh approaches to targeting cancers or alternative clinical uses for existing therapeutics. Nevertheless, scientific findings were confirmed in only 6 (11%) cases. Even knowing the limitations of preclinical research, this was a shocking result.

Despite the limitations identified by the authors themselves, this report has taken on a life of truthy citation as if most biomedical science reports cannot be replicated.

I have remarked a time or two that this is ridiculous on the grounds the authors themselves recognize, i.e., a company trying to skim the very latest and greatest results for intellectual property and drug development purposes is not reflective of how science works. Also on the grounds that until we know exactly which studies and what they mean by "failed to replicate" and how hard they worked at it, there is no point in treating this as an actual result.

At first, the authors refused to say which studies or results were meant by this original population of 53.

Now we have the data! They have reported their findings! Nature announces breathlessly that Biotech giant publishes failures to confirm high-profile science.

Awesome. Right?

Well, they published three of them, anyway. Three. Out of fifty-three alleged attempts.

Are you freaking kidding me Nature? And you promote this like we're all cool now? We can trust their original allegation of 47/53 studies unreplicable?

These are the data that have turned ALL OF NIH UPSIDE DOWN WITH NEW POLICY FOR GRANT SUBMISSION!

Christ what a disaster.

I look forward to hearing from experts in the respective fields these three papers inhabit. I want to know how surprising it is to them that these forms of replication failure occurred. I want to know the quality of the replication attempts and the nature of the "failure"- was it actually failure or was it a failure to generalize in the way that would be necessary for a drug company's goals? Etc.

Oh and Amgen? I want to see the remaining 50 attempts, including the positive replications.
__

Begley CG, Ellis LM. Drug development: Raise standards for preclinical cancer research. Nature. 2012 Mar 28;483(7391):531-3. doi: 10.1038/483531a.

21 responses so far

What is a scientific "observation"?

Reference to this https://t.co/hc9YYH8Myr popped up on the Twitter recently.
So what constitutes an "observation" to you?

To me, I think I'd need the usual minimum group size, say N=8, and at least two conditions or treatments to compare to each other. This could be either a between-groups or within-subject design.

22 responses so far

Only suckers pay attention to journal length limits

I can't believe I have never blogged this issue.

Obeying the alleged word or character limits for initial submission is for suckers. It puts you at a disadvantage if you shrink down your methods or figure count and the other group isn't doing that.

38 responses so far

British Journal of Pharmacology issues new experimental design standards

Dec 23 2015 Published under Conduct of Science, Replication, ReplicationCrisis

The BJP has decided to require that manuscripts submitted for publication adhere to certain experimental design standards. The formulation can be found in Curtis et al., 2015.

Curtis MJ, Bond RA, Spina D, Ahluwalia A, Alexander SP, Giembycz MA, Gilchrist A, Hoyer D, Insel PA, Izzo AA, Lawrence AJ, MacEwan DJ, Moon LD, Wonnacott S, Weston AH, McGrath JC. Experimental design and analysis and their reporting: new guidance for publication in BJP. Br J Pharmacol. 2015 Jul;172(14):3461-71. doi: 10.1111/bph.12856 [PubMed]

Some of this continues the "huh?" response of this behavioral pharmacologist who publishes in a fair number of similar journals. In other words, YHN is astonished this stuff is not just a default part of the editorial decision making at BJP in the first place. The items that jump out at me include the following (paraphrased):

2. You should shoot for a group size of N=5 or above and if you have fewer you need to do some explaining.
3. Groups less than 20 should be of equal size and if there is variation from equal sample sizes this needs to be explained. Particularly for exclusions or unintended loss of subjects.
4. Subjects should be randomized to groups and treatment order should be randomized.
6.-8. Normalization and transformation should be well justified and follow acceptable practices (e.g., you can't compare a treatment group to the normalization control that now has no variance because of this process).
9. Don't confuse analytical replicates with experimental replicates in conducting analysis.

Again, these are the "no duh!" issues in my world. Sticky peer review issues quite often revolve around people trying to get away with violating one or other of these things. At the very least reviewers want justification in the paper, which is a constant theme in these BJP principles.

The first item is a pain in the butt but not much more than make-work.

1. Experimental design should be subjected to ‘a priori power analysis’....latter requires an a priori sample size calculation that should be included in Methods and should include alpha, power and effect size.

Of course, the trouble with power analysis is that it depends intimately on the source of your estimates for effect size- generally pilot or prior experiments. But you can select basically whatever you want as your assumption of effect size to demonstrate a range of sample sizes as acceptable. Also, you can select whatever level of power you like, within reasonable bounds along the continuum from "Good" to "Overwhelming". I don't think there are very clear and consistent guidelines here.
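To make that elasticity concrete, here is a minimal sketch (mine, not anything from the BJP guidance) of an a priori sample size calculation for a simple two-group comparison. It assumes the textbook normal approximation, n per group = 2 * ((z_alpha + z_power) / d)^2, where d is the standardized effect size. The function name `n_per_group` and the effect sizes fed to it are hypothetical choices for illustration; an exact t-based calculation would give slightly larger numbers.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed per group for a two-sided, two-group comparison.

    Uses the normal approximation, not an exact t-based calculation.
    effect_size is the assumed standardized difference (Cohen's d).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = std_normal.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect size assumptions, e.g. from different pilot experiments.
for d in (0.5, 1.0, 1.5, 2.0):
    for power in (0.80, 0.95):
        n = n_per_group(d, power=power)
        print(f"assumed d={d:.1f}, power={power:.2f}: n per group = {n}")
```

Under this approximation, at alpha = 0.05 and 80% power, assuming d = 0.5 calls for 63 per group while assuming d = 2.0 calls for 4. Same formula, same alpha, wildly different "justified" group sizes depending on which effect size estimate you decide to believe.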

The fifth one is also going to be tricky, in my view.

5. Assignment of subjects/preparations to groups, data recording and data analysis should be blinded to the operator and analyst unless a valid scientific justification is provided for not doing so. If it is impossible to blind the operator, for technical reasons, the data analysis can and should be blinded.

I just don't see how this is practical with a limited number of people running experiments in a laboratory. There are places this is acutely important- such as when human judgement/scoring measures are the essential data. Sure. And we could all do with a reminder to blind a little more and a little more completely. But this has disaster written all over it. Some peers doing essentially the same assay are going to disagree over what is necessary and "impossible" and what is valid scientific justification.

The next one is a big win for YHN. I endorse this. I find the practice of reporting any p value other than your lowest threshold to be intellectually dishonest*.


10. When comparing groups, a level of probability (P) deemed to constitute the threshold for statistical significance should be defined in Methods, and not varied later in Results (by presentation of multiple levels of significance). Thus, ordinarily P < 0.05 should be used throughout a paper to denote statistically significant differences between groups.

I'm going to be very interested to see how the community of BJP accepts* this.

Finally, a curiosity.

11. After analysis of variance post hoc tests may be run only if F achieves the necessary level of statistical significance (i.e. P < 0.05) and there is no significant variance in homogeneity.

People run post-hocs after a failure to find a significant main effect on the ANOVA? Seriously? Or are we talking about whether one should run all possible comparison post-hocs in the absence of an interaction? (seriously, when is the last time you saw a marginal-mean post-hoc used?) And isn't this just going to herald the return of the pre-planned comparison strategy**?

Anyway I guess I'm saying Kudos to BJP for putting down their marker on these design and reporting issues. Sure I thought many of these were already the necessary standards. But clearly there are a lot of people skirting around many of these in publications, specifically in BJP***. This new requirement will stiffen the spine of reviewers and editors alike.

__
*N.b. I gave up my personal jihad on this many years ago after getting exactly zero traction in my scientific community. I.e., I had constant fights with reviewers over why my p values were all "suspiciously" p<0.05 and no backup from editors when I tried to slip this concept into reviews.

**I think this is possibly a good thing.

***A little birdy who should know claimed that at least one AE resigned or was booted because they were not down with all of these new requirements.

39 responses so far
