There is no "filter problem" in science


It is your job as a scientist to read the literature, keep abreast of findings of interest and integrate this knowledge with your own work.

We have amazing tools for doing so that were not available in times past; everything gets fantastically better all the time.

If you are a PI you even have minions to help you! And colleagues! And manuscripts and grants to review which catch you up.

So I ask you, people who spout off about the "filter" problem.....

What IS the nature of this problem? How does it affect your working day?

Since most of you deploy this in the context of wanting fewer papers to be published, is fewer better? What is supposed to disappear from your view?

The stuff that you happen not to be interested in?

32 responses so far

  • I've never even heard of this "filter" problem before now. What are you referring to?

  • potnia theron says:

    I want every jackass who says "there is a filter problem" to spend 1 month (30 days) doing it old style. Pick a new area/topic. No electronic searching. Paper journals. No PubMed. No Endnote. No PDFs. Find the articles in the current journals, walk yourself backwards and forwards using xeroxed copies of the literature cited. Then come talk to me about a fucking filter problem.

  • Jim Woodgett says:

    I think the volume of published material and the supposed filter issue are distinct. Agree that there is no filter or "missed critical finding" problem. The tools for searching and aggregating information today are way ahead of the days of library physical searches (I used to read the TOCs of at least 30 journals a week in the library as a grad student then queued up at the crappy photocopier - I still have filing cabinets full of the A4 sized product that I have lugged over three continents). Today, most of the relevant work comes to you at your desk! If you can't work out how to exploit the tools available for sorting the data flow, stop and ask someone. It is true that you can set your filters too narrowly and miss something that is off-field. But that was true 40 years ago and, again, there are tools to help with discovery (F1000, Twitter, etc).

    I do think we also tend to over publish simply because it is the currency of scientific progression. There is a ton of stuff published that falls right within my interest which I think adds little new. Is that a problem? No. It doesn't harm anyone. The information our retinas process every minute we're awake must be in the terabyte range. We are professional skimmers. Choosing what to pay attention to is a skill that we learn very early on. Suggesting there is too much information to deal with is like saying we are drowning in oxygen.

  • Comradde PhysioProffe says:

    Oh, this shittio? How fucken hard is it to skim the TOCs of the dozen or so journals that publish everything worth looking at in your field?

  • potnia theron says:

    hand in hand with the "filter problem" is the "there is too much bad science out there" problem. and we're back to who gets to choose what is good science?

  • DrugMonkey says:

    I know PP, but people trundle it out like it is a thing. I don't get it.

  • pinus says:

    i am with cPP here. if you can't skim 20 TOCs, you need to quit; working folks need your resources.

  • rxnm says:

    This isn't hard...everyone who says there is a filtering problem has a filter to sell you.

    Everyone who says anything else idiotic about publishing probably works for SSP.

  • Comradde PhysioProffe says:

    For you neuroscientists, just keep track of Cell, Nature, Science, Neuron, Nature Neuroscience, PLOS Biology, ELife, PNAS, Current Biology, and Journal of Neuroscience, and you're all set. Is that really too much to ask?

  • Busy says:

    "there is too much bad science out there"

    I claim that the opposite is true. I want every single study ever started available somewhere out there, waiting for the time when a user launches a search for exactly that study and can learn from whatever mistakes and conclusions come from that flawed, incomplete study.

    Back in the days of print publications, the study would have been "pushed" onto the user, and hence the filter process was a real issue. In this day of Google/PubMed searches we can make it so that these types of results are invisible to the average searcher and only appear as the result of highly targeted searches.

  • James Fraser says:

    I agree completely that the "filter" problem is by and large imaginary. I got in this argument with some friends of mine who started Pubchase to solve this problem. While I enjoy the recommendation engine they have built and use it to supplement my own reading, I think learning how to filter the literature and when to STOP reading a paper are essential skills for scientists. I have elaborated on my workflow elsewhere.

  • drugmonkey says:

    That is the clumsiest troll ever, CPP....

  • becca says:

    WTF is "Elife"? #OneOfTheseThingsJustDoesntBelong

  • AcademicLurker says:

    Cell, Nature, Science, Neuron, Nature Neuroscience, PLOS Biology, ELife, PNAS, Current Biology, and Journal of Neuroscience, and you're all set.

    Don't forget Annals of Platypus Physiology B...

  • AcademicLurker says:

    More seriously, I'd never heard of Elife until about 3 months ago, but several big shots in my (not neuroscience) field seem to like it. It looks like it's only been around since 2012. Do people think highly of it?

  • drugmonkey says:

    I claim that the opposite is true. I want every single study ever started available somewhere out there, waiting for the time when a user launches a search for exactly that study and can learn from whatever mistakes and conclusions come from that flawed, incomplete study.

    I agree that the much larger problem is not being able to access knowledge that has been generated that might help your own work, but was never published for various reasons.

  • The problem depends on the kind of research and how narrow the specialization. When I started my Ph.D. studies in 1959 it was the old-fashioned way, in a comparatively narrow field of chemistry. I then worked as an Information Scientist (still no computers), providing an information searching and indexing service for an international company in the veterinary field. I later moved to Computer Science research. Since retiring, I have been interested in how the brain processes information.

    If you are working in a comparatively well-defined field and digging in a narrow specialist area, the current online facilities are very good, and infinitely better than anything I had available 50 years ago. However, if you need to stand back and get an overview covering several different disciplines, you run into more and more problems the further you stand back from the detail, as different disciplines use language in different specialist ways.

    One of the pressing modern interdisciplinary research problems is how information is processed by the human brain. Effectively there is no published model which at one end is compatible with what we know of the neurons and at the other explains the way we look at the works of Shakespeare, with educational theory, psychology, philosophy, mental illness, evolutionary pathways, the appeal of religion, etc., all having possible relevance. Such research requires "out of the box" thinking, and there is no adequate tool which lets you ask relevant questions about novel interdisciplinary links. The problem is that (1) you are not sure what terminology the authors of any similar "out of the box" research will have used, (2) the field is full of wild speculation unsupported by good evidence, which you don't really want but which the searches turn up, and (3) there are millions of very specialised papers, and statistically a lot of these will contain the search terms you are using.

    I am currently trying to stand as far back from the details of the problem as I can, to allow for the fact that we, as humans, are not objective observers of "human intelligence". Where the current facilities help is when you get a lead, as the web allows you to contact authors, follow up links, etc., although paywalls can be a problem. The problem is in finding the right leads. (If you doubt me, try to construct a set of search terms which will reliably come up with the answer.)

  • Dr. Noncoding Arenay says:

    "I'd never heard of Elife until about 3 months ago, but several big shots in my (not neuroscience) field seem to like it. It looks like it's only been around since 2012. Do people think highly of it?"

    I started looking at Elife 2013 onwards when several big shots whose work I keep track of started publishing in it. It struck me as odd that so many C/N/S type PIs were publishing in this year-old journal. When I Googled it I saw that it was developed by HHMI, Max Planck and Wellcome Trust. That's when I realized that the goal was to get Elife to rocket off the ground when it came time to calculate its first IF and cast it straight into the glamor club. And to support open access of course!

  • Pinko Punko says:

    ELife is Uncle Howard's dump journal. Lots of good stuff in there, but it likely has a skewed bar because a chunk of the labs that pub there have basically unlimited resources.

  • Beaker says:

    The "filter problem" is evolving rapidly. The classical model for navigating the literature is the one described by Potnia: read the TOCs (or Current Contents), then get thee to the library and photocopy. The smell of toner mixed with moldy journals; oh, the memories. My old file cabinets are contributing to the mold spore problem.

    The current problem is one of being able to distinguish chaff from wheat. At the moment, CPP's approach still works. In the future, a time will come when the democracy of information will dictate that all sources receive equal consideration. For now, one can still ignore the scores of crap Chinese and Indian journals and not miss much, but will this be true in 20 years?

    In the future, data will never be the limiting factor. The challenge will be to filter, integrate, and synthesize faster than your competitors. The requirement for experimental validation still stands, and this remains the ultimate limiting step for scientific progress.

  • Dr. Noncoding Arenay says:

    @Beaker - I don't see why that will change in 20 years. Crappy journals will stay crappy and no one will bother sifting, filtering or integrating their contents. Yes, a handful of new or newish journals (like ELife) may gain prominence over the decades and then we'll add them to the watch list. In the end, I think CPP's approach will always be the way to go.

  • Beaker says:

    Dr NA--your view is shortsighted. In 1980, Brain Research was king of the hill in neuroscience. Today, not so much. If your point is simply that there will always be a handful of glamour journals, I still disagree. In a world of unlimited data and sophisticated search tools, there is no inherent reason why glamour journals must remain glamorous. Filtering will become so easy and powerful that, at the very least, each paper will be judged on its merit, and ignoring certain studies based on their source will be foolhardy. The current glamour system is, at the end of the day, just a filter mechanism. With today's technology, we can do better.

  • DrugMonkey says:

    Poor Brain Research.....

  • Dave says:

    eLife is the journal of choice for hipster scientists.

  • Dr. Noncoding Arenay says:

    @Beaker - I am not talking about glam journals. I am talking about respectable journals, whatever they may be in one's field.

    I haven't been around as long as you have to observe Brain Research's downward trajectory, but I feel that it might be an outlier. It may have been king of neuroscience before neuroscience research took off like it has in the past couple of decades. It is then natural that competition will be fierce among journals to attract the vast array of data out there. Whatever the reasons that caused Brain Research to stumble in this competition, something else that was able to attract more rigorous science took its place. If that happens to existing journals down the line, then sure, amend your list to reflect that.

  • Grumble says:

    There is not a "filter problem". There is a "drinking from a fire hose" problem.

    I have no trouble identifying and collecting papers that are interesting and relevant to my field, and ignoring those that are unlikely to be of interest. I do have a problem with finding the time to read the enormous and ever-growing "I should read this" stack. And I have even less time to sit back and think about the various things I've read, trying to put them together.

  • Grumble says:

    "For you neuroscientists, just keep track of Cell, Nature, Science, Neuron, Nature Neuroscience, PLOS Biology, ELife, PNAS, Current Biology, and Journal of Neuroscience, and you're all set. Is that really too much to ask?"

    No, it's too little to ask. You need to add all the specialty journals relevant to your interests. A lot of good stuff gets published there, even if it didn't meet the glam quotient.

  • dsks says:

    From what I've read about ELife's cozy relationship with PMC, the former is essentially acting as a "front" that allows the latter to be a publisher without actually being a publisher.

    Not that I necessarily have a problem with that. In fact, I think it might be more efficient if more ELife-like outfits sprouted up that basically focused on editorial decisions and organizing peer review, and then simply sent the result straight to PMC and were done with it. On the other hand, it's rather absurd for ELife to be trying to maintain an impact filter (it essentially wants to replace CNS, from what I can gather) when it's basically sending all its pubs straight into the PMC blender with everything else anyway, which sort of reveals that it is still the brainchild of folk who can't quite allow themselves to let go of such vain nonsense.

  • toto says:

    Come on now CPP, you know full well that every single neuroscientist in here also browses the entire abstract list of Eur J Neurosci, Prog Brain Res, Brain, Neuroscience (without J), Biocybernetics, Neural Networks, and the like.

    Not to mention the Tidal Wave of Wisdom that is PLoS ONE, which is apparently what some Glam-Haterz would like to turn the entire published literature into.


  • toto says:

    OK, more seriously.

    We all know that there is often a "discrepancy" between the face value of a paper's claims and what the experiments actually prove. We also all know that such a discrepancy will usually not prevent publication; everything gets published, eventually, somewhere.

    I'm a computational guy (AKA "Mountain Dew Chugger" [tm]). For computational stuff, I can read between the lines and mentally correct the hyperbole, or spot the Missing Control, or whatever. Hell, I do it in peer review.

    But for the wet stuff, having never held a micropipette in my life, I just have to rely on the reviewers (i.e. some of you guys) to spot the Obvious Flaws for me. And I know that editors will be much more twitchy on the "reject" button, AOTBE, in the double-digit IF world than in the below-one IF world.

    So for me the "filter problem" means this: when I read the latest piece of awesome from (say) the Scanziani lab in (say) last week's Nature, I am more confident incorporating it into my totally-cutting-edge-modelz than I would be if I had read it in the Patagonian Journal of Medical Hypotheses.

    And as a side note, when I submit my red-hot papers, I know that the reviewers will be much happier to read supporting citations from Nat/Sci/NatNeur/Neuron than from Comptes Rendus du College de Pataphysique. Which I think is eminently reasonable.

    This is my opinion and I approve of it.

  • Beaker says:

    @Dr NA, after further consideration of your points, I have changed my opinion. Consider the following example:

    I still get much of my news from the NY Times, despite the incredible proliferation of alternative news sources. There are not enough hours in a day to trawl through all of the online news sources and locate the occasional "undiscovered gem" news article. NYT publishes some crap, but not that much. I accept that when that occurs, I can click away from the article and move on.

    The same system seems to operate with the glamour journals. One difference is that it is more difficult to spot a "snow job" article in a glamour journal unless you are very close to the topic. I came across one of those just last week in my field, but I am certain that 90% of the readership would not be able to call bullshit on this paper. There is too much information to process to be able to filter it all critically.

    My more meta point is that things needn't be this way. A utopian view is that each paper is considered equally, based on merit (quality of data, analysis, and rhetoric). That would be a pure scientific meritocracy. Seems like technology can help with this. On the other hand, science is a human endeavor. Your special snowflake paper might not get the recognition it deserves because politics, bias, and -isms are wrapped up in the vetting process. This fact is guaranteed for all time. Perhaps a "hit list" of 10-20 top journals is the best we can do.
