Feb 10 2012 Published by drugmonkey under Day in the life of DrugMonkey
Do you cite/list the statistics package you use for analysis down to the version in every paper?
The reason I ask can be found here.
24 responses so far
Yes. A package had an embarrassing glitch when I was a fellow, so I have generally included that info.
Not for stats, yes for simulations. Simulations also involve making code available.
I recall some issues with SPSS regarding a default setting that was counter-intuitive and might have led to some misreported stats. I don't recall the details; my SPSS-hatin' stats teacher always mentioned it. She managed to convert me to R eventually.
OMG. *checks to make sure I have the recent version...phew*
Yikes! And Yes.
Yes. For exactly that reason.
Sorta. We use OriginPro8, which we state, but I don't state that it's 188.8.131.528
hey, thanks for the tip on OriginPro....think I'll check that out. always looking for convenient, user friendly stat packages...
OriginPro is not very sophisticated for statistical analysis, but it is a pretty good graphing dealio.
I don't need "sophisticated". I need a few key analyses and after that...simple is better.
In all honesty, I'm an order of magnitude or two more likely to make my own error using a package than to stumble over an incorrect calculation hidden within the package. So my defensive measures have to help me track down my own errors before I worry too hard about protecting myself against bad vendor software.
So, I write using R and Sweave. Every P-value in the paper is directly traceable back to raw data and the command that was used to calculate it. Incidentally, this also gives insurance against the kind of vendor bugs cited above.
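For readers who don't use R, here is a minimal sketch of the same idea in Python rather than R and Sweave. The data values and the names `control`, `treated` and `exact_permutation_p` are made up for illustration, and the exact permutation test is just a stand-in for whatever test a paper actually uses. The point is only that the reported P-value is computed from the raw numbers in the same script that prints it, along with the interpreter version.

```python
import itertools
import platform
import statistics

# Hypothetical raw data for illustration only; a real analysis would
# load these from the archived data file that accompanies the paper.
control = [4.1, 5.0, 4.7, 5.3]
treated = [6.2, 5.9, 6.8, 6.1]


def exact_permutation_p(a, b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = a + b
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    extreme = 0
    total = 0
    # Enumerate every way of assigning len(a) of the pooled values to group A.
    for idx in itertools.combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
        # Small tolerance so floating-point noise does not drop ties.
        if diff >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


p_value = exact_permutation_p(control, treated)
# The sentence that goes into the paper is generated from the computation,
# together with the version of the software that produced it.
report = (
    f"p = {p_value:.4f} (exact permutation test, "
    f"Python {platform.python_version()})"
)
print(report)
```

With these toy numbers the two groups don't overlap at all, so only 2 of the 70 possible splits are as extreme as the observed one. Rerunning the script regenerates the reported sentence from the data, which is the traceability described above.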
Yes, and thanks for the post, as I use prism quite a bit......
Yes - I cite the product and version.
(I tried to answer earlier, but I guess the interweb ate it.)
Used to be common to cite what kind of computer you used for the analysis, too - remember the Pentium floating point processor bug?
That said, just to be contrary to what seems to be the prevailing opinion, NO, I don't make a point of citing the stats software. First, usually nobody gives a flying fuck. Second, this kind of error is really uncommon. Third, if I draw an incorrect conclusion because of this kind of error, is citing the software package somehow going to save my ass? And fourth, if my conclusions were correct despite such an error, some perseverative persnickety dumbshit might still question whether the results are genuine. I don't need idiot grant reviewers to have YET ANOTHER moronic excuse for questioning the entire foundation on which my proposed research is built.
This is why I don't use stats packages - R is the way to go.
Another problem with GUI stats packages is that you can't really be sure you can replicate your analysis in the future. Unless you copy down all the options you've checked off, and the exact setup you used, you'll never really know what you did. In contrast, SAS is a command-line program, so you have an audit trail of your work.
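A rough sketch of what such an audit trail could look like if you script your analysis, shown here in Python rather than SAS. The function name `audit_record` and the option names are hypothetical, not taken from any particular package. The idea is simply to write down every setting next to the software version, instead of trusting your memory of which boxes were ticked.

```python
import json
import platform


def audit_record(analysis_name, options):
    """Bundle the analysis settings with the interpreter version."""
    return {
        "analysis": analysis_name,
        "options": dict(options),
        "python_version": platform.python_version(),
    }


# Hypothetical settings, standing in for the GUI options you would
# otherwise have to copy down by hand.
record = audit_record(
    "two_sample_comparison",
    {"two_sided": True, "equal_variance": False, "alpha": 0.05},
)

# Sorted keys keep the log stable so two runs can be compared with a diff.
audit_json = json.dumps(record, indent=2, sort_keys=True)
print(audit_json)
```

Saving that output alongside the results gives a record that can be checked when the analysis has to be repeated later.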
Yes, but I'm an epidemiologist and we care about that shit a lot.
Also, I understand why people use R/S and SAS, but I love Stata and will never break up with it, because it's a stats-package that can be as transparent as you want it to be. It's not that graphically complex, but I almost never need complex figures.
If you're doing things that involve potential legal liability, it's a good idea to take a close look at the licenses on the software. Most programs that provide a paid license make no promises as to the accuracy of their results, or at the very least talk about not being used for air traffic control, nuclear power plants, medical technology, and their ilk. (I leave out open-source software, since a disclaimer of fitness for any purpose is almost universal there). The point of this being that unless things have changed lately, SAS doesn't come with this disclaimer of liability. They stand behind the mathematical correctness of their software, which is one of the reasons that it remains so widely used in the pharmaceutical industry. It's expensive as s**t if you aren't an academic on a good site license agreement, but if you want to be morally certain that you are getting the right numbers out, SAS is the way to go.
MG: And you know that how?
It's not the package or version that is the problem, it's that folks describe the model and statistical test (and cough up the actual data) like my ass chews gum. If the reader can't understand the important part, who cares about package or version numbers?
"I collected some data and used a package. I hope I did it right."
Yes, and I use SAS.
Of course, there was a study done a few years back that analyzed hundreds of statistical analyses in hundreds of papers and came up with a very scary statistic of its own ... that a large number of papers use the wrong statistical analysis ... which renders the whole discussion above moot (if true).
DrugMonkey is an NIH-funded researcher who blogs about careerism in science. And occasionally about the science of drug use.