In today’s edition of “I look up something on Google Scholar, read a highly cited paper in a good journal, and it sucks”: Bolla et al., “Dose-related neurocognitive effects of marijuana use”
- n = 22
- binned arbitrarily (?) into three groups of sizes 7, 8 and 7
- the mean and std dev of the sample are reported, but no histograms, so there’s no way to tell whether the binning was at all natural
- researchers are trying to look at the effects of marijuana use on cognition, but this is confounded because the people in their sample who used more marijuana had lower IQs; to deal with this, they regress all 35 of their cognitive tests on measured IQ, and subtract out the effect of IQ
- that is, they do an IQ test, and then they do a bunch of other cognitive tests which are presumably correlated with IQ (some of them are literally taken from IQ test batteries other than the one they used)
- so their variables are cognitive tests, controlled for IQ – which is itself the sum of a bunch of other cognitive tests
- no principled distinction (as far as I can tell) between the “IQ” cognitive tests and the other ones, e.g. they note approvingly that their IQ battery is correlated with the WAIS-R (r=0.79), then include a test from the WAIS-R among their “other” / “non-IQ” tests
- no controlling for multiple comparisons
- that is, they plugged 3 (arbitrary) tiny groups into an ANOVA with 35 dependent variables, and judged comparisons significant each time they had (uncorrected) p < .05 on a post-hoc t-test
- they found 14 significant comparisons (out of 105 total, 3 pairs times 35 variables)
- there may be statistical reasons that it’s not just 105 raw comparisons, I’m not sure, but in any case, it’s hard to say this wouldn’t happen by chance when we’re so far from asymptotics (we’re comparing groups of sizes 7, 8 and 7)
- most of the significant results were .01 < p < .05 (they marked p < .01 separately)
- the authors also explored nonlinear effects, finding some significant ones (with no report of how many tests were done in total)
- the authors include two figures (“A” and “B”) to illustrate how IQ and marijuana consumption interact; these have some really weird features which are presumably due to the small sample size
- like in figure A (Repetition of Numbers Task), for the higher IQ group, performance goes up and then down (it’s best at the “medium” consumption level)
- while in figure B (Stroop test), the higher IQ group does monotonically better with increasing consumption, with the amusing result that if you take the plot literally, the way to do best on a Stroop test is to have a high IQ and also smoke 94 joints per week
- wait what does that even mean though, like are these people literally hand-rolling more than 90 individual marijuana cigarettes every single week of their lives??
- like I’m assuming “joints / wk” is an established technical measure that they can convert to, if people aren’t smoking literal joints, right?
- the authors assessed this quantity with a questionnaire called the “DUSQ,” citing a text called “Addictive drug survey manual” by S.S. Smith, which seems to exist only in citations: my university library doesn’t have it, and neither does Google Books
- trying to look up the “joints / wk” or “joints / day” concept in the literature leads to gems like this paper from 2011, which tried to empirically determine the conversion rates between joints and other marijuana consumption units, and found they were wildly different from the ones assumed (on what basis, one wonders) in another standard questionnaire (not the one used in the paper under discussion, which, again, is inaccessible to mere mortals)
- I give up
- paper has been cited 534 times
- paper was published in Neurology, which from a cursory glance appears to be the premier journal for, well, neurology
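To make the residualization complaint above concrete, here’s a toy simulation (invented numbers, not the paper’s data): a latent general ability drives both an “IQ” composite and a “non-IQ” test, with the two correlated at roughly the r = 0.79 the authors report for their battery against the WAIS-R. Residualizing the test on IQ, as the paper does, strips out most of the test’s variance, and whatever survives is precisely the part not shared with IQ, which is a strange thing to still call “the cognitive test”:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 22  # same sample size as the paper

# Hypothetical setup: a latent general ability g drives both the "IQ"
# composite and the supposedly separate cognitive test (correlation ~0.8,
# in the ballpark of the WAIS-R correlation the authors report).
g = rng.normal(size=n)
iq = g + 0.5 * rng.normal(size=n)          # IQ composite
other_test = g + 0.5 * rng.normal(size=n)  # a "non-IQ" cognitive test

# Residualize the test on IQ (regress and subtract the fitted values),
# which is the paper's adjustment:
slope, intercept = np.polyfit(iq, other_test, 1)
residual = other_test - (slope * iq + intercept)

# Most of the test's variance disappears after the adjustment; the
# leftover "residual test score" is only the component NOT shared with IQ.
print("variance before:", np.var(other_test))
print("variance after: ", np.var(residual))
```

With tests this correlated, the adjustment doesn’t cleanly separate “marijuana’s effect on cognition” from “marijuana users had lower IQ”; it mostly just shrinks the signal.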
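And to put numbers on the multiple-comparisons point: here’s a quick null simulation of the paper’s comparison structure, 35 outcome variables, groups of 7, 8 and 7, all three pairwise pooled t-tests, uncorrected p < .05. I’m assuming independent normal variables for simplicity (the real tests are correlated, which doesn’t change the expected count but does make spurious hits arrive in clusters):

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = (7, 8, 7)  # the paper's three consumption groups
n_vars = 35        # number of cognitive test variables

# Two-tailed t critical values at alpha = .05 for the two relevant
# degrees of freedom (7+7-2 = 12 and 7+8-2 = 13).
t_crit = {12: 2.179, 13: 2.160}

def pooled_t(a, b):
    """Standard two-sample pooled-variance t statistic."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))

def count_false_positives():
    """One simulated study under the null: no group differences at all."""
    hits = 0
    for _ in range(n_vars):
        groups = [rng.normal(size=s) for s in sizes]
        # all three pairwise post-hoc t-tests, uncorrected
        for i in range(3):
            for j in range(i + 1, 3):
                a, b = groups[i], groups[j]
                df = len(a) + len(b) - 2
                if abs(pooled_t(a, b)) > t_crit[df]:
                    hits += 1
    return hits

results = [count_false_positives() for _ in range(1000)]
# Theory says 105 tests * 0.05 = 5.25 expected false positives per study.
print("mean significant comparisons per null study:", np.mean(results))
```

So even with no real effects at all, this design hands you about five “significant” comparisons per study. Fourteen is more than five, sure, but with groups this small, correlated outcomes, and no correction, it’s not the slam dunk the citation count suggests.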



