"It's better to light a candle than curse the darkness"

A Layperson’s Guide to the Scientific Literature – Part 2

June 2nd, 2008

Results:

Moving right along, we come to the “Results” part of the article. This is where the rubber meets the road. If the paper doesn’t have what it takes in this section, it’s just a bunch of words.

Now, you’d think that this section would be pretty straightforward. After all, it’s just a matter of putting the data down on paper... right? Well, part of the challenge is showing the data in a way that makes it easy for readers to see the point(s) the authors will be making later without being deceptive. That’s a pretty fine line the authors need to walk.

Some can’t do it.

Some don’t even try.

A good results section doesn’t simply give a bare table of numbers but it also doesn’t try to hide the data behind flashy graphs or deceptive statistical analyses. The purpose of the results section – in a literary sense – is to “set the stage” for the subsequent “Conclusions/Discussion” section(s). If it’s done well, the reader will already know what the conclusions are when they read the next section.

When you’re reading the results, look for the data. Somewhere in this section, there should be the actual data. If the study measured mercury levels or counted shrews or weighed tortillas, there should be a table or graph somewhere in the section that shows that data. In studies of humans or other natural populations (“natural” being a relative term), there should also be a table or chart comparing the “demographics” of the two samples – age, sex, weight or other variables that are relevant to the comparison being made.

The first thing to look for is the demographic information of the two (or more) samples. The samples should have close to the same mean (“average”) age, the same proportion of males and females, etc. In human studies, economic status and race or ethnic background may also be highly relevant. The groups should also have roughly the same number of members, although it is often acceptable to have a larger number in the “control” group.

Inspecting the group demographics can be very revealing. A mismatch between control and study groups can potentially invalidate an entire study. Check it first.
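For readers who like to see the arithmetic, here is a minimal sketch of one way to check whether two groups are matched on a variable such as age. It computes a standardized difference (the gap between the group means, expressed in units of their pooled standard deviation). The function name and the ages are hypothetical and made up for illustration; they do not come from any real study.

```python
from statistics import mean, stdev


def standardized_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = ((stdev(group_a) ** 2 + stdev(group_b) ** 2) / 2) ** 0.5
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical ages (in years), invented purely for illustration.
study_ages = [4, 5, 5, 6, 7, 8]
control_ages = [9, 10, 10, 11, 12, 13]

gap = standardized_difference(study_ages, control_ages)
print(f"Standardized age difference: {gap:.2f}")
```

A value near zero suggests the groups are well matched on that variable. A value far from zero, as in this invented example, is the kind of mismatch that should make you suspicious of any comparison between the groups.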

Next, look at the study results. There are a couple of questions you should ask the data – and it should be able to answer them:

[1] Is there a difference?

[2] If there is a difference, is it statistically significant? The authors should provide this information.

[3] If there is a difference - or if there is not a difference - is that relevant? In other words, does it pass the “So what?” test?

[4] What do the data tell YOU? Before you read the authors’ conclusions, make some of your own. (This is why I suggested that you NOT read the abstract).

The last one is especially important. One of the most common errors I see in articles – especially those by authors who have an emotional investment in a particular conclusion – is to make conclusions that are not supported by the data. Most of the time, this consists of going beyond what the data can support. Occasionally, the conclusions aren’t even related to the data.

Make a note of your conclusions and compare them to the authors’ conclusions. If you can’t come to any conclusions about the data, make a note of that, too. If you can’t come to a conclusion about the data, you may have a hard time evaluating whether the authors’ conclusions are reasonable, and you need to know that.
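Questions [1] and [2] above can be sketched in a few lines of code. This is only one of several ways to test for significance (a permutation test, chosen here because it needs no statistical tables), and it is not necessarily what any given paper uses. The function name and the measurements are hypothetical, invented for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the fraction of random regroupings whose difference in means
    is at least as large as the one actually observed.
    """
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Hypothetical measurements, invented purely for illustration.
study = [1.2, 0.9, 1.1, 1.4, 1.0, 1.3]
control = [2.1, 1.9, 2.4, 2.0, 2.2, 2.3]

p = permutation_p_value(study, control)
print(f"Difference in means: {sum(control) / 6 - sum(study) / 6:.2f}")
print(f"Approximate p-value: {p:.4f}")
```

A small p-value says the difference would rarely arise by chance alone. It says nothing about question [3]: a difference can be statistically significant and still fail the “So what?” test.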

Conclusion/Discussion:

In some papers – and in some journals – there are separate “Conclusion” and “Discussion” sections. In others, there is either one or the other. The differences are subtle (some might interpret that as “non-existent”) and – in the end – not particularly important.

This is where the authors explain what the data mean to them. And this is where you compare your conclusions to theirs. If you disagree with the authors, it doesn’t necessarily mean that they are wrong (or that you are wrong) – that’s just more data for you to consider.

READ the conclusions and/or discussion sections. Then read them again. A study that is well-designed and well-executed needs only a very simple exposition of the conclusions. In the best studies, the conclusions are anticlimactic because the data make them self-evident. In most studies, however, the conclusions need a little explaining.

While you’re reading the conclusions, look for what I call the deus ex machina ploy. In fictional literature, this means resolving a difficult plot situation by introducing an element that is unrelated to the rest of the story - “Suddenly her great-uncle’s attorney arrived with the news that she was now a wealthy heiress,” or things along that line. In the scientific literature, it means pretty much the same thing.

In scientific literature, every piece of data used in the article needs to be documented. Data from other studies can be introduced to support the conclusion(s), but those studies need to be cited. When authors suddenly introduce “undocumented” information to make the conclusion they want (when their data don’t support that conclusion), they are using the deus ex machina.

A good example of this is the infamous Holmes et al (2003) “Baby Haircut” study. In this poorly-designed and poorly-executed study, the authors found that autistic children had less mercury in their hair (saved by their parents from their first haircut) than did non-autistic children. They then used a mixed bag of historical data (vaccinations, maternal amalgam dental restorations, fish consumption, etc.) to concoct a formula to “predict” what the hair mercury “should” have been. They then claimed that this “showed” that the autistic children had less mercury in their hair than they “should” have had.

Using the procedure outlined above, the “data” are:

[1] Autistic children had less mercury in their hair – at the time of their first haircut – than their non-autistic peers.

[2] The formula the authors derived from historical data did not predict the amount of mercury found in the autistic children’s hair.

Since the formula was derivative data – it was generated by using a combination of data of different reliability – it is only as reliable as the least reliable data used to generate it. Given the difficulty of recalling matters of diet many years after the fact, the formula is a highly questionable bit of “data”.

So, in the end, the only thing that could be said from this study was that the autistic children had less mercury in their hair than the non-autistic controls. There are a few possible conclusions that can reasonably be drawn from these data:

[1] The autistic children were exposed to less mercury than the control children.

This would result in the same findings, although it wouldn’t be nearly as interesting.

[2] Autism causes less mercury to be absorbed into the body.

Hair, by the way, is not a significant excretion pathway for mercury in humans - especially infant humans (hint: babies don’t have much hair - neither do adults, compared to other mammals). Technically, hair doesn’t excrete mercury at all - it simply absorbs it passively from the blood. More mercury in the blood = more mercury in the hair; less mercury in the blood = less mercury in the hair. This has been shown over and over in almost every fur-bearing species (including Homo sapiens).

[3] Mercury protects against autism.

This one seems pretty ridiculous, but there is no data showing that mercury - at low doses - doesn’t protect children from autism, just as there is no data showing that mercury at the same doses can cause autism. The “absence of evidence isn’t evidence of absence” sword cuts both ways. 

[4] The results were anomalous. 

It may have been a matter of random chance or a laboratory screw-up, but the results may have simply been invalid. Given the fact that the “control” group’s mean hair mercury level was over 16 times that found in a large national study, it seems reasonable to suspect a massive lab error.

Personally, I prefer option [4]. Again, since the control children’s hair mercury was over 16 times higher than that found in a large (huge, actually) national study less than a year later (McDowell et al, 2004), it seems like the simplest conclusion. Unless we postulate that the non-autistic children were eating nothing but tuna and swordfish prior to their first haircut, it’s hard to imagine how they would have ended up with so much mercury in their hair.

However, the authors didn’t pick any of those conclusions. Their conclusion was a classic deus ex machina. They claimed that the relatively lower hair mercury in the autistic children (it was actually twice the mean hair mercury McDowell et al found in children 1-5 years old) was a result of an impairment of mercury excretion.

Mind you, their study did not look at mercury excretion – nor could it, since they were looking at stored hair specimens that were at least a year old. They also had no citations of studies showing that autistic children had reduced mercury excretion. And they had no data or citations to support the idea that impaired excretion of mercury would lead to low hair mercury levels (more about that later). 

Apparently, they just made it up.

In fact, the idea doesn’t even make sense. For starters, hair doesn’t actively excrete mercury - the mercury moves passively from the blood to the hair - so there’s nothing there to impair. Secondly, poor excretion of mercury would lead to an elevated blood mercury level (which Holmes et al claimed was causing autism) and therefore elevated hair mercury levels (the opposite of what Holmes et al found in their data). This was shown in the Gundacker et al (2007) study, which looked at people who had a documented deficiency in their ability to excrete mercury and found that they had (surprise!) elevated hair mercury levels.

So, it would appear that Holmes et al, unable to reconcile their study’s findings with their preconceived idea that mercury causes autism, used the deus ex machina of “impaired mercury excretion” to get them out of trouble.

That’s not even a good practice in fiction writing.

Journal Quality/Peer review/Editorial pressures/Strategic authoring:

I don’t give a lot of weight to the various “quality” ratings of journals. Things like “impact factor” and circulation are often affected by things other than the quality of the articles published in them. The “reputation” of a journal is also an “iffy” thing (and often out of date) – journals with a good reputation can publish (and have published) absolute trash. If they publish enough trash, they become journals with bad reputations.

Another rather over-rated factor is peer-review. Certainly, a journal that is not peer reviewed should not be trusted (any more than you would trust your local newspaper), but there are many journals that have a peer-review process and yet churn out issue after issue filled with junk science and nonsense. Having a peer-review process and having an effective peer-review process are two different things.

Even journals with good, effective peer-review can slip up and let a bad article through. After all, peers are people, too. They have bad days and can miss things - they can also be blinded by their own biases and preconceptions. Editors are also people and have been known to pick sympathetic peers to review a paper they feel is important (for whatever reason) or to even over-ride the reviewers’ adverse recommendations and publish the paper anyway.

Finally, scientific journals are subject to many of the same pressures that affect all other periodicals. They are competing for readership - they need to have people (and libraries) buy subscriptions. They need readership because most of them sell advertising space and the advertisers pay according to how much “exposure” the journal gets. Clearly, anything the editors can do – within reason – to increase the number of people reading their journal is to their advantage.

Several editors have admitted to me that they have published questionable studies (and even fairly obvious baloney) because they knew those articles would stir up a response. And here’s where the “impact factor” formula comes into play: the more often articles in a journal are cited in other articles, the more that journal’s “impact factor” goes up. And a high “impact factor” not only carries prestige and bragging rights, it increases advertising revenue.

One way a journal can increase its “impact factor” is to publish research that makes significant contributions to the field; research that other authors will cite in their papers. This is the ideal way to do it. Another way is to publish research that other researchers will refute in their articles or cite as examples of bad science. Either way, the number of citations goes up and the journal’s “impact factor” rises.
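To make the arithmetic concrete, here is a minimal sketch using the commonly quoted two-year definition of the “impact factor”: citations received this year to items the journal published in the previous two years, divided by the number of citable items it published in those two years. The function name and all of the numbers are hypothetical, chosen only to show that a citation counts the same whether the citing paper praises the article or refutes it.

```python
def impact_factor(citations_this_year, citable_items_prior_two_years):
    """Two-year impact factor: citations divided by citable items."""
    if citable_items_prior_two_years <= 0:
        raise ValueError("The journal must have published at least one citable item.")
    return citations_this_year / citable_items_prior_two_years


# Hypothetical journal, numbers invented purely for illustration.
citable_items = 200        # articles published in the previous two years
approving_citations = 300  # citations from papers building on the work
refuting_citations = 100   # citations from papers refuting the work

without_controversy = impact_factor(approving_citations, citable_items)
with_controversy = impact_factor(
    approving_citations + refuting_citations, citable_items
)

print(f"Without the refuted articles' citations: {without_controversy}")
print(f"With them: {with_controversy}")
```

The formula has no way of telling the two kinds of citation apart, which is why publishing something that gets refuted a lot can pay off.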

This brings us to what I call “strategic authoring”. In academia, it truly is “publish or perish”. No matter how good a teacher you are, tenure committees want to see publications and grants – and granting agencies usually want to see a record of publications before they give out money. The end result is pressure - enormous pressure - to publish a lot of articles.

One result of this is what is known as the “least publishable unit” (LPU) type of publication. This is where the data of a single project are parsed out into a series of papers, each dribbling out the smallest amount of data (the LPU) that the editors will accept. This allows a researcher to generate several papers from a single project, upping the number of publications on their record and increasing their chance of getting tenure and grants.

Another tactic – not so widely used, fortunately – is to publish “controversial” findings that the author knows are either preliminary or of poor quality (often referred to as a “Brief Report” or “Preliminary Report”). Researchers who do this know that there is a good possibility that these findings will not be confirmed – their very own research may later overturn them.

The benefit of this strategy is two-fold: first, they get two publications with essentially the same data (a practice otherwise severely frowned upon) - a “quickie” publication with their “preliminary” data and a second publication when the project is finished, whether the final data confirm their “preliminary” findings or not. Secondly, their “citation index” will go up (increasing their chance for tenure and grants) because other people in the field will try to replicate or refute their “preliminary” findings. It’s a “win-win” situation for everybody but the people reading the scientific literature.

References:

Unless you really know the field, looking through the references may not be a lot of help. Still, you should at least give it a quick read. Things to look for are:

[1] References that are really just letters to the editor (not acceptable in most cases, but editors and reviewers often don’t have time to check every reference), news items published in scientific journals, abstracts of meeting presentations (or worse, testimony to Congress or Parliament) or (even worse) presentations to non-scientific groups (DAN! conferences, talks at the Rotary Club, etc.).

[Note: the journal Nature refers to most of the research articles it publishes as "letters" - this is not the same as "letters to the editor" (which Nature calls "correspondence") - there are other journals that follow this naming convention.]

[2] Numerous references that were written by one or more of the authors.

This is not necessarily a bad thing, especially if one of the authors has a long history in the field, but it’s something to look for. Some of the worst articles I’ve read - in a variety of fields - cited publications of the author(s) almost exclusively. It’s often a sign of crankery when authors can’t find anyone other than themselves who agree with their hypotheses.

[3] Numerous references written by the same small group of author(s), other than the authors of the paper itself.

Again, not necessarily a bad thing, especially in fields that are either very new or very small. However, most authors almost subconsciously try to find a breadth of articles to cite in their papers, so having to rely on the same small group of authors can be a sign that their hypotheses are not shared by the larger scientific community.

Conflict of interest:

This is a subject that gets a lot of “laypeople” confused. They may think that because a researcher is getting money (a “grant”) from - for example - a drug company, that they have a “conflict of interest” and will be inclined to “slant” or “bias” their conclusions in favor of the funding source. Most likely, they are thinking about the tobacco company “scientists” who churned out study after study “showing” that tobacco wasn’t the cause of lung cancer or emphysema.

Fortunately, this is not the way things work for most researchers. For example, I receive funding from a variety of sources – government agencies, private foundations and private companies. This is typical of academic researchers and is even becoming the rule for researchers in government labs. These grants are for a specified amount over a specified time period and are not automatically renewed. My ability to continue getting grants – which I need in order to keep doing research (and drawing a salary) – depends a lot on my reputation for doing good, objective, ethical research.

If I slant my results to favor a certain company, foundation or government agency, someone will eventually try to replicate my results and find out that I was “fudging” the data. Remember Woo-Suk Hwang, the Korean researcher who allegedly faked his cloning research? Well, he got “caught” when several labs tried to replicate his results and couldn’t. It’s likely that he won’t be getting any research funding (or jobs) for the rest of his life. That could be me, if I decided to “skew” my results to please a particular funding source.

On the other hand, there are “scientists” who do behave like the “Big Tobacco” scientists of old. They have less to lose than I (or other academic and government researchers) do, because their “funding” comes from patients or from “advocacy groups” that have an emotional (or financial) interest in a particular outcome. They, like the “Big Tobacco” scientists, are practicing “results-oriented research” - they have a specific result that they want to find, no matter what the data tell them.

In fact, one of the best modern analogies to the old “Big Tobacco” scientists is the current “Big Autism” scientists. These “scientists” depend solely on the advocacy groups (and parents that follow them) of “Big Autism” and their continued funding depends on them producing results that “Big Autism” likes. They have literally made a Faustian bargain; they have “sold their soul” to “Big Autism” and can’t go back. No other funding source would touch them with a ten-meter stick now, so they have lost the independence to “call ‘em like they see ‘em”.

“Big Autism”, by emulating “Big Tobacco” of old, is falling into the same trap. At some point – as happened with “Big Tobacco” – the weight of data will simply overwhelm “Big Autism’s” pet “scientists” and “the truth will out”. The absolute irony of the situation is that the apologists for “Big Autism” are so fond of pointing out hypothetical “conflicts of interest” in the scientists whose data refute their claims, yet their own domestic scientists are the ones with a real conflict of interest.

In short, “skewing”, “fudging” or “cooking” the data would be the real conflict of interest for me and other legitimate researchers - it would conflict with my best interests to keep working in my chosen field. By compromising my reputation in order to curry favor with one funding source, I would be condemning myself to only working for them. They would own me. Although, frankly, most funding sources would drop me like a hot potato - even if I “fudged” the data in their favor - because they could also end up paying the price for my ethical lapses, when (not “if”) they finally came to light.

So, when you hear someone yammering on about how this or that researcher has a “conflict of interest” because they receive money from a drug company or a government agency, remember this: scientists who live by grant money (and that’s most of us) cannot afford to “shade the truth” or “skew the data”. It’s not worth the risk of ending a research career in order to make one funding source happy (if, in fact, it would make them “happy”).

On the other hand, those people who only have one funding source can’t afford to not keep them happy. It remains to be seen if “Big Autism” will follow the path blazed by “Big Tobacco”, but they’re certainly heading that way.

Those who do not learn the lessons of history are usually condemned to repeat them.

Coming up – Part 3: Meta-analyses, “pilot studies” and case reports (Oh, my!).

Prometheus
