Sunday, November 04, 2007

Science 102: epidemiology vs. controlled studies

In my previous post in this series I discussed some of the pitfalls that can lure you into drawing false conclusions from experimental data, including confounding factors, mistakes, bad luck, and pure random chance. All those factors arise in the context of controlled experiments. In a controlled experiment there are two groups of test subjects which are made as much alike as they can be. (There is an entire industry devoted to breeding genetically identical rats for use in laboratory experiments.) These two groups are subjected to conditions that are as much alike as they can be made, save for one factor, which is the object of interest (usually a drug or some other chemical). All this effort is made in an attempt to eliminate confounding factors. It doesn't always work, and even when it does, pure random chance can produce false results in a surprisingly large number of cases.

But there are many cases where a controlled experiment of this kind is not possible. We can do pretty well with animal models, but when it comes time to run experiments on humans we don't usually have access to large numbers of genetically identical test subjects. Instead, a technique called randomization is used, so that the two groups, while not identical, are unlikely to be biased in any particular direction by any confounding factor. Part of the process of designing a study is (or at least ought to be) doing the math to figure out how many test subjects you need so that randomization gives you the desired low probability of confounds.
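To get a feel for why the number of subjects matters, here is a minimal simulation sketch. It is not how a real study is designed. It assumes a single hypothetical yes/no trait (say, "exercises regularly") carried by 30% of subjects, and it counts how often random assignment leaves the two groups differing by more than 10 percentage points in that trait. The function name, the 30% rate, and the 10-point threshold are all made up for illustration.

```python
import random

def imbalance_probability(n_subjects, trait_rate=0.3, threshold=0.10,
                          trials=1000, seed=0):
    """Estimate how often random assignment leaves the two groups differing
    by more than `threshold` in the share of subjects with some trait."""
    rng = random.Random(seed)
    half = n_subjects // 2
    unbalanced = 0
    for _ in range(trials):
        # Each subject independently has the (hypothetical) confounding trait.
        subjects = [rng.random() < trait_rate for _ in range(2 * half)]
        rng.shuffle(subjects)  # random assignment to the two groups
        treated, control = subjects[:half], subjects[half:]
        diff = abs(sum(treated) - sum(control)) / half
        if diff > threshold:
            unbalanced += 1
    return unbalanced / trials

for n in (20, 200, 1000):
    print(n, imbalance_probability(n))
```

With a handful of subjects, badly lopsided groups are common; with a thousand, they become rare. That is the sense in which randomization makes confounds improbable rather than impossible.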

But sometimes it is not possible to do a controlled study because controls are just too difficult to enforce. Suppose you want to know, say, if eating carrots reduces the risk of cancer. Cancer is a very slow disease, taking years to manifest itself. It would be all but impossible to rigorously enforce a protocol where one group of test subjects consumed a known quantity of carrots while a control group ate none over a period of years.

In situations like this scientists fall back on what are known as epidemiological studies. These are named after the science of epidemiology, which is, naturally, the study of epidemics (where, for obvious reasons, it is often impossible to do controlled studies). But the methodology of epidemiology can be -- and is -- applied far more broadly.

The basic idea behind epidemiology is that when you can't go through the usual process of assembling treatment and control groups, sometimes you can go back and look at people's history and assign them to the proper category retroactively. For example, we might take 1000 people and ask them if they eat carrots regularly, and then see if the ones that say they do get less cancer than the ones that say they don't.

The problem with this approach is that you don't get to randomize the two groups, and so the possibility of confounding factors is much higher. Suppose we find 500 carrot-eaters and 500 non-carrot-eaters and discover that the non-carrot-eaters had 50 cancers among them while the carrot-eaters had only 40. Would we be justified in concluding that eating carrots reduces the risk of cancer by 20%?

No, we would not. For one thing, these results might not be statistically significant even in a controlled study! (Whether they would be or not depends on a lot of factors, a discussion of which would take us far afield.) But let us put that aside and assume that this is a statistically significant result. How can we be sure that carrots are the cause of the reduction in the incidence of cancer? It is possible that eating carrots coincides with other healthy lifestyle choices -- like exercising regularly for example -- and that it is exercise, not carrots, that produces the beneficial effect. In a controlled study this wouldn't be a problem because the subjects would be randomized, and presumably you'd have the same number of exercisers and non-exercisers in each group. But in an epidemiological study we do not have that luxury.
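As a rough check on the significance question, here is a sketch of one common textbook test, the pooled two-proportion z-test, applied to the 50-in-500 versus 40-in-500 figures. The choice of test is my assumption; a real analysis might well use something else, and the function name is just for illustration.

```python
import math

def two_proportion_z_test(cases_a, n_a, cases_b, n_b):
    """Two-sided z-test for a difference between two proportions,
    using the pooled proportion under the null hypothesis."""
    p_a, p_b = cases_a / n_a, cases_b / n_b
    pooled = (cases_a + cases_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# 50 cancers among 500 non-carrot-eaters vs. 40 among 500 carrot-eaters.
z, p_value = two_proportion_z_test(50, 500, 40, 500)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

Under this particular test the difference comes out at roughly z = 1.1 with a two-sided p-value of about 0.27, well short of the conventional 0.05 cutoff. So with these numbers and this test, the 20% reduction would not count as statistically significant on its own.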

There are statistical techniques to get around this problem. Basically, the idea is to divide up a large group of people in various ways so that you essentially make "virtually randomized" treatment and control groups out of them retroactively. I don't have time to go into details, but the point is that it can be done. So suppose we ask all our test subjects: do they exercise? Do they smoke? Do they live at high altitudes? Near nuclear power plants? Near power lines? How often do they talk on their cell phones? We collect all this data and we do the statistical slice-and-dice and, lo and behold, a signal arises from all the noise that indicates with 95% confidence that eating carrots does indeed reduce the risk of cancer by 20%. Are we now justified in concluding that this is actually true?
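One simple version of that slice-and-dice is stratification: split the subjects by a suspected confound and compare carrot-eaters with non-carrot-eaters inside each slice. The sketch below uses invented counts, chosen only so that they add up to the 40-versus-50 totals from the earlier example, and combines the slices with the Mantel-Haenszel risk ratio, which is one standard way to do it. Nothing here comes from a real study.

```python
# Hypothetical counts, invented for illustration: (cancers, people) per group.
strata = {
    "exercisers":     {"carrot": (12, 300), "no_carrot": (8, 200)},
    "non_exercisers": {"carrot": (28, 200), "no_carrot": (42, 300)},
}

def crude_risk_ratio(strata):
    """Risk ratio for carrot-eaters vs. non-carrot-eaters, ignoring exercise."""
    cases_c = sum(s["carrot"][0] for s in strata.values())
    n_c = sum(s["carrot"][1] for s in strata.values())
    cases_n = sum(s["no_carrot"][0] for s in strata.values())
    n_n = sum(s["no_carrot"][1] for s in strata.values())
    return (cases_c / n_c) / (cases_n / n_n)

def mantel_haenszel_risk_ratio(strata):
    """Risk ratio pooled across strata, so each comparison is made
    between people with the same exercise habits."""
    numerator = denominator = 0.0
    for s in strata.values():
        cases_c, n_c = s["carrot"]
        cases_n, n_n = s["no_carrot"]
        total = n_c + n_n
        numerator += cases_c * n_n / total
        denominator += cases_n * n_c / total
    return numerator / denominator

print("crude risk ratio:   ", round(crude_risk_ratio(strata), 3))
print("adjusted risk ratio:", round(mantel_haenszel_risk_ratio(strata), 3))
```

In this made-up data the crude comparison shows carrot-eaters with 80% of the cancer risk, yet within each exercise group the risk is identical (4% among exercisers, 14% among non-exercisers), so the adjusted ratio is 1. The apparent carrot effect is produced entirely by carrot-eaters being more likely to exercise. The adjustment only works for confounds that were actually measured.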

It should come as no surprise by now that the answer is still no. The reason is that there is always the possibility that there is a confound that we might not have considered and hence forgotten to put into our questionnaire. How likely is that? The only way to be sure that it's really the carrots and not something else is to follow up with a controlled study. Let's stipulate that this is too difficult. Are we screwed?

Not completely. There are two other things we can do. One is to look at the questionnaire and see how thorough it is. If we submit the study to peer review and no one can think of anything that we should have asked about and didn't, then that's a pretty good indication (though far from foolproof) that we're on the right track with the carrots. But there is another thing we can do, and this really gets at the heart of science: we can ask why eating carrots should reduce the risk of cancer.

Science is not just about doing experiments and crunching numbers. Science is really about explaining things. Experiments are not a tool for directly getting at the truth; they are a tool for helping decide between alternative explanations.

So one possible explanation (in science we call these hypotheses) of why carrots might reduce the risk of cancer is that they contain chemical substances which neutralize the effects of various carcinogens that everyone is exposed to in the course of day-to-day life. The reason that this is progress is that this explanation can be tested in ways other than feeding people carrots. For example, we can try to identify these chemicals and see if they occur in other foods, and see if eating those foods also reduces the risk of cancer. We can also try to extract or synthesize those chemicals and see if consuming them as dietary supplements makes a difference. (Turns out that often they don't.)

Have you ever wondered why you seem to hear different advice about what to eat to reduce your risk of cancer every time you turn around? It's because most of the results of epidemiological studies are wrong!

Which brings me to flamingos. As everyone knows, flamingos are pink. Famously so. The statistical correlation between being a flamingo and being pink is really off the charts. And yet a flamingo's pinkness is not genetic, except in a very roundabout sense. Flamingos are pink because their natural diet consists mainly of shrimp, which are high in beta-carotene, which has a reddish color. It's the same chemical that makes carrots orange. (Ironically, beta-carotene supplements appear to increase the risk of cancer!) The beta-carotene turns their feathers pink. Feed a flamingo something other than shrimp and its feathers revert to white, their "natural" color. (The same mechanism is what makes wild salmon pink. Farm-raised salmon are white, which is why they have artificial color added to make their flesh look more "natural". Gotta love the irony.)

There's more to say on this but I have to stop now. I guess there will be a third installment of this series. If you want a sneak preview, go buy a copy of David Deutsch's book "The Fabric of Reality" and read chapters 3, 4 and 7.

Here endeth the lesson. :-)

8 comments:

  1. Good points, and interesting factoid about flamingos. I had heard the one about salmon. (I believe the EU has begun to limit the amount of artificial coloring that can be added to their diet. Supposedly it has been found to harm eyesight.)

    However, you didn't write much I didn't already know about the pitfalls of epidemiological-type studies. I know they're weak and prone to chance influences in general. I know that only the strongest results that are consistent across different studies can be relied upon. And I think that, what we were arguing about before, is an example of such a strong and consistent result.

  2. I know that only the strongest results that are consistent across different studies can be relied upon. And I think that, what we were arguing about before, is an example of such a strong and consistent result.

    You've completely missed the point. It doesn't matter how strong and consistent the epidemiological data is. Epidemiological data alone can never prove causality (but it can disprove it, and in Lynn's case I believe it actually does -- but that will have to wait). The strength of the epidemiological data supporting the hypothesis that flamingos are genetically pink is breathtaking -- practically indistinguishable from 100% correlation. It is nonetheless false that flamingos are genetically pink.

  3. Not true. A good epidemiological study would attempt to include flamingos from diverse populations to eliminate confounding factors, including diet. Even if the first studies did not include flamingos that do not eat shrimp, if there are hundreds of flamingo studies in various places over decades, some should include flamingos that do not eat shrimp - even if the only such flamingos are zoo-bred.

    Like you said, controlled studies that need to span years are extremely difficult - and likely unethical - to perform on humans. For the most part, epidemiological studies are what we have to rely on. What we can do is, we can try to conduct several of these studies in ways that make it as likely as possible that we have eliminated any confounding factors. I think that in the IQ case this has largely been done.

    What environmental factor can you suggest that would explain consistently low average IQ results in sub-Saharan African descendants regardless of the time and place - be it in Africa or the US or the Caribbean or Europe or Asia? What kind of "shrimp" are all these populations eating to account for these effects? Note that discrimination cannot be it, as some of these locations are majority black, and sufficiently well developed that malnourishment cannot explain things either. Meanwhile, you have poorly fed, poorly schooled Chinese whose average scores are 105.

    I'd be interested to learn what you mean by saying that there's data in Lynn's book that disproves the genetic hypothesis.

  4. Here's an interesting article on BBC News:

    Gene 'links breastfeeding to IQ'

    A single gene influences whether breastfeeding improves a child's intelligence, say London researchers.

    Children with one version of the FADS2 gene scored seven points higher in IQ tests if they were breastfed.

    But the Proceedings of the National Academy of Sciences study found breastfeeding had no effect on the IQ of children with a different version.

    ...

    "In the past people have had different results about whether breastfeeding improves IQ and this would sort out the reason why," she said.


    This doesn't go one way or the other in terms of our debate, but it's related, and it's an interesting result. It's also related to something else I wrote recently - the general thought being that when broad studies have inconclusive or conflicting results, it may be because an issue only afflicts a small proportion of the population, and it may be down to a single gene, or another circumstance, to determine whether someone is adversely affected or not. It's hard to pick up and focus on individual reactions in a study.

  5. What would be interesting to know is whether the 10% with a different version of the gene - the ones for whom breastfeeding does not affect IQ - are... stupider.

    Perhaps the reason why breastfeeding doesn't improve the IQ of those children is because they don't have the gene to break down the fatty acids from milk in the first place, so they can't take advantage of that extra nutrient. As a consequence, their average IQ could be lower.

    If that were indeed the case, then this would show that there can be a single gene that can impact IQ significantly. It wouldn't take many such differences in genes - perhaps just a few - to explain race differences in IQ entirely genetically.

  6. Ron, I've been enjoying your series on science and race and IQ. (My favorite part of this blog post was the white-flesh salmon. I knew about the flamingos, but not about the salmon!)

    Just thought, on the whole genetics vs. environment for intelligence, you might enjoy these articles about IQ and environment. They're from Flynn, the guy who discovered that IQ has been rising significantly in the US over the last few generations (and, as a consequence, that IQ tests have been "renormalized" to maintain a US average at 100).

    The basic summary seems to be that environment may have more of an impact than we realize, but the effect appears to be genetic because there is a positive feedback loop between genes and environment. (E.g.: you like to practice stuff that you're already good at.) So: a small genetic difference in intelligence will be amplified over time (via self-selecting and self-sought environmental influences) into large differences in final intelligence performance.

    I'm not sure if this changes the social policy at all, because it still appears that there is little one can do about it. But the theory is an interesting intermediate point between "it's all genes" and "it's all environment".

  7. Don, thanks for that interesting link. That article is interesting in and of itself, but after reading it I was able to find one that I find yet more interesting.

    In July 2007, Psychological Review published this paper by Michael Mingroni, who makes reference to the Dickens & Flynn model described in the article you linked to.

    However, Mingroni proposes what he thinks is a more plausible mechanism for the Flynn effect. Because the abstract is available online but the article is behind a paywall, I quote an important passage from the middle:

    "Briefly, heterosis is a genetic effect that will cause populationwide changes in a trait whenever three conditions are met. The first condition is that the population in question must initially have a mating pattern that is less than completely random prior to the occurrence of the trend. Such a deviation from panmixia, or random mating, creates an excess of homozygotes in the population and a deficit of heterozygotes. Second, the population must undergo a demographic change toward a closer approximation to random-mating conditions. This causes the frequency of homozygotes to decline and that of heterozygotes to increase. Of course, this second condition presupposes that the first condition is already met, as a trend toward more random mating cannot occur in a population already mating randomly. Third, the trait in question must display directional dominance, with more of the genes that influence the trait in one direction being dominant and more of those that influence the trait in the opposite direction being recessive. Given such nonadditive gene action, any increase in the ratio of heterozygotes to homozygotes will cause the distribution of the trait to shift over time in the dominant direction. Heterosis has been mentioned as a potential cause of the IQ trend by a number of researchers over the years (Anderson, 1982; Flynn, 1998; Jensen, 1998; Kane & Oakland, 2000; Mingroni, 2004; see also Dahlberg, 1942, chap. 10). Few would dispute that heterosis could be responsible for at least some part of the trend; what is mainly at issue is whether it could be a major cause."

    I find it interesting reading.

  8. To follow up - among other things, the paper devotes a full page to exposing faults of the Dickens & Flynn model, and does so convincingly. Mingroni then goes on to present his case for heterosis, which he argues could explain not only the IQ paradox, but also similar unresolved paradoxes (genetics vs. environment) in trends such as height, myopia, etc. The arguments are surprising and compelling, consistent in a wider view of things than just intelligence alone.

    I would like to see his paper followed up by research to test his hypothesis.
