Tuesday, 26 August 2014

Woodley elaborates

 

What I said in London was that whilst it is true that correlation does not necessarily equate to causation, all causally related variables will be correlated. Thus correlation is always necessary (but not in and of itself sufficient) for establishing causation.

The claim that 'correlation does not equal causation' is therefore meaningless when used to counter the results of correlative studies in which specific causal inferences are being made, as the inferred pattern of causation necessarily supervenes upon correlation amongst variables. Whether the variables being considered are in actuality causally associated as per the inference is another matter entirely.

The correct critique of such findings therefore is from mediation, i.e. the idea that a given correlation might be spurious owing to the presence of 'hidden' variables that are generating the apparent correlation. A famous example is yam production and national IQ, which across countries correlate negatively. It would be wrong to say that yam production somehow inhibits IQ, as the association will in fact turn out to be mediated by something like temperature and latitude. These variables are in turn proxies for historical and ecological trends that make the sort of countries that yield fewer yams the sort of countries that are typically populated by higher ability people, and vice versa. The causation in this case is via additional variables, which cause the covariance between the two variables of interest, without there being a direct effect of one on the other.
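
As a rough illustration of the yam example (a sketch on simulated numbers, not real yam or IQ data), the pattern described above, which is more usually called confounding by a common cause, can be reproduced in a few lines. The variables `z`, `a` and `b`, the effect sizes and the sample size are all assumptions made for the demonstration: `z` stands in for something like latitude, and `a` and `b` never influence one another.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly removing zs from both."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000               # arbitrary sample size for the illustration

# Hypothetical hidden variable z drives both a and b; a and b never interact.
z = [rng.gauss(0, 1) for _ in range(n)]
a = [zi + rng.gauss(0, 1) for zi in z]   # rises with z
b = [-zi + rng.gauss(0, 1) for zi in z]  # falls with z

r_ab = pearson(a, b)
r_ab_given_z = partial_corr(a, b, z)
print(f"raw r(a, b)         = {r_ab:.3f}")
print(f"partial r(a, b | z) = {r_ab_given_z:.3f}")
```

The raw correlation comes out clearly negative, while the partial correlation with `z` held constant is close to zero, which is what one expects when the covariance is produced entirely by the third variable.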

Properly constructed multivariate models can use these patterns of mediation to infer the likelihood of causation going in one direction or another. Thus it is possible to actually test causal inference amongst a population of correlated variables. By far the best way of doing this is to compare the fits of models containing specific theoretically prescribed patterns of causal inference against (preferably many) alternative theoretically plausible models, in which alternative patterns of causation are inferred (Figueredo & Gorsuch, 2007).
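
A toy version of that model comparison can be sketched as follows. It is not the sequential canonical analysis of Figueredo & Gorsuch (2007); it simply checks which of two rival causal structures implies a constraint that the data actually satisfy. The data are simulated from a chain `x -> m -> y`, and the variable names, path coefficients and sample size are assumptions chosen for the illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly removing zs from both."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000

# Simulated "truth" (an assumption of this sketch): a chain x -> m -> y.
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [0.8 * mi + rng.gauss(0, 1) for mi in m]

# Each rival model implies that one particular correlation should vanish.
# Chain    x -> m -> y : x and y are uncorrelated once m is held constant.
# Collider x -> m <- y : x and y are uncorrelated to begin with.
misfit = {
    "chain": abs(partial_corr(x, y, m)),
    "collider": abs(pearson(x, y)),
}
best = min(misfit, key=misfit.get)
for name, value in misfit.items():
    print(f"{name:9s} constraint violated by {value:.3f}")
print("better-fitting model:", best)
```

Here the chain's constraint holds to within sampling error and the collider's is badly violated, so the chain is preferred. The check has limits: a chain and a common-cause structure imply the same vanishing partial correlation, so data of this kind cannot separate them, which is one reason to compare many theoretically motivated alternatives.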

William Gemmell Cochran termed this “Fisher’s Dictum”:


“About 20 years ago, when asked in a meeting what can be done in observational studies to clarify the step from association to causation, Sir Ronald Fisher replied: ‘Make your theories elaborate.’ The reply puzzled me at first, since by Occam's razor, the advice usually given is to make theories as simple as is consistent with known data. What Sir Ronald meant, as subsequent discussion showed, was that when constructing a causal hypothesis one should envisage as many different consequences of its truth as possible, and plan observational studies to discover whether each of these consequences is found to hold.” (Cochran, 1965, §5).

 

Ref.
Cochran, W. G. (1965). The planning of observational studies of human populations (with Discussion). Journal of the Royal Statistical Society. Series A, 128, 134–155.
Figueredo, A. J., & Gorsuch, R. L. (2007). Assortative mating in the jewel wasp. 2. Sequential canonical analysis as an exploratory form of path analysis. Journal of the Arizona-Nevada Academy of Science, 39, 59–64.

9 comments:

  1. Cochran only proved how very very stupid and vulgar ALL Scotsmen are.

    The claim that 'correlation does not equal causation' is therefore meaningless when used to counter the results of correlative studies in which specific causal inferences are being made, as the inferred pattern of causation necessarily supervenes upon correlation amongst variables.

    The causation in this case is via additional variables, which cause the covariance between the two variables of interest, without there being a direct effect of one on the other.


    OMG! Jimmy might just realize that ALL social "science" is 100% bullshit.

    --- BGI volunteer

  2. And not 100% bullshit necessarily, but because ALL social "scientists" are mentally retarded.

  3. learning to pearl psychologists will never do.

    hence a degree in maths is better prep for a grad course in any soc sci than a degree in soc sci.

    soc sci is a JOKE.

    --- BGI volunteer

  4. Cryptography is an intriguing counterexample. Cryptography is about concealment. And that means the deliberate, manipulated concealment of correlation. Decryption always brings back the correlation.

    If we exclude such deliberate attempts to conceal correlation (as a jury might exclude testimony by known perjurers), does Woodley’s principle still hold?

  5. Cryptography is simply a limiting case, where in some cases we can prove that not only can no one infer the underlying causality (seed, message, image) in practice, no one can in theory. If Woodley disagrees, I can always present a collection of bitstrings and ask him to whip out his Pearson's r and Spearman's rho and whatnot and tell me which were generated by a quantum random number generator at random.org and which were generated by Yarrow...

    > does Woodley’s principle still hold?

    PRNGs in general demonstrate the necessary properties (with the caveat that if you already know everything about the system but the seed, a determined attacker/researcher can figure it out from enough samples of the output).

    More generally, complex nonlinear problems like those tackled in machine learning will often defeat something like Pearson's r. The 'donut' graph is the standard example used in introductions to show beginners why you can't just throw linear models at these problems with any degree of success. The relationships are too complex for such oversimplifications.

    I think systems with feedback and control mechanisms (which describes a good chunk of the world...) will often produce uncorrelated causation, but I'm not 100% sure on that.

  6. I myself stand by the strong statement:

    Correlation ≠ causation.

    To me, it seems that the number of times people utter that statement to dismiss a potentially important correlation that may signal a causal relationship is dwarfed by the number of people, even academics (*cough*, epidemiology, *cough*), who automatically assume that correlations represent causal relationships (because they've controlled for this or that confounding factor or whatever). I think it's still important to say it, but to call people out when they misuse it.

  7. Michael A. Woodley, 31 August 2014 at 22:26

    Cryptography does not count for the reasons articulated by the previous commenter. To reiterate, correlation in this instance is simply being deliberately hidden via encryption. The clue here is actually in the word crypt, which stems from the Greek kryptós meaning hidden or secret.

    I will say it one more time - all causal relationships are necessarily correlated. Whether the correlation exists in plain sight, or is known only to the 'Mind of God' (as in the case of encryption) has no bearing on the validity of this axiom.

  8. The best description I've seen is that correlation is "an unexplained causal nexus".

    This acknowledges that correlations always reflect causes; does not make the (false) counterclaim that lack of correlation proves absence of causal relations (see above); and implies (correctly) that correlation alone can't elucidate these causes for us.

    To say that causes always result in correlations and then add that by correlation is meant "what God sees" just confuses the situation.

    "Learn your Pearl" is great advice :-) Eysenck (and of course Fisher) are also worth reading in addition to Pearl on this topic, as are a couple of others, for instance Chang's excellent "Inventing Temperature". People long knew, for instance, that exposure to fire raised temperature, but the theories (i.e., models of causality) were terribly varied (and often terribly wrong) until the kinetic model. Which, of course, may also be supplanted. Correlation all along. Causality all along. Good understanding of it: effortful, deliberate, inspired, and fallible.

    http://www.amazon.com/Inventing-Temperature-Measurement-Scientific-Philosophy/dp/0195337387/ref=sr_1_1?s=books&ie=UTF8&qid=undefined&sr=1-1&keywords=history+of+temperature

  9. > To reiterate, correlation in this instance is simply being deliberately hidden via encryption.

    Wow, and that matters... why? Whether a human-designed process or a natural process too hard for humans to figure out, the point remains.

    > Whether the correlation exists in plain sight, or is known only to the 'Mind of God'

    Ridiculous. You're playing these word games to try to defend your inclination to infer causality from every kind of correlation (eg http://drjamesthompson.blogspot.com/2014/05/microcephalin-makes-comeback.html?showComment=1401462239361#c3452076013251101477 ) and your defense to counterexamples like dynamic systems like thermometers, nonlinear systems like cryptography, and selection or conditioning effects is to appeal to... *the mind of god*? All you're doing is redefining correlation as causation and saying 'of course things that cause each other are correlated even if we can't see it, because correlation is another word for causation and obviously things that cause each other are causally related!' That is not what anyone has ever meant by 'correlation' and shame on you for your crimes against statistical communication.

    Unfortunately, in the real world, we only have statistics, and we must refer any use of the mind of God to the next world; in the real world, correlation!=causation, and things can be correlated without causing each other and can cause each other without any tractably computable correlation. Not everything is linear; not everything is a Gaussian; not everything is aptly summarizable by Pearson's r; not everything can be solved by appeal to correlation.

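
Two of the counterexamples raised in the thread, the 'donut' graph and systems under control, can be made concrete with a small simulation. This is only a sketch under stated assumptions: the ring radius and noise, the temperatures, and the idealised thermostat that cancels the weather exactly are all invented for the illustration, and the only measure examined is Pearson's r, which is the measure the commenters are arguing about.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2)  # fixed seed so the run is repeatable
n = 20000

# 1. The 'donut': points scattered around a ring of radius 1.
#    The radial noise of 0.05 is an arbitrary illustrative choice.
ring_x, ring_y = [], []
for _ in range(n):
    theta = rng.uniform(0, 2 * math.pi)
    radius = 1 + rng.gauss(0, 0.05)
    ring_x.append(radius * math.cos(theta))
    ring_y.append(radius * math.sin(theta))

r_ring = pearson(ring_x, ring_y)
# The dependence reappears once the variables are squared (x^2 + y^2 ~ 1).
r_ring_squares = pearson([x * x for x in ring_x], [y * y for y in ring_y])

# 2. An idealised thermostat (assumption: it cancels the weather exactly).
target = 20.0
outside = [rng.gauss(10, 5) for _ in range(n)]
heat = [target - o for o in outside]
inside = [o + h + rng.gauss(0, 0.5) for o, h in zip(outside, heat)]
r_controlled = pearson(heat, inside)

# Intervention: set the heater at random, ignoring the weather.
forced_heat = [rng.gauss(10, 5) for _ in range(n)]
forced_inside = [o + h + rng.gauss(0, 0.5)
                 for o, h in zip(outside, forced_heat)]
r_forced = pearson(forced_heat, forced_inside)

print(f"ring:       r(x, y)         = {r_ring:.3f}")
print(f"ring:       r(x^2, y^2)     = {r_ring_squares:.3f}")
print(f"thermostat: r(heat, inside) = {r_controlled:.3f}")
print(f"forced:     r(heat, inside) = {r_forced:.3f}")
```

In both cases Pearson's r between the raw variables is close to zero although one is tied to the other. The relationship shows up only after a transformation (squaring, for the ring) or an intervention (setting the heater independently of the weather). Whether that counts as a correlation that was "there all along" is exactly what the thread disputes.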