
there is no “mainstream consensus” among intelligence researchers

vaniver:

nostalgebraist:

How’s that for a clickbait title? ;)

The motivation for this post was a tumblr chat conversation I had with @youzicha.  I mentioned that I had been reading this paper by John L. Horn, a big name in intelligence research, and that Horn was saying some of the same things that I’d read before in the work of “outsider critics” like Shalizi and Glymour.  @youzicha said it’d be useful if I wrote a post about this sort of thing, since they had gotten the impression that this was a matter of solid mainstream consensus vs. outsider criticism.

This post has two sides.  One side is a review of a position which may be familiar to you (from reading Shalizi or Glymour, say).  The other side consists merely of noting that the same position is stated in Horn’s paper, and that Horn was a mainstream intelligence researcher – not in the sense that his positions were mainstream in his field, but in the sense that he is recognized as a prominent contributor to that field, whose main contributions are not contested.

Horn was, along with Raymond Cattell, one of the two originators of the theory of fluid and crystallized intelligence (Gf and Gc).  These are widely accepted and foundational concepts in intelligence research, crucial to the study of cognitive aging.  They appear in Stuart Ritchie’s book (and in his research).  A popular theory that extends Gf/Gc is known as the “Cattell–Horn–Carroll theory.”

Horn is not just famous for the research he did with Cattell.  He made key contributions to the methodology of factor analysis; a paper he wrote (as sole author) on factor analysis has been cited 3977 times, more than any of his other papers.  Here’s a Google Scholar link if you want to see more of his widely cited papers.  And here’s a retrospective from two of his collaborators describing his many contributions.

I think Horn is worth considering because he calls into question a certain narrative about intelligence research.  That narrative goes something like this: “the educated public, encouraged by Gould’s misleading book The Mismeasure of Man, thinks intelligence research is all bunk.  By contrast, anyone who has read the actual research knows that Gould is full of crap, and that there is a solid scientific consensus on intelligence which is endlessly re-affirmed by new evidence.”

If one has this narrative in one’s head, it is easy to dismiss “outsider critics” like Glymour and Shalizi as being simply more mathematically sophisticated versions of Gould, telling the public what it wants to hear in opposition to literally everyone who actually works in the field.  But John L. Horn did work in the field, and was a major, celebrated contributor to it.  If he disagreed with the “mainstream consensus,” how mainstream was it, and how much of a consensus?  Or, to turn the standard reaction to “outsider critics” around: what right do we amateurs, who do not work in the field, have to doubt the conclusions of intelligence-research luminary John Horn?  (You see how frustrating this objection can be!)


So what is this critical position I am attributing to Horn?  First, if you have the interest and stamina, I’d recommend just reading his paper.  That said, here is an attempt at a summary.


I disagree with several parts of this, but on the whole they’re somewhat minor and I think this is a well-detailed summary.

Note how far this is from Spearman’s theory, in which the tests had no common causes except for g! 

Moving from a two-strata model, where g is the common factor of a bunch of cognitive tests, to a three-strata model, where g is the common factor of a bunch of dimensions, which themselves are the common factor of a bunch of cognitive tests, seems like a natural extension to me. This is especially true if the number of leaves has changed significantly–if we started off with, say, 10 cognitive tests, and now have 100 cognitive tests, then the existence of more structure in the second model seems unsurprising.

What would actually be far from Spearman’s theory is if the tree structure didn’t work. For example, a world in which the 8 broad factors were independent of each other would totally wreck the idea of g; a world in which the 8 broad factors were dependent, but had an Enneagram-esque graph structure as opposed to being conditionally independent given the general factor would also do so.


When it comes to comparing g, Gf, and Gc, note this bit of Murray’s argument:

In diverse ways, they sought the grail of a set of primary and mutually independent mental abilities. 

So, the question is, are Gc and Gf mutually independent? Obviously not; they’re correlated. (Both empirically and in theory, since the investment of fluid intelligence is what causes increases in crystallized intelligence.) So they don’t serve as a replacement for g for Murray’s purposes. If you want to put them in the 3-strata model, for example, you need to have a horizontal dependency and also turn the tree structure into a graph structure (since it’s likely most of the factors in strata 2 will depend on both Gc and Gf).


Let’s switch to practical considerations, and for convenience let’s assume Carroll’s three-strata theory is correct. The question then becomes, do you talk about the third strata or the second strata? (Note that if you have someone’s ‘stat block’ of 8 broad factors, then you don’t need their general factor.)

This hinges on the correlation between the second and third strata. If it’s sufficiently high, then you only need to focus on the third strata, and it makes sense to treat g as ‘existing,’ in that it compresses information well.


This is the thing that I disagree with most strenuously:

In both cases, when one looks closely at the claim of a consensus that general intelligence exists, one finds something that does not look at all like such a consensus. 

Compared to what? Yes, psychometricians are debating how to structure the subcomponents of intelligence (three strata or four?). But do journalists agree with the things all researchers would agree on? How about the thugs who gave a professor a concussion for being willing to interview Charles Murray?

That’s the context in which it matters whether there’s a consensus that general intelligence exists, and there is one. Sure, talk about the scholarly disagreement over the shape or structure of general intelligence, but don’t provide any cover for the claim that it’s worthless or evil to talk about a single factor of intelligence.

Answering your last question first: I wrote the OP for the purpose of disrupting what I saw as a frustrating stalemate in the discussions I periodically have about this stuff with other science/statistics geeks.  You write

But do journalists agree with the things all researchers would agree on? […] That’s the context in which it matters whether there’s a consensus that general intelligence exists, and there is one.

But that isn’t the only context in which it matters, and it’s not the one I had in mind when writing the OP.  The context I have in mind is me talking to other people who are normally interested in talking about thorny methodological issues and contrarian academic positions.  But of course, that only goes so far; if I started saying the earth was flat or that quantum mechanics was wrong, I would expect these people to brush me off, on the (implicit or explicit) reasoning that on these matters there really is an expert consensus.  These are questions that are treated as “no longer open” by everyone who knows what they’re talking about, bar none (and I mean that literally).

Now, obviously any consensus in social science is not going to be that firm.  But the reactions I get when I talk about g critiques are similar – people think there is enough of a consensus to pass the “no longer open” threshold.  And, okay, fine, you have to put that threshold somewhere.

I was trying to suggest that the “g exists” consensus should not be considered beyond that threshold.  I did so by citing a major, respected figure in the field who denied that g existed.  And this wasn’t just a matter of him having a weird opinion he mouths off about on his blog after hours; I linked a paper he wrote for a volume called Factor Analysis At 100, based on a conference he gave a talk at, because he made significant contributions to factor analysis (in addition to his role in creating Gf-Gc theory!).  The guy was as qualified as anyone in the world to talk about how to interpret these psychometric factor models, and talk he did, at the invitation of his colleagues.

Now, okay, is a single contrarian luminary enough to break a consensus?  I admit that saying “there is no consensus, because John Horn existed” is like saying “there is no expert consensus about anthropogenic climate change, because Richard Lindzen exists.”  And if I wrote a post saying that, I recognize that it would seem polemical and obnoxious (and thus the same is true of the OP).

But in a debate climate where everyone leaned heavily on the “expert consensus on AGW” and treated skeptics as “questioning what is beyond doubt,” the “no consensus bc Lindzen” post would actually be a useful corrective.  As it is, the climate blogosphere does a lot more serious engagement with skeptical arguments than I’ve seen in the IQ blogosphere; the equivalent of something as in-depth and careful as Skeptical Science, but for rebuttals of IQ criticism, would constitute a major advance over the sophistication of the debate as it currently stands.  So that’s the direction I’m trying to move things.

(“Don’t call this into question, because it will only encourage the bad kind of skeptics” is a really bad argument if you want to convince serious and curious people who are currently undecided.  Much better to raise the questions and then answer them, as Skeptical Science does.)


As for your other points – there are two different answers to each, since the OP was a summary of Horn’s position, not of my own.  So I can tell you what Horn thinks, and/or what I think.

Regarding this:

This hinges on the correlation between the second and third strata. If it’s sufficiently high, then you only need to focus on the third strata, and it makes sense to treat g as ‘existing,’ in that it compresses information well.

Horn disagrees that this “makes sense” (see pp. 25-9 of his paper).  For my part, I think that if by “exists” we mean “compresses information well,” then we can automatically get “g exists” from “positive manifold + high correlations.”  If the variance in your data is largely along a single line, then you can compress your data pretty well by just recording the projection of each observation onto that line.  (Equivalently, by retaining only the first principal component in PCA.)
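To make that concrete, here is a minimal numpy sketch (a toy simulation of my own, not any real test battery): generate a positive manifold from a single shared cause, then check how much of the total variance the first principal component alone recovers.

```python
# Toy demonstration: positive manifold + high correlations means the
# first principal component compresses the data well, automatically.
import numpy as np

rng = np.random.default_rng(0)
n, k = 1000, 8
g = rng.normal(size=(n, 1))                    # one shared cause
loadings = rng.uniform(0.5, 0.9, size=(1, k))  # all-positive loadings
X = g @ loadings + 0.5 * rng.normal(size=(n, k))
X -= X.mean(axis=0)

# Rank-1 reconstruction: keep only the first principal component.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_hat = np.outer(U[:, 0] * s[0], Vt[0])

explained = 1 - ((X - X_hat) ** 2).sum() / (X ** 2).sum()
print(f"variance captured by the first PC: {explained:.2f}")  # well over half
```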

What makes factor analysis different from PCA is that it tries to make observed variables uncorrelated, conditional on the factors they share.  (That is, it tries to find low-rank approximations to the correlation matrix which get the off-diagonal entries right, ignoring the diagonal.)  This has a causal interpretation: two things should be uncorrelated if you condition on their common causes.  In factor analysis, the idea is that the factors cause the observed variables.
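Here is a small sketch of that distinction (again a toy simulation; sklearn’s FactorAnalysis stands in for the psychometric software): the fitted model approximates the covariance matrix as loadings-times-loadings plus a diagonal of per-variable noise, so the off-diagonal entries, the correlations between distinct variables, are reproduced almost exactly.

```python
# Factor analysis reproduces the *off-diagonal* covariance: conditional
# on the factor, the observed variables are (modeled as) uncorrelated.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n, k = 2000, 8
g = rng.normal(size=(n, 1))
X = g @ rng.uniform(0.5, 0.9, size=(1, k)) + 0.5 * rng.normal(size=(n, k))

fa = FactorAnalysis(n_components=1).fit(X)
L = fa.components_                               # loadings, shape (1, k)
implied = L.T @ L + np.diag(fa.noise_variance_)  # model-implied covariance

observed = np.cov(X, rowvar=False)
off_diag = ~np.eye(k, dtype=bool)
resid = np.abs(observed - implied)[off_diag]
print(f"max off-diagonal residual: {resid.max():.3f}")  # near zero
```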

If you construct a set of k factors (k>1) with this property, they won’t be unique.  You are approximating your data by its projection into a k-dimensional subspace, and your factors are vectors spanning that subspace; you can choose any other basis for the subspace and get the same result.  The usual approach for choosing the basis is trying to get “simple structure,” which basically means the causal graph has as few edges as possible.  This generally makes the factors correlate with one another; if you want them to be uncorrelated, you can always get that, but at the cost of losing simple structure.
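The non-uniqueness is easy to check numerically.  In this sketch (made-up loadings, nothing empirical), rotating a two-factor loading matrix by an arbitrary orthogonal matrix leaves the model-implied covariance, and hence the fit, exactly unchanged:

```python
# Rotational indeterminacy: (R @ L).T @ (R @ L) == L.T @ L for any
# orthogonal R, so infinitely many loading matrices fit equally well.
import numpy as np

rng = np.random.default_rng(1)
L = rng.normal(size=(2, 6))                    # 2 factors, 6 variables
psi = np.diag(rng.uniform(0.2, 0.5, size=6))   # unique (noise) variances

theta = 0.7                                    # any angle works
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

cov_original = L.T @ L + psi
cov_rotated = (R @ L).T @ (R @ L) + psi
print(np.allclose(cov_original, cov_rotated))  # True
```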

Of course, now that your factors are correlated, you are in the same situation you started out with: a bunch of correlated variables.  You can then do factor analysis on your factors, yielding “second-order” factors.  Note that there will always be fewer second-order factors than first-order factors: as in the first stage, we are trying to construct a low-rank approximation to the correlation matrix, i.e. one with rank lower than the # of variables being correlated.  And again, if you go for simple structure, the second-order factors will be correlated, so you can go to stage 3 and generate third-order factors, of which there will be even fewer.
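A crude sketch of that hierarchical step (with made-up factor scores, and a single eigendecomposition standing in for a real second-order extraction): the correlation matrix of the first-order factors is itself dominated by one dimension, so a single second-order factor absorbs most of it.

```python
# Second-order factoring: treat the (correlated) first-order factors as
# a new dataset and look for common factors among *them*.
import numpy as np

rng = np.random.default_rng(2)
n = 2000
g = rng.normal(size=n)                    # the would-be second-order factor
# three first-order factors, correlated because each loads on g
F = np.column_stack([0.8 * g + 0.6 * rng.normal(size=n) for _ in range(3)])

Phi = np.corrcoef(F, rowvar=False)        # correlations among the factors
eigenvalues = np.linalg.eigvalsh(Phi)[::-1]
print(np.round(eigenvalues, 2))           # one dominant eigenvalue:
                                          # one second-order factor suffices
```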

Carroll’s three-stratum theory has g as a third-order factor, BTW, where for Spearman it was a first-order factor.  When you write

Moving from a two-strata model, where g is the common factor of a bunch of cognitive tests, to a three-strata model, where g is the common factor of a bunch of dimensions, which themselves are the common factor of a bunch of cognitive tests, seems like a natural extension to me.

you are missing a level – Spearman’s was “one-stratum.”  (Carroll has g as the common factor of things which are common factors of things which are common factors of the tests.)

Now, note that this procedure (hierarchical factor analysis) can only ever produce causal graphs with this kind of structure: trees with fewer nodes as you go up.  You write:

For example, a world in which the 8 broad factors were independent of each other would totally wreck the idea of g; a world in which the 8 broad factors were dependent, but had an Enneagram-esque graph structure as opposed to being conditionally independent given the general factor would also do so.

The latter structure is literally impossible to produce via hierarchical factor analysis, although surely such structures appear sometimes in nature (see below).  The former you can totally get via factor analysis – you can literally just ask for uncorrelated factors and get them in any dataset – but at the cost of simple structure.  Simple structure amounts to asking that your factors be “as un-general as possible”: each should have edges to as few variables as possible.  But since asking for simple structure gives you correlated factors, you then have a rationale for adding another layer, on which there will be fewer variables.  As of 1993, when Carroll surveyed 477 studies, the number of second-order factors identified varied from 0 to 5, with each study finding either 0 or 1 third-order factor; a third-order factor was more likely if there were more second-order factors.  With more dimensions in the initial data, one could generate more factors at each level, perhaps generating more than one at third order, which would seem worrisome for g … until you have enough variables to go to fourth order, at which point you’ll only have one factor at that level, and g will be reborn.  The whole thing seems pretty silly.  (If you have access to Carroll’s 1993 book, his pp. 578-583 are worth looking at in this connection.)
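Here is a toy illustration of that trade-off (my own simulation; note that sklearn’s FactorAnalysis only fits orthogonal, i.e. uncorrelated, factors, which makes it convenient here).  When the true latents are correlated, the uncorrelated two-factor solution still fits the data, but every variable picks up loadings on both factors: neither factor is “un-general.”

```python
# Forcing uncorrelated factors on correlated latents: the fit survives,
# but simple structure does not (cross-loadings everywhere).
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
n = 5000
g = rng.normal(size=n)
f1 = 0.7 * g + 0.7 * rng.normal(size=n)   # two latents, correlated
f2 = 0.7 * g + 0.7 * rng.normal(size=n)   # through a shared cause
X = np.column_stack(
    [f1 + 0.6 * rng.normal(size=n) for _ in range(4)]
    + [f2 + 0.6 * rng.normal(size=n) for _ in range(4)]
)

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
print(np.round(fa.components_, 2))  # each factor loads on variables from
                                    # *both* blocks (cross-loadings)
```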

I mentioned that these tree structures are not the only causal graphs that appear in nature.  Clark Glymour has some amazing search algorithms that, far from being constrained to trees, can discover really gnarly causal graphs among like 24 variables from data alone (you can try them out yourself) – but since no one actually uses them, asking for this kind of rigor gets called an isolated demand for rigor when it is applied to IQ.  So, I’ll just mention that Glymour tried (The Mind’s Arrows, Ch. 14) generating some data from a variety of causal graphs, for instance these

[two images: example causal graphs from The Mind’s Arrows, Ch. 14]

… and plugged the results into standard factor analysis programs, using standard methods for computing the number of factors to use.  And they generally could not identify the correct number of latent causes.  (There are two ways of defining the right answer; one program got it right 2/7 of the time either way, the other 2/7 or 4/7.)  They did poorly even on some cases like Graph 7 that have the appropriate tree structure.
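For flavor, here is a toy version of that kind of experiment (mine, not Glymour’s; his used the programs’ own factor-number heuristics, whereas this just compares cross-validated likelihoods): generate data from a graph with a direct indicator-to-indicator edge, which no pure factor model can represent, and see how many factors a standard criterion prefers.

```python
# Data from a non-factor-model graph: two latents, plus one direct edge
# between two of the indicators.  The number of factors that "wins"
# under cross-validated likelihood need not be the true latent count (2).
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n = 500
l1, l2 = rng.normal(size=n), rng.normal(size=n)   # two latent causes
X = np.column_stack(
    [l1 + 0.5 * rng.normal(size=n) for _ in range(3)]
    + [l2 + 0.5 * rng.normal(size=n) for _ in range(3)]
)
X[:, 2] += 0.8 * X[:, 3]   # indicator-to-indicator edge

for k in (1, 2, 3):
    score = cross_val_score(FactorAnalysis(n_components=k), X).mean()
    print(k, round(score, 2))   # highest average log-likelihood wins
```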

Anyway, these are the kinds of methods that Carroll, and everyone else, uses to make their causal models.

Finally, you write:

When it comes to comparing g, Gf, and Gc, note this bit of Murray’s argument:

In diverse ways, they sought the grail of a set of primary and mutually independent mental abilities.

So, the question is, are Gc and Gf mutually independent? Obviously not; they’re correlated. (Both empirically and in theory, since the investment of fluid intelligence is what causes increases in crystallized intelligence.) So they don’t serve as a replacement for g for Murray’s purposes. If you want to put them in the 3-strata model, for example, you need to have a horizontal dependency and also turn the tree structure into a graph structure (since it’s likely most of the factors in strata 2 will depend on both Gc and Gf).

Gf and Gc are in the 3-stratum model, as two of the second-order factors.  (I mentioned this in the OP.)  And again, you can always get independent abilities, just at the cost of simple structure.  (Sometimes people talk as though the simple structure solution is the “correct” one, in that it is especially likely to identify the real phenomenon; this makes sense on Occam’s Razor grounds, but in practice simple structure factors for intelligence tests are a Babel of conflicting possibilities; see Horn pp. 40-43.)
