Testing intelligence used to be a simple business. The patients used their wits, and the psychologists used their instruction manuals. Some instruction and practice was required, because psychologists had to learn the instructions to be given for each test at every stage, including the prompts; learn how to record the answers and also time them with an old mechanical stopwatch; do all this when the material in front of you on the patient’s side was upside down and left right inverted; record any response which was out of the ordinary, and keep the patient cheerful and engaged throughout. To help you, the presentation booklets had discreet little numbers for you to read, if only to check that you were presenting the right problem. There were also recording forms to jot down the results, and prompts about how many failures patients were allowed before you moved briskly to the next subtest.
Block design, object assembly, picture completion and picture arrangement all required some kit, which had a tendency to get battered or lost. Coding required a form on the back of the test record booklet, and a cardboard overlay to mark the results quickly.
A mechanical stopwatch, I should explain, was a large, heavy, metallic chronometer which never ran out of batteries, and was easy to use. Multiple lap time analysis was not an option, nor were nanoseconds, so error rates in recording times were low. More sophisticated testers were provided with a chronometer wrist watch, so that timing could be done discreetly, without the person noticing it and getting too anxious. I was taken on a special journey by my boss to a specialist watch shop in the City of London in order to get the numbered chronometer placed on my wrist. It was a Moeris Antimagnetic, Swiss-made watch, and it still works well.
A psychologist of modest intellect could be trained to use all these materials in a matter of weeks, and then they were tested on a patient or two by a senior psychologist, after which they were considered competent to begin their testing careers.
In the old days of testing, psychologists tested lots of people, so they started taking short cuts. They boiled the instructions down to the sensible minimum, having found out that the basic short form generally makes more sense than the elaborate longer one. Then they started cutting out tests, on the “bang for your buck” basis. Bluntly, how long does it take to get a result out of each subtest? Some are easy to give and score. Others require setting out lots of material, gathering it back again, and working through complicated scoring systems. Those tests tended to be left in the bottom drawer of the desk. Psychologists may be dull but they are not stupid.
Eventually researchers worked out statistically derived short forms in which 4 key subtests gave roughly the same result as the full 10. Roughly. Any psychologist who was in a hurry plumped for those. Of course, the error term was larger, but pragmatism ruled. As a consequence, a very large number of intelligence test results are not done properly, in that they are not based on the full test. It is hardly surprising that scores on later re-testing may differ, particularly when psychologists pick and choose which tests to include out of the 10, according to their own interests and theories. Short form testing also increases the apparent variability of the results, leading some gullible psychologists into thinking that it was wrong to calculate an overall result, when in fact that overall result had higher validity. Nobody gets round sampling errors, not even the Spanish Inquisition.
When new tests came on the market they usually provided extremely interesting approaches with extremely complicated and bulky equipment. Take Digit Span, for example, which tests short term memory. This now comes in a more complicated form, but might be useful. Then, in the Wechsler memory tests, someone decided to have a sequence tapping test of “spatial memory”. You were required to set out the provided array of blue blocks welded onto a plastic tray, and then tap out a sequence of moves with your finger, which the patient had to copy. No problem when one or two moves were required. However, when the tapping sequence was 7 different positions long, it was difficult to be sure one had tapped out the sequence correctly, and then baffling when the patient tapped out the sequence back again so quickly that you could not be sure you had recorded it correctly. That test has been quietly dropped. One cannot have the punters showing up the psychologists.
However, the search for the quick test that gives a valid result continues. The task is not a trivial one. Here are the g loadings of the Wechsler Adult Intelligence Scale subtests, simply as a guideline on the competitive psychometric landscape which confronts any developer of a new intelligence subtest. These are taken from Table 2.12 of the WAIS-IV manual.
Vocabulary .78 Similarities .76 Comprehension .75 Arithmetic .75 Information .73 Figure Weights .73 Digit Span .71 are all good measures of g.
Block Design .70 Matrix Reasoning .70 Visual Puzzles .67 Letter-number sequencing .67 Coding .62 Picture Completion .59 Symbol Search .58 are all fair measures of g.
Cancellation .44 is a poor measure of g but remains one of the optional subtests.
The subtests, particularly the top 7, are good measures of g, and none of them takes more than 10 minutes, most of them less. They provide plenty of psychometric bang for your testing-time buck. With a bit of practice in memorising the scoring criteria, you can almost mark up the vocabulary score as you go along.
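For readers who like to see the arithmetic, here is a minimal sketch in Python of what those loadings imply. It assumes the usual factor-analytic reading, in which the square of a standardised loading is the proportion of a subtest's variance shared with g. The figures are the ones quoted above; the function names are my own, purely for illustration.

```python
# g loadings as quoted above (attributed in the text to Table 2.12 of the WAIS-IV manual).
G_LOADINGS = {
    "Vocabulary": 0.78,
    "Similarities": 0.76,
    "Comprehension": 0.75,
    "Arithmetic": 0.75,
    "Information": 0.73,
    "Figure Weights": 0.73,
    "Digit Span": 0.71,
    "Block Design": 0.70,
    "Matrix Reasoning": 0.70,
    "Visual Puzzles": 0.67,
    "Letter-Number Sequencing": 0.67,
    "Coding": 0.62,
    "Picture Completion": 0.59,
    "Symbol Search": 0.58,
    "Cancellation": 0.44,
}


def shared_variance(loading: float) -> float:
    """Proportion of subtest variance shared with g (assumption: loading squared)."""
    return loading ** 2


def rank_by_loading(loadings: dict) -> list:
    """Return (subtest, loading) pairs, highest loading first."""
    return sorted(loadings.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Print each subtest with its loading and the share of variance it has in common with g.
    for name, loading in rank_by_loading(G_LOADINGS):
        print(f"{name:<26}{loading:.2f}{shared_variance(loading):>8.0%}")
```

On that reading, Vocabulary shares about 61% of its variance with g, while Cancellation shares about 19%, which is the gap any new quick test has to beat.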
So, here is the ultimate intelligence test item for intelligence testers. Can you think of a task which is quicker and easier than the best Wechsler subtests, but has higher predictive utility?
While thinking about that, would you like to take a non-Wechsler vocabulary test, just for private amusement and to provide a quick general intelligence measure that you can keep to yourself?