Wednesday 7 December 2016

Faking good on PISA

One of the delights of being a member of a community of researchers in the modern age is the speed with which colleagues can come together to answer a question and scope out a solution to a problem.

Steve Sailer has looked at the most recent PISA results, which he has been discussing generically for many years.

He pointed out that in some countries a large proportion of eligible children don’t show up in the statistics. Could it possibly be the case that they are discreetly told to stay at home, because national pride is at stake? Perish the thought! He noted that Argentina had apparently made stellar gains, but a commentator on his blog later explained that there was so much cheating in the Argentine provinces that those results had to be discarded, and the declared results are for Buenos Aires only, so probably higher than the national figures, or so the porteños would have you believe. Incidentally, it is only recently that Argentina has had economic data, such as inflation figures, that could be even vaguely trusted, so they are only just entering the Truth Recovery phase.

Cheating is the easiest way to boost results. Teachers can look at the questions some days before the test, and do a crash course in “revision” for the class. This makes teachers, children, parents and governments happy. PISA says it has methods to ensure security and detect cheating, but Heiner Rindermann also looks carefully at PISA’s published results himself, and rejects some of them on the grounds of improbability.

Anatoly Karlin also had a look at the dataset and discussed the disappointing performance of China and other eastern countries, with Russia doing better. Get his full account here:

I wondered how big the effect of such selective non-attendance on the examinations might be. There is also the confounder that age at ending secondary education varies between one country and another, so that must be factored into the equation.

Emil Kirkegaard suggested an approach, and after discussions with me and Gerhard Meisenberg, sorted it out quickly. Have a look at the full process here:

Emil had also asked Heiner Rindermann to comment, and he came in a few minutes later, with a detailed publication (not yet published, so I cannot show it to you) and a rule of thumb adjustment you can apply to all the countries.

Heiner says:

School attendance rate of 15-year-old youth (usually, but not always, given in PISA reports, usually somewhere at the end).
Do not confuse this with the participation rate in the PISA study.

For each percentage point not attending school, subtract 1.5 SASQ points (equivalent to 0.225 IQ points). That is a rule of thumb.

I have made a smaller correction for countries at low ability levels - in such countries pupils in school do not learn much.
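Applied numerically, and purely as a sketch (the scores and the function name below are my own invention, not Rindermann's), the rule of thumb works like this:

```python
def corrected_sasq(observed_sasq, attendance_pct):
    """Rindermann's rule of thumb: subtract 1.5 SASQ points
    (0.225 IQ points) per percentage point of 15-year-olds
    not attending school."""
    not_attending = 100.0 - attendance_pct
    return observed_sasq - 1.5 * not_attending

# Hypothetical country: observed mean of 480 with only 80% attendance
print(corrected_sasq(480, 80))  # 480 - 1.5 * 20 = 450.0
```

A country with full attendance keeps its observed score; one missing a fifth of its cohort loses 30 SASQ points.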

Not bad for a few hours of internet time.

A few hours later, Steve Sailer had further and better particulars on the results:

So, where does this leave us with the PISA results? First, it gives me a chance to quote myself, one of the consolations of a lonely blogger: “Nobody gets round sampling theory, not even the Spanish Inquisition.” 

Second, and arising from the quote, the consequence is that the PISA results are only generalizable if the sample is a fair selection of the relevant group. In my view, to understand the abilities of a nation, the relevant group should be the entire age cohort. If many 15 year olds have already left school then a school sample will always be a partial indicator of a nation, and will very probably flatter it. This is because weaker students find school frustrating and leave, whereas the brighter ones enjoy studying, understand its long term benefits, and stay in education as long as they can. Further, if teachers ensure that even among those still at school the weaker students fall discreetly ill on the day of testing, then the results can be massaged upwards. Spotting weaker students is easy for teachers: they can judge it quickly from student questions, and more accurately by marking their class test papers.
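The size of this flattering effect can be illustrated with a quick simulation (my own sketch, using an invented N(500, 100) score distribution, not actual PISA figures):

```python
import random
import statistics

random.seed(1)

# Hypothetical national cohort: scores drawn from N(500, 100)
cohort = [random.gauss(500, 100) for _ in range(100_000)]

# Suppose the weakest 20% have already left school and are never tested
cohort.sort()
tested = cohort[20_000:]

# The school-based mean flatters the national mean
print(round(statistics.mean(cohort)))  # close to the true mean of 500
print(round(statistics.mean(tested)))  # roughly 35 points higher
```

The 35-point gap matches the theory for a truncated normal: dropping the bottom fifth of an N(500, 100) distribution raises the mean of the remainder by about 0.35 standard deviations.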

Third, I do not want to reject PISA results, because local examination results share many of the same problems. In any nation where some teenagers leave school early the local examination results will be better than the actual national average. Equally, if within a school cohort not everyone takes the same national examination, the same flattering distortion takes place.

Fourth and finally, I think it best to study PISA results once they have been corrected to account for incomplete age cohorts in the Rindermann fashion, or in some elaboration and refinement of that technique.  Absent that, they have a large error term and present too rosy a picture of national scholastic attainments.


  1. ...usually, but not always, given in PISA reports, usually somewhere at the end

p. 400 in the current report.

    Anyhow, this is very useful data. Overall, it will further accentuate the gap between the First World and the developing one - in practice, pretty much all of Europe (inc. the ex-USSR) is at around 90%; whereas the lower IQ nations (and China and Vietnam) are substantially below 80%.

    I have made a smaller correction for countries at low ability levels - in such countries pupils in school do not learn much.

    I just had this idea.

    In Hive Mind, I recall Garrett Jones suggesting that high concentrations of high IQs can drive even higher IQs, at least functionally (I suppose an analogy can be made with an individual fish in a school getting propelled forwards not so much by its own efforts but by the water motion generated by the other fish all around it). You certainly see this sort of thing in many elite cognitive domains - for instance, Anatoly Karpov played his best chess not when he was dominant, but when he was trying to win the world championship back from Garry Kasparov.

    Could this partially or even wholly cancel out the exclusion effect?

    1. Anatoly, thanks for your comment. I think that competition can spur on greater achievements if there is a critical mass of other thinkers. In chess there is a game tradition, and grandmasters to assist with preparation. Same at Los Alamos, where the young Richard Feynman was on hand to critique the old masters of nuclear physics. So, it is always worth getting the best minds together to work on the hardest problems.

  2. My first and only experience of a teacher cheating about an exam was as a fresher. In his last lecture of the year, a mathematician said that he wanted to revise a topic with us. He then covered something that he hadn't taught us. That very topic cropped up in the examination later.

    I quizzed my classmates to find out who had seen through the trick. Most had not seen the thrust, but a sizeable minority had recognised it as a trick that their own schoolteachers had played. Sadly, I must inform you that most of the people familiar with the cheating came from the more famous schools.

  3. A lower standard can also really distort the outcome. In high school I was just an average student at maths, on relatively easy test questions. But I became a prize winner at a maths competition, where the questions were so hard that most high-school teachers struggled to solve them.

    When the standard is low, the scoring difference is more about diligence and prudence than mental ability.

    Blaming it all on cheating is really a cheap shot, or sour grapes.



  5. Much has been said about cheating in high school. Interestingly, the OECD TALIS project did have some data on cheating in schools, i.e.

    TALIS 2013 Results: An International Perspective on Teaching and Learning
    Table 2.20.Web School climate - Frequency of student-related factors

    Sample summary data

    Ncountry = 34; Ncheat = 2087; Nschool = 8645;
    Pct = percentage of principals handling cheating reports monthly, weekly or daily

    Pct Country
    80.20 Netherlands
    56.10 Latvia
    50.90 Italy
    50.90 Estonia
    41.80 UnitedStates
    38.00 Brazil
    37.10 France
    33.50 Spain
    30.30 Israel
    28.40 BelgiumFlanders
    27.80 Portugal
    26.60 Average
    23.60 Mexico
    23.40 Finland
    18.40 CanadaAlberta
    16.80 Sweden
    14.90 Denmark
    11.00 UAEAbuDhabi
    8.50 Australia
    1.50 UnitedKingdomEngland
    1.40 Singapore
    1.40 Korea
    0.80 Iceland
    0.60 Japan

  6. Re: school attendance rate

    This includes those who are home-schooled, who are often extremely bright.

    Re: PISA participation rate

    There were some students who participated but whose scores were not given, presumably because they were not considered. Some 2012 sample data I got for students in various grades:

    Country  | Pct (G9 G10 G11 G12)  | Math (G9 G10 G11 G12) | Sci (G9 G10 G11 G12) | Read (G9 G10 G11 G12)
    Finland  | 84.97 None 0.15 None  | 528 None None None    | 555 None None None   | 533 None None None
    Shanghai | 39.56 54.19 0.58 0.08 | 605 631 None None     | 575 593 None None    | 564 583 None None

    For some reasons the scores for G11 and G12 were not given.

    1. Re: school attendance rate

      ... exclude

      Re: PISA participation rate

      Compared to, for example, Israel, where scores for percentages less than 1% were given:

      Israel 17.14 81.75 0.79 None | 443 472 461 None | 452 474 464 None | 463 491 462 None

  7. This morning I woke up with a message in my head: "there is no second curve". I did not recall any bad dream. Yesterday most of my free time went on wrangling the TALIS data, but that did not seem to be it.

    Then I realized that my subconscious was telling me to look at the above graph again. It was about the truncated normal curve. If the left-hand portion is truncated, this shifts the population mean to the right. However, the second distribution curve is imaginary when a second normal curve is forced onto the truncated data. The real curve is still the original truncated normal curve, and the areas (values) between any intervals are still better estimated from that original curve. The graph above might give the impression that the whole population distribution has moved. What is normally shown is a second vertical line indicating the new (or related) value, without the second imaginary curve, e.g.

    and the accompanying corrections, if the normality assumption is still valid.

    If there are worries about a truncated distribution for some country, it might be better to use the smart fraction cutoff value (e.g. 606.99 for maths level-5 proficiency), which is fixed, determined from the global sample, and over which any particular country's values have little influence. It is also easier to comprehend the change, since everything is in the same unit (percentages), whereas with the mean we have to juggle percentages and quantities. For example, if 10% of the left tail is truncated, the smart fraction simply increases by the ratio 1/0.9 = 1.11. Simple.

    1. Agree that using set proficiency levels to find the smart fraction is a very useful check on the implied mean of the population. Thanks for this suggestion, which to my mind gets round the truncated normal curve issue.
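    The commenter's 1/0.9 arithmetic can be checked directly. A sketch only: the N(500, 100) population is invented, the 606.99 cutoff is the maths level-5 threshold mentioned above, and I assume every excluded pupil scores below the cutoff.

```python
from statistics import NormalDist

nd = NormalDist(mu=500, sigma=100)   # hypothetical national distribution
cutoff = 606.99                      # fixed global level-5 maths threshold

# True smart fraction: share of the full cohort above the cutoff
true_frac = 1 - nd.cdf(cutoff)

# If the bottom 10% never sit the test, and all of them fall below the
# cutoff, the observed fraction among test-takers is inflated by 1/0.9
observed_frac = true_frac / 0.9

print(round(observed_frac / true_frac, 2))  # 1.11
```

    The inflation ratio depends only on the truncated share, not on the country's mean, which is what makes the fixed-cutoff check so convenient.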

  8. It seems that the survey has fewer problems with the partners than with the OECD members. The process seems to be: selection of initial schools, replacements for schools that drop out, and student participation in the final selected schools. Schools can drop out in the first two stages, but students go missing only in the final stage.

    Which country, if any, is gaming the system?

    Table A2.3 Response rates (Page 297)
    country %InitSch %SchAftRepl %StuPartAftSchRepl
    DE 96 99 93
    DK 90 92 89
    EE 100 100 93
    FI 100 100 93
    FR 91 94 88
    UK 84 93 89
    US 67 83 90
    BSJG_CN 88 100 97
    HK 75 90 93
    JP 94 99 97
    KR 100 100 99
    RU 99 99 97
    SG 97 98 93
    TW 100 100 98