Predicting International Olympiad in Informatics results from contestant photos via face recognition

The 0.3-0.4 correlation between brain size and IQ is well known in the neuroscience community. I had also long realized that brain size could be very crudely approximated from photos, so given photo data labeled with IQ or some proxy for it, one would likely find a statistically significant correlation. After looking at the results of the International Olympiad in Informatics (which selects 4 high schoolers each year from each country) on the official site, which includes photos of the contestants, it occurred to me that I could do a data science project based on this.

I initially hesitated because of the controversy that such a project might generate, but after seeing that the face recognition service of this Chinese company even estimates “attractiveness”, which, like it or not, does objectively exist in a statistical sense, I decided it would be okay. Besides, two non-Chinese friends of mine had both shown me similar services of non-Chinese origin, both based on photos and face recognition: one was an ethnicity estimator and the other estimated attractiveness, IQ, etc.

I wrote code to scrape the data, along with code using perhaps the most popular open source face recognition library in Python and OpenCV to pick out the locations of the top of the head, the eyebrows, and the chin and then vertically align the image per a calibrated standard. Face recognition, and especially top-of-head recognition, by a relative computer vision noob like myself was imperfect, so I also made some manual adjustments. Bad photos were excluded, and photos with an excess of hair had their feature parameters manually tuned or, in some cases, were excluded as well.

It turned out that a very crude estimate of brain size correlated 0.09 with IOI percentile, expressed as a Z score, at a sample size of almost 1200. The p-value was less than 0.001 (0.1%), which is statistically significant. Though small, this correlation is much higher than the IQ correlation between unrelated children reared together, which is only 0.04 at adulthood. Midway through the project, I realized that eyes would very likely make a better predictor than eyebrows, given that there is quite some variation in eyebrow-to-eye distance across the human population. I would not be surprised if using eyes in place of eyebrows yielded a correlation higher than 0.15.
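
As a rough sanity check on the significance figure, the sketch below converts a correlation and a sample size into a p-value. It assumes the rounded values r = 0.09 and n = 1200 quoted above and uses a normal approximation to the t distribution, which should be reasonable at this sample size. The function name `correlation_p_values` is made up for illustration, and this is not the code used in the project.

```python
import math


def correlation_p_values(r, n):
    """Return (t, one_sided_p, two_sided_p) for a Pearson correlation r.

    Assumption: n is large enough that the t statistic can be treated
    as approximately standard normal.
    """
    # Standard t statistic for testing whether a correlation is zero.
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    # Tail probability in the direction of the observed effect.
    one_sided = 0.5 * math.erfc(abs(t) / math.sqrt(2.0))
    return t, one_sided, 2.0 * one_sided


# Rounded figures from the text; the unrounded values are not known here.
t, p_one, p_two = correlation_p_values(r=0.09, n=1200)
print(f"t = {t:.2f}, one-sided p = {p_one:.4f}, two-sided p = {p_two:.4f}")
```

With these rounded inputs the one-sided p-value comes out just under 0.001 and the two-sided one closer to 0.002, so the exact figure depends on which test is used and on the unrounded correlation.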

I won’t release the code and data publicly at this point (and might never do so), though they can be requested via email to gmachine1729 at ... I know or knew some IOI contestants personally and would also be interested in feedback from them on this. I will release a few graphs publicly here.

I told a biologist about this project too, and somewhat to my relief he was not surprised at all. He regards the genes, shared environment, and non-shared environment material as pretty basic knowledge in genetics research, and the correlation between brain size and intelligence as well known. To be fair, I have heard “big brains” used as a metaphor for smarts multiple times in America too. This is observed not only across people but also within each individual person: there is a reason why many things I found difficult at age 20 I now find pretty routine. Like Professor Robert Plomin, I also agree with the idea that most parents overestimate the effect of what they did for the education of their children.

