Counting significant hits was always a dumb way to measure progress in genomic prediction of a trait. Animal and plant breeders never bothered with this approach; they used ridge regression for best predictive power (a “two cultures” problem, no doubt). Researchers in human genetics are starting to catch up, implementing a clever elastic net (Enet) approach for array data files (Qian et al 2019, called snpnet, based on glmnet). We are still waiting for this method to be widely used. It is also possible to do Enet based on summary statistics (Mak et al 2017, called lassosum), but again, not many have done it yet.
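To make the point about counting hits concrete, here is a minimal simulation sketch. It is not ridge regression or Enet: it only uses one-variant-at-a-time estimates, and compares a score built from the significant hits alone with a score built from all variants. Everything in it is an assumption made for illustration: unlinked SNPs with allele frequency 0.5, normally distributed effects, a heritability of roughly 0.5, a t-statistic cut-off of 4 standing in for "significant", and all function names.

```python
import math
import random

def simulate(n_people, effects, noise_sd, rng):
    """Simulate unlinked SNPs (allele frequency 0.5) and an additive trait."""
    genotypes, traits = [], []
    for _ in range(n_people):
        # Allele count 0/1/2 at each SNP.
        person = [int(rng.random() < 0.5) + int(rng.random() < 0.5) for _ in effects]
        genetic_value = sum(b * g for b, g in zip(effects, person))
        genotypes.append(person)
        traits.append(genetic_value + rng.gauss(0, noise_sd))
    return genotypes, traits

def correlation(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def single_variant_gwas(genotypes, traits):
    """Regress the trait on each SNP separately; return (slope, t statistic) per SNP."""
    n = len(traits)
    mean_y = sum(traits) / n
    syy = sum((y - mean_y) ** 2 for y in traits)
    results = []
    for j in range(len(genotypes[0])):
        column = [person[j] for person in genotypes]
        mean_g = sum(column) / n
        sgg = sum((g - mean_g) ** 2 for g in column)
        sgy = sum((g - mean_g) * (y - mean_y) for g, y in zip(column, traits))
        slope = sgy / sgg
        r = sgy / math.sqrt(sgg * syy)
        t_stat = r * math.sqrt((n - 2) / (1 - r * r))
        results.append((slope, t_stat))
    return results

def score_r2(weights, genotypes, traits):
    """Variance in the trait explained by the weighted sum of allele counts."""
    scores = [sum(w * g for w, g in zip(weights, person)) for person in genotypes]
    return correlation(scores, traits) ** 2

rng = random.Random(0)
effects = [rng.gauss(0, 1) for _ in range(200)]      # 200 causal SNPs (assumed)
train_g, train_y = simulate(500, effects, 10.0, rng)  # "GWAS" sample
test_g, test_y = simulate(1000, effects, 10.0, rng)   # independent test sample

gwas = single_variant_gwas(train_g, train_y)
T_THRESHOLD = 4.0  # arbitrary stand-in for a significance cut-off
all_weights = [slope for slope, _ in gwas]
hit_weights = [slope if abs(t) > T_THRESHOLD else 0.0 for slope, t in gwas]

n_hits = sum(1 for w in hit_weights if w != 0.0)
r2_hits = score_r2(hit_weights, test_g, test_y)
r2_all = score_r2(all_weights, test_g, test_y)
print(f"hits: {n_hits}, R2 from hits only: {r2_hits:.3f}, R2 from all SNPs: {r2_all:.3f}")
```

In a setup like this, the number of hits says little about how well the trait can be predicted, because most of the signal sits in variants that never reach significance. Real genomes have linkage between variants, which is where ridge and Enet earn their keep and which this sketch ignores.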
That being said, we still see a lot of progress owing to larger datasets and some improvements in using the output from the ‘single-variant-at-a-time’ regression used in regular GWASs. A brief summary follows. I focus on the TEDS sample (a big UK twin sample with good DNA and cognitive testing) because it is the largest dataset with great cognitive testing that is not used to train GWASs. Someone could use the new subset of UK Biobank with improved cognitive testing to replicate the results below (Cox et al 2019, n=29k).
Davies, G., Marioni, R. E., Liewald, D. C., Hill, W. D., Hagenaars, S. P., Harris, S. E., … & Cullen, B. (2016). Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N= 112 151). Molecular psychiatry, 21(6), 758.
- Polygenic score analyses indicate that up to 5% of the variance in cognitive test scores can be predicted in an independent cohort.
Selzam, S., Krapohl, E., von Stumm, S., O’Reilly, P. F., Rimfeld, K., Kovas, Y., … & Plomin, R. (2017). Predicting educational achievement from DNA. Molecular psychiatry, 22(2), 267.
- We found that EduYears GPS explained greater amounts of variance in educational achievement over time, up to 9% at age 16, accounting for 15% of the heritable variance. This is the strongest GPS prediction to date for quantitative behavioral traits.
- Not quite intelligence, but closer to intelligence (g) than to educational attainment.
Krapohl, E., Patel, H., Newhouse, S., Curtis, C. J., von Stumm, S., Dale, P. S., … & Plomin, R. (2018). Multi-polygenic score approach to trait prediction. Molecular psychiatry, 23(5), 1368.
- The MPS approach predicted 10.9% variance in educational achievement, 4.8% in general cognitive ability and 5.4% in BMI in an independent test set, predicting 1.1%, 1.1%, and 1.6% more variance than the best single-score predictions.
Allegrini, A. G., Selzam, S., Rimfeld, K., von Stumm, S., Pingault, J. B., & Plomin, R. (2019). Genomic prediction of cognitive traits in childhood and adolescence. Molecular psychiatry, 24(6), 819.
- In a representative UK sample of 7,026 children at ages 12 and 16, we show that we can now predict up to 11% of the variance in intelligence and 16% in educational achievement.
- As above, educational achievement.
As it so happens, there is one paper per year, which lets one see roughly four years of progress.
An important caveat to the above: these predictions are not done within sibling pairs. When they are (Selzam et al 2019), the validity is reduced by ~50%. This indicates some kind of training problem with the GWASs, which either train on family-related variance, detect population structure and use that, or do something more complicated.
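A toy simulation shows how a score can look better between unrelated people than within sibling pairs. This is a sketch under stated assumptions, not a reconstruction of the Selzam et al analysis: the score is standardized with half of its variance between families, the trait depends on the person's own score (`direct_effect`) and on something that tracks the family-average score (`family_effect`, a made-up stand-in for family-related variance or population structure), and the two values are picked so that the slope halves within families.

```python
import random

def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_sibling_pairs(n_families, direct_effect, family_effect, rng):
    """Return a list of sibling pairs, each sibling being a (score, trait) tuple."""
    pairs = []
    for _ in range(n_families):
        # Assumption: half of the score variance lies between families.
        family_mean = rng.gauss(0, 0.5 ** 0.5)
        pair = []
        for _ in range(2):
            score = family_mean + rng.gauss(0, 0.5 ** 0.5)
            # family_effect acts through the family average, so it is
            # identical for both siblings and cancels in their difference.
            trait = direct_effect * score + family_effect * family_mean + rng.gauss(0, 1)
            pair.append((score, trait))
        pairs.append(pair)
    return pairs

rng = random.Random(1)
pairs = simulate_sibling_pairs(20000, direct_effect=0.2, family_effect=0.4, rng=rng)

# Between unrelated people: use one sibling per family.
population_slope = slope([p[0][0] for p in pairs], [p[0][1] for p in pairs])

# Within families: regress the sibling difference in trait on the difference in score.
within_slope = slope(
    [p[0][0] - p[1][0] for p in pairs],
    [p[0][1] - p[1][1] for p in pairs],
)

print(f"population slope: {population_slope:.2f}, within-family slope: {within_slope:.2f}")
```

With these parameters the expected population slope is 0.2 + 0.4 × 0.5 = 0.4 and the expected within-family slope is 0.2, so the within-family estimate comes out at about half. The sketch cannot say which of the explanations above is the right one; it only shows that anything shared by siblings and correlated with the score inflates the between-family estimate.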