Polygenic scores are all the rage. Naturally, you want to know your own scores, and you have a reading of your genome from some company. Unfortunately, the company doesn’t provide you with interesting results. Fear not: you can download your ‘raw data’ (actually the genotypes from the SNP chip) and upload it to other sites for more results. Probably the best current site is impute.me, but another is dna.land. My genome has been public for years (since I am part of the openSNP sample), so below we will use me as the guinea pig. You can follow along using my ID: id_4Y843C033. The most annoying thing about impute.me is that after you upload your data, it takes some days for the results to be ready (they have to impute the data first, and apparently their pipeline is slow). It took 8 days for me.
First, let’s see if it can get my ethnicity/race/ancestry right. The site is not as good as 23andme/ancestry.com for this; it simply plots you in a 3D view of the principal components alongside the 1000 Genomes samples. A good start:
There are mouse-over names for the groups. E.g. the blue ones are Africans (and admixed Africans, thus the cline between Europeans and Africans).
1000 Genomes doesn’t have Scandinavians as such, but it has Utah Europeans, who are partly Scandinavian (15% according to self-report). I am in that cluster (compare with 23andme results in this post).
Second, we can try another easy one: height. The site actually uses the old Wood et al 2014 study, but maybe it is good enough.
Some of the pages provide centiles, others simply the scores. According to this, I am somewhat taller than average genetically speaking, which is perhaps true, as Danes are quite tall among Europeans. I am of pretty average height by Danish standards.
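When a page gives only a raw score, a centile can be approximated if one has a reference sample to standardize against. Here is a minimal Python sketch, assuming the scores are roughly normally distributed. The function names and the idea of supplying your own reference sample are mine, for illustration only; this is not something impute.me exposes.

```python
import math


def standardize(raw_score, reference_scores):
    """Convert a raw score to a z-score using a reference sample.

    reference_scores is a hypothetical list of raw scores from a
    comparable population (at least two values).
    """
    n = len(reference_scores)
    mean = sum(reference_scores) / n
    # Sample variance with n - 1 in the denominator.
    variance = sum((s - mean) ** 2 for s in reference_scores) / (n - 1)
    return (raw_score - mean) / math.sqrt(variance)


def z_to_centile(z):
    """Centile (0-100) of a z-score under a standard normal distribution."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

A z-score of 0 maps to the 50th centile, and +1 SD lands at roughly the 84th.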
Third, one can look at the big collection of complex traits, mostly diseases. Most of the predictors here will be terrible and of little interest for now. However, a few of them are worth a look. For instance, I have psoriasis, a skin disease:
So it gets it right! There is a half-decent GWAS for schizophrenia too (from 2014):
Phew! I actually have 2 relatives with schizophrenia. One grandmother, and one distant cousin of sorts.
Type 1 diabetes is easy to predict (very heritable, >90%, with most of the relevant variants in a few places), even with a crude method (they use this 2015 GWAS). I don’t have this, so we can check:
The trait is quite rare, about 1% of the population, so being in the 57th centile means being essentially risk free. We can also check type 2 diabetes, but the models for this are not very good, partly because it is not very heritable (20% or so).
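As an aside on the type 1 result: to see why a middling centile on a rare trait means little risk, one can use the standard liability threshold model. The sketch below is only an illustration, not what the site computes. In particular, r2 (the share of liability variance the score captures) is a made-up input, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def risk_at_centile(centile, prevalence, r2):
    """Probability of disease for someone at a given score centile.

    Liability threshold model: liability = sqrt(r2) * z + residual,
    with total variance 1. Disease occurs when liability exceeds the
    threshold that cuts off the top `prevalence` of the population.
    r2 is an ASSUMED value (0 <= r2 < 1), not taken from the site.
    """
    z = _STD_NORMAL.inv_cdf(centile / 100.0)
    threshold = _STD_NORMAL.inv_cdf(1.0 - prevalence)
    residual_sd = math.sqrt(1.0 - r2)
    return 1.0 - _STD_NORMAL.cdf((threshold - math.sqrt(r2) * z) / residual_sd)


# Example: 1% prevalence, with an assumed r2 of 0.3 purely for illustration.
for c in (57, 90, 99):
    print(c, round(risk_at_centile(c, prevalence=0.01, r2=0.3), 4))
```

Because risk is heavily skewed toward the top of the distribution, someone near the middle ends up below the population average risk under these assumptions, and most cases come from the upper tail.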
Uh oh! I don’t have this, but I could get it later… I don’t have any relatives with this, as far as I know.
Fourth, for the juicy parts, what about intelligence? The site does not appear to have this trait in its list. But wait! There is a secret, unlisted page for this trait. It uses the Savage et al 2017 predictor (n=270k). This is a pure GWAS for intelligence, unlike Lee et al 2018 (n=1.1M), which used educational attainment. The latter yields a better predictor of intelligence, though one that is probably more biased by familial confounding.
It doesn’t provide centiles, but at least my score is above average. Yay! The score does not appear to be standardized, but this looks like roughly +1 SD to me (my brother gets 2.4, father 2.2, mother 2.0; I think the true order is E>B>M>F, so the correlation is not entirely terrible). Lasse Folkersen (who is also Danish) explained in 2019 why the page is hidden, posting as “lasse2” on Reddit, on r/SlateStarCodex no less:
This is exactly the reason why the intelligence module is unlisted: people can’t handle it if they get a low score because they somehow take it more personal than a real IQ test. They are wrong. Like with hair and height, and all manner of other things measurable – go look in a mirror if you wanna know it. Take an IQ score test for example. If it’s high, then good for you. That’s the real measure. Of IQ. Which isn’t even the same as how good and smart person you are, but rather a specific measures of problem solving skills.
It even explicitly says this (/u/nutnate). These scores are the sum of the all known genetics. 1) genetics don’t explain all of IQ variation, obviously, and 2) we don’t know all genetics, also obviously. 7% has been quoted, and yet you somehow demand it to be exactly on point. How could it ever be? But, /u/gwern – I’m not saying you are not intelligent for not making that deduction; other smart people have made the same mistake, for example Carl Zimmer in this opinion piece last year. It is not a more personal IQ score. He too misunderstands that.
On the finer points raised: no the score is not switched (/u/4QHURikzXS). No we did not “forget” some SNPs (/u/gwern) ; rather it is a very open debate how exactly to do this: either one takes the top-SNPs 564 reported, or else one takes all independent SNPs in the genome including those with lower weights. The first is what you are looking at currently, it’s probably more robust against ethnicity effects (thanks for providing that discussion link /u/priscillajansen). The second is something being worked on with the PRS module. You can see all modules in the GitHub repository (/u/okatuska) – but yes it will melt your computer /u/misanthropokemon, because the imputation algorithms needed to get all SNPs for the PRS are that computationally intensive, /u/brberg and /u/The_Dar.
And the overall point – that I really really hope people will take away from this: genetics will never ever be “ready” to predict something, if that something is something we already know and can measure. It will always be better to just measure it. Anything else is determinism, with no basis in real evidence. That still leaves plenty of room for usability, because for example of prediction in all the things we don’t know yet. Like future disease states.
Finally, thank you to /u/werttrew for posting.
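For the curious, the ‘top SNPs’ approach described in the quote boils down to a weighted sum: for each reported SNP, count how many copies of the effect allele you carry and multiply by the GWAS weight. Below is a minimal Python sketch of that idea. The data layout, function name and SNP identifiers are made up for illustration; this is not the actual impute.me code, which lives in their GitHub repository.

```python
def polygenic_score(genotypes, weights):
    """Sum of effect-allele counts times per-SNP weights.

    genotypes: dict mapping SNP id -> genotype string such as "AG"
    weights:   dict mapping SNP id -> (effect_allele, beta)
    Returns the score and the number of SNPs actually used, since
    SNPs missing from the genotype file are skipped.
    """
    score = 0.0
    used = 0
    for snp_id, (effect_allele, beta) in weights.items():
        genotype = genotypes.get(snp_id)
        if genotype is None:
            continue  # not genotyped or imputed; skip it
        dosage = genotype.count(effect_allele)  # 0, 1 or 2 copies
        score += beta * dosage
        used += 1
    return score, used


# Hypothetical SNP ids and weights, for illustration only.
my_genotypes = {"snp_a": "AG", "snp_b": "TT"}
gwas_weights = {
    "snp_a": ("A", 0.5),
    "snp_b": ("T", 0.25),
    "snp_c": ("C", 1.0),  # absent from my_genotypes, so skipped
}
print(polygenic_score(my_genotypes, gwas_weights))
```

The alternative mentioned in the quote, using all independent SNPs including those with lower weights, is the same sum over a much longer list, which is why it needs the imputed genome.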
That completes the tour of the results worth looking at, though the site also provides some results on mutations, athletics, and curiously incorrect predictions on coloration (hair, eyes).