{"id":4648,"date":"2015-01-19T22:06:40","date_gmt":"2015-01-19T21:06:40","guid":{"rendered":"http:\/\/emilkirkegaard.dk\/en\/?p=4648"},"modified":"2015-01-19T22:06:40","modified_gmt":"2015-01-19T21:06:40","slug":"admixture-in-the-americas-admixture-among-us-blacks-and-hispanics-and-academic-achievement","status":"publish","type":"post","link":"https:\/\/emilkirkegaard.dk\/en\/2015\/01\/admixture-in-the-americas-admixture-among-us-blacks-and-hispanics-and-academic-achievement\/","title":{"rendered":"Admixture in the Americas: Admixture among US Blacks and Hispanics and academic achievement"},"content":{"rendered":"<p>Some time ago a new paper came out from the 23andme people reporting admixture among US ethnoracial groups (Bryc et al, 2014). Per our still on-going admixture project (current draft here), one could see if admixture predicts academic achievement (or IQ, if such were available). We (that is, John did) put together achievement data (reading and math scores) from the <a href=\"https:\/\/en.wikipedia.org\/wiki\/National_Assessment_of_Educational_Progress\">NAEP<\/a> and the admixture data <a href=\"https:\/\/docs.google.com\/spreadsheets\/d\/1NwIc9hXIVhvu08hFovjqgaa8KUT7plKJqCcGiReZ56Q\/edit#gid=136840517\">here<\/a>.<\/p>\n<p><strong>Descriptive stats<\/strong><\/p>\n<p>Admixture studies do not work well if there is no or little variation within groups. So let&#8217;s first examine them. For blacks:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">                      vars  n mean   sd median trimmed  mad  min  max range  skew kurtosis   se\r\nBlackAfricanAncestry     1 31 0.74 0.04   0.74    0.74 0.03 0.64 0.83  0.19 -0.03    -0.38 0.01\r\nBlackEuropeanAncestry    1 31 0.23 0.04   0.24    0.23 0.03 0.15 0.34  0.19  0.09    -0.30 0.01<\/pre>\n<p>&nbsp;<\/p>\n<p>So we see that there is little American admixture in Blacks because the African and European add up to close to 100 (23+74=97). 
In fact, the correlation between African and European ancestry in Blacks is -.99. This also means that multiple regression is useless because of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Multicollinearity\">collinearity<\/a>.<\/p>\n<p>The White admixture data are also not very useful. Whites are almost exclusively European:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">                      vars  n mean sd median trimmed mad  min max range  skew kurtosis se\r\nWhiteEuropeanAncestry    1 51 0.99  0   0.99    0.99   0 0.98   1  0.02 -0.95     0.74  0<\/pre>\n<p>What about Hispanics (some sources call them Latinos)?<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">                       vars  n mean   sd median trimmed  mad  min  max range skew kurtosis   se\r\nLatinoEuropeanAncestry    1 34 0.73 0.07   0.72    0.73 0.05 0.57 0.90  0.33 0.34     0.22 0.01\r\nLatinoAfricanAncestry     1 34 0.09 0.05   0.08    0.08 0.06 0.01 0.22  0.21 0.51    -0.69 0.01\r\nLatinoAmericanAncestry    1 34 0.10 0.05   0.09    0.10 0.03 0.04 0.21  0.17 0.80    -0.47 0.01<\/pre>\n<p>Hispanics are fairly admixed. Overall, they are mostly European, but the range of African and American ancestry is quite wide. Furthermore, due to the three-way variation, multiple regression should work. The ancestry intercorrelations are: -.42 (Afro x Amer), -.21 (Afro x Euro), -.50 (Amer x Euro). There must also be another source because 73+9+10 is only 92%. 
Where&#8217;s the last 8% admixture from?<\/p>\n<p><strong>Admixture x academic achievement correlations: Blacks<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<td id=\"origin\"><\/td>\n<th>row.names<\/th>\n<th>BlackAfricanAncestry<\/th>\n<th>BlackAmericanAncestry<\/th>\n<th>BlackEuropeanAncestry<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"rn\">1<\/td>\n<td>Math2013B<\/td>\n<td>-0.32<\/td>\n<td>0.09<\/td>\n<td>0.29<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">2<\/td>\n<td>Math2011B<\/td>\n<td>-0.27<\/td>\n<td>0.21<\/td>\n<td>0.25<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">3<\/td>\n<td>Math2009B<\/td>\n<td>-0.30<\/td>\n<td>0.09<\/td>\n<td>0.28<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">4<\/td>\n<td>Math2007B<\/td>\n<td>-0.12<\/td>\n<td>0.27<\/td>\n<td>0.08<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">5<\/td>\n<td>Math2005B<\/td>\n<td>-0.28<\/td>\n<td>0.26<\/td>\n<td>0.23<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">6<\/td>\n<td>Math2003B<\/td>\n<td>-0.30<\/td>\n<td>0.15<\/td>\n<td>0.26<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">7<\/td>\n<td>Math2000B<\/td>\n<td>-0.36<\/td>\n<td>-0.08<\/td>\n<td>0.34<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">8<\/td>\n<td>Read2013B<\/td>\n<td>-0.25<\/td>\n<td>0.14<\/td>\n<td>0.22<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">9<\/td>\n<td>Read2011B<\/td>\n<td>-0.33<\/td>\n<td>0.22<\/td>\n<td>0.30<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">10<\/td>\n<td>Read2009B<\/td>\n<td>-0.40<\/td>\n<td>-0.03<\/td>\n<td>0.41<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">11<\/td>\n<td>Read2007B<\/td>\n<td>-0.26<\/td>\n<td>0.14<\/td>\n<td>0.24<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">12<\/td>\n<td>Read2005B<\/td>\n<td>-0.43<\/td>\n<td>0.33<\/td>\n<td>0.39<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">13<\/td>\n<td>Read2003B<\/td>\n<td>-0.42<\/td>\n<td>0.09<\/td>\n<td>0.38<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">14<\/td>\n<td>Read2002B<\/td>\n<td>-0.30<\/td>\n<td>-0.10<\/td>\n<td>0.27<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p>Summarizing these results:<\/p>\n<pre id=\"rstudio_console_output\" 
class=\"GEWYW5YBFEB\" tabindex=\"0\">     vars  n  mean   sd median trimmed  mad   min   max range  skew kurtosis   se\r\nAfro    1 14 -0.31 0.08  -0.30   -0.32 0.05 -0.43 -0.12  0.31  0.48     0.10 0.02\r\nAmer    1 14  0.13 0.13   0.14    0.13 0.11 -0.10  0.33  0.43 -0.32    -1.07 0.03\r\nEuro    1 14  0.28 0.08   0.28    0.29 0.06  0.08  0.41  0.33 -0.49     0.11 0.02<\/pre>\n<p>So we see the expected directions and order: for Blacks (who are mostly African), American admixture is positive and European is more positive. There is quite a bit of variation over the years. It is possible that this mostly reflects &#8216;noise&#8217;, e.g. changes in educational policies in the states, or just sampling error. It is also possible that the changes are due to admixture changes within states over time.<\/p>\n<p><strong>Admixture x academic achievement correlations: Hispanics<\/strong><\/p>\n<table>\n<thead>\n<tr>\n<td id=\"origin\"><\/td>\n<th>row.names<\/th>\n<th>LatinoAfricanAncestry<\/th>\n<th>LatinoAmericanAncestry<\/th>\n<th>LatinoEuropeanAncestry<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td class=\"rn\">1<\/td>\n<td>Math13H<\/td>\n<td>0.20<\/td>\n<td>-0.13<\/td>\n<td>-0.10<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">2<\/td>\n<td>Math11H<\/td>\n<td>0.27<\/td>\n<td>0.02<\/td>\n<td>-0.02<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">3<\/td>\n<td>Math09H<\/td>\n<td>0.29<\/td>\n<td>-0.32<\/td>\n<td>0.04<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">4<\/td>\n<td>Math07H<\/td>\n<td>0.36<\/td>\n<td>-0.14<\/td>\n<td>-0.01<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">5<\/td>\n<td>Math05H<\/td>\n<td>0.38<\/td>\n<td>-0.08<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">6<\/td>\n<td>Math03H<\/td>\n<td>0.37<\/td>\n<td>-0.23<\/td>\n<td>-0.08<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">7<\/td>\n<td>Math00H<\/td>\n<td>0.30<\/td>\n<td>-0.09<\/td>\n<td>-0.05<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">8<\/td>\n<td>Read2013H<\/td>\n<td>0.18<\/td>\n<td>-0.44<\/td>\n<td>0.33<\/td>\n<\/tr>\n<tr>\n<td 
class=\"rn\">9<\/td>\n<td>Read2011H<\/td>\n<td>0.21<\/td>\n<td>-0.26<\/td>\n<td>0.33<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">10<\/td>\n<td>Read2009H<\/td>\n<td>0.19<\/td>\n<td>-0.44<\/td>\n<td>0.33<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">11<\/td>\n<td>Read2007H<\/td>\n<td>0.13<\/td>\n<td>-0.32<\/td>\n<td>0.23<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">12<\/td>\n<td>Read2005H<\/td>\n<td>0.38<\/td>\n<td>-0.30<\/td>\n<td>0.23<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">13<\/td>\n<td>Read2003H<\/td>\n<td>0.32<\/td>\n<td>-0.34<\/td>\n<td>0.18<\/td>\n<\/tr>\n<tr>\n<td class=\"rn\">14<\/td>\n<td>Read2002H<\/td>\n<td>0.24<\/td>\n<td>-0.23<\/td>\n<td>0.08<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>And summarizing:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">     vars  n  mean   sd median trimmed  mad   min  max range  skew kurtosis   se\r\nAfro    1 14  0.27 0.08   0.28    0.28 0.12  0.13 0.38  0.25 -0.10    -1.49 0.02\r\nAmer    1 14 -0.24 0.14  -0.24   -0.24 0.15 -0.44 0.02  0.46  0.17    -1.13 0.04\r\nEuro    1 14  0.11 0.16   0.06    0.11 0.19 -0.10 0.33  0.43  0.23    -1.68 0.04<\/pre>\n<p>We do <strong>not<\/strong> see the expected results per the genetic model. Among Hispanics, who are on average 73% European, African admixture has a <em>positive<\/em> relationship to academic achievement. American admixture is negatively correlated and European positively, but more weakly than African. The only thing that&#8217;s in line with the genetic model is that European is positive. On the other hand, the results are not in line with a null model either, because then we would expect the results to fluctuate around 0.<\/p>\n<p>Note that the European admixture numbers are clearly positive only for the reading tests. The reading tests are presumably those most affected by language bias (many Hispanics speak Spanish as a first language). 
If anything, the math results are worse for the genetic model.<\/p>\n<p><strong>General achievement factors<\/strong><\/p>\n<p>We can eliminate some of the noise in the data by extracting a general achievement factor for each group. I do this by first removing the cases with no data at all, and then imputing the remaining missing values.<\/p>\n<p>Then we compute the correlations as before. These should be fairly close to the means above:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\"> LatinoAfricanAncestry LatinoAmericanAncestry LatinoEuropeanAncestry \r\n                  0.28                  -0.36                   0.22<\/pre>\n<p>The European result is stronger with the general factor from the imputed dataset, but the order is the same.<\/p>\n<p>We can do the same for the Black data to see if the imputation+factor analysis screws up the results:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\"> BlackAfricanAncestry BlackAmericanAncestry BlackEuropeanAncestry \r\n                -0.35                  0.20                  0.31<\/pre>\n<p>These results are similar to before (-.31, .13, .28), with the American result somewhat stronger.<\/p>\n<p><strong>Plotting<\/strong><\/p>\n<p>Perhaps if we plot the results, we can figure out what is going on. We can plot either the general achievement factor, or specific results. 
Let&#8217;s do both:<\/p>\n<p><em>Reading2013 plots<\/em><\/p>\n<p><a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_afro_read13.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-4655\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_afro_read13-300x185.png\" alt=\"hispanic_afro_read13\" width=\"300\" height=\"185\" \/><\/a> <a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_amer_read13.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-4656 size-medium\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_amer_read13-300x185.png\" alt=\"hispanic_amer_read13\" width=\"300\" height=\"185\" \/><\/a> <a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_euro_read13.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-4657 size-medium\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_euro_read13-300x185.png\" alt=\"hispanic_euro_read13\" width=\"300\" height=\"185\" \/><\/a><\/p>\n<p><em>Math2013 plots<\/em><\/p>\n<p><a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_afro_math13.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-4658\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_afro_math13-300x185.png\" alt=\"hispanic_afro_math13\" width=\"300\" height=\"185\" \/><\/a> <a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_amer_math13.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-4659\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_amer_math13-300x185.png\" alt=\"hispanic_amer_math13\" width=\"300\" height=\"185\" \/><\/a> <a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_euro_math13.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-4660\" 
src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_euro_math13-300x185.png\" alt=\"hispanic_euro_math13\" width=\"300\" height=\"185\" \/><\/a><\/p>\n<p><em>General factor plots<\/em><\/p>\n<p><a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_afro_general.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-4661\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_afro_general-300x185.png\" alt=\"hispanic_afro_general\" width=\"300\" height=\"185\" \/><\/a> <a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_amer_general.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-4662\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_amer_general-300x185.png\" alt=\"hispanic_amer_general\" width=\"300\" height=\"185\" \/><\/a> <a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_euro_general.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-4663\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_euro_general-300x185.png\" alt=\"hispanic_euro_general\" width=\"300\" height=\"185\" \/><\/a><\/p>\n<p>These did not help me understand the pattern. Maybe they make more sense to someone who understands US demographics and history better.<\/p>\n<p><strong>Multiple regression<\/strong><\/p>\n<p>As mentioned above, the Black data should be mostly useless for multiple regression due to high collinearity. But the Hispanic data should be better. I ran models using two of the three ancestry estimates at a time since one cannot use all three (I think).<\/p>\n<p>Generally, the independent variables did not reach significance. 
Using the general achievement factor as the dependent variable, the standardized betas are:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">LatinoAfricanAncestry LatinoAmericanAncestry\r\n             0.1526765             -0.2910413<\/pre>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">LatinoAfricanAncestry LatinoEuropeanAncestry\r\n             0.3363636              0.2931108<\/pre>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">LatinoAmericanAncestry LatinoEuropeanAncestry\r\n           -0.32474678             0.06224425<\/pre>\n<p>The first is relative to European, the second to American, and the third to African. The results are not even consistent with each other. In the first, African&gt;European. In the third, European&gt;African. All results show that Others&gt;American, though.<\/p>\n<p><strong>The remainder<\/strong><\/p>\n<p>There is something odd about the data: the ancestry estimates don&#8217;t sum to 1. I calculated the sum of the ancestry estimates, and then subtracted that from 1. 
Here are the results:<\/p>\n<p><a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/black_remainder.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-4664\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/black_remainder-300x185.png\" alt=\"black_remainder\" width=\"300\" height=\"185\" \/><\/a> <a href=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_remainder.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-4665\" src=\"http:\/\/emilkirkegaard.dk\/en\/wp-content\/uploads\/hispanic_remainder-300x185.png\" alt=\"hispanic_remainder\" width=\"300\" height=\"185\" \/><\/a><\/p>\n<p>To these we can add simple descriptive stats:<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">                        vars  n mean   sd median trimmed  mad  min  max range skew kurtosis   se\r\nBlackRemainderAncestry     1 31 0.02 0.00   0.02    0.02 0.00 0.01 0.03  0.02 1.35     1.18 0.00\r\nLatinoRemainderAncestry    1 34 0.08 0.05   0.07    0.07 0.03 0.02 0.34  0.32 3.13    12.78 0.01<\/pre>\n<p>&nbsp;<\/p>\n<p>So we see that there is a sizable remainder proportion for Hispanics and a small one for Blacks. Presumably, the large outlier of Hawaii is Asian admixture from Japanese, Chinese, Filipino and Native Hawaiian clusters. At least, these are the largest groups <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hawaii#Demographics\">according to Wikipedia<\/a>. For Blacks, the remainder is presumably Asian admixture as well.<\/p>\n<p>Do these remainders correlate with academic achievement? For Blacks, r = .39 (p = .03), and for Hispanics r = -.24 (p = .18). 
So for Blacks the direction is as expected and the correlation is stronger; for Hispanics, it is in the expected direction but weaker.<\/p>\n<p><strong>Partial correlations<\/strong><\/p>\n<p>What about partialing out the remainders?<\/p>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">LatinoAfricanAncestry LatinoAmericanAncestry LatinoEuropeanAncestry\r\n            0.21881404            -0.33114612             0.09329413<\/pre>\n<pre id=\"rstudio_console_output\" class=\"GEWYW5YBFEB\" tabindex=\"0\">BlackAfricanAncestry BlackAmericanAncestry BlackEuropeanAncestry\r\n           -0.2256171             0.1189219             0.2185139<\/pre>\n<p>&nbsp;<\/p>\n<p>Not much has changed. The European correlation has become weaker for Hispanics. For Blacks, results are similar to before.<\/p>\n<p><strong>Proposed explanations?<\/strong><\/p>\n<p>The results for Blacks are in line with genetic models. The Hispanic results are not, but they aren&#8217;t in line with the null model either. Perhaps it has something to do with generational effects. Perhaps one could find the % of first-generation Hispanics by state and add it to the regression model, or control for it using partial correlations.<\/p>\n<p>Other ideas? 
Before calculating the results, John wrote:<\/p>\n<blockquote><p>Language, generation, and genetic assimilation are all confounded, so I thought it best to not look at them.<\/p><\/blockquote>\n<p>He may be right.<\/p>\n<p><strong>R code<\/strong><\/p>\n<pre>data = read.csv(\"BryceAdmixNAEP.tsv\", sep=\"\\t\",row.names=1)\r\nlibrary(car) # for vif\r\nlibrary(psych) # for describe\r\nlibrary(VIM) # for imputation\r\nlibrary(QuantPsyc) #for lm.beta\r\nlibrary(devtools) #for source_url\r\n#load mega functions\r\nsource_url(\"https:\/\/osf.io\/project\/zdcbq\/osfstorage\/files\/mega_functions.R\/?action=download\")\r\n\r\n#descriptive stats\r\n#blacks\r\nrbind(describe(data[\"BlackAfricanAncestry\"]),\r\ndescribe(data[\"BlackEuropeanAncestry\"]))\r\n#whites\r\ndescribe(data[\"WhiteEuropeanAncestry\"])\r\n#hispanics\r\nrbind(describe(data[\"LatinoEuropeanAncestry\"]),\r\n\u00a0\u00a0\u00a0\u00a0\u00a0 describe(data[\"LatinoAfricanAncestry\"]),\r\n\u00a0\u00a0\u00a0\u00a0\u00a0 describe(data[\"LatinoAmericanAncestry\"]))\r\n\r\n##Regressions\r\n#Blacks\r\nblack.model = \"Math2013B ~ BlackAfricanAncestry+BlackAmericanAncestry\"\r\nblack.model = \"Read2013B ~ BlackAfricanAncestry+BlackAmericanAncestry\"\r\nblack.model = \"Math2013B ~ BlackAfricanAncestry+BlackEuropeanAncestry\"\r\nblack.model = \"Read2013B ~ BlackAfricanAncestry+BlackEuropeanAncestry\"\r\nblack.fit = lm(black.model, data)\r\nsummary(black.fit)\r\n\r\n#Hispanics\r\nhispanic.model = \"Math2013H ~ LatinoAfricanAncestry+LatinoAmericanAncestry\"\r\nhispanic.model = \"Read2013H ~ LatinoAfricanAncestry+LatinoAmericanAncestry\"\r\nhispanic.model = \"Math2013H ~ LatinoAfricanAncestry+LatinoEuropeanAncestry\"\r\nhispanic.model = \"Read2013H ~ LatinoAfricanAncestry+LatinoEuropeanAncestry\"\r\nhispanic.model = \"hispanic.ach.factor ~ LatinoAfricanAncestry+LatinoAmericanAncestry\"\r\nhispanic.model = \"hispanic.ach.factor ~ LatinoAfricanAncestry+LatinoEuropeanAncestry\"\r\nhispanic.model = \"hispanic.ach.factor ~ 
LatinoAmericanAncestry+LatinoEuropeanAncestry\"\r\nhispanic.model = \"hispanic.ach.factor ~ LatinoAfricanAncestry+LatinoAmericanAncestry+LatinoEuropeanAncestry\"\r\nhispanic.fit = lm(hispanic.model, data)\r\nsummary(hispanic.fit)\r\nlm.beta(hispanic.fit)\r\n\r\n##Correlations\r\ncors = round(rcorr(as.matrix(data))$r,2) #all correlations (rcorr is from Hmisc), round to 2 decimals\r\n\r\n#blacks\r\nadmixture.cors.black = cors[10:23,1:3] #Black admixture x Achv.\r\nhist(unlist(admixture.cors.black[,1])) #hist for afri x achv\r\nhist(unlist(admixture.cors.black[,2])) #amer x achv\r\nhist(unlist(admixture.cors.black[,3])) #euro x achv\r\ndesc = rbind(Afro=describe(unlist(admixture.cors.black[,1])), #descp. stats afri x achv\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Amer=describe(unlist(admixture.cors.black[,2])), #amer x achv\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Euro=describe(unlist(admixture.cors.black[,3]))) #euro x achv\r\n\r\n#whites\r\nadmixture.cors.white = cors[24:25,4:6] #White admixture x Achv.\r\n\r\n#hispanics\r\nadmixture.cors.hispanic = cors[26:39,7:9] #Hispanic admixture x Achv.\r\ndesc = rbind(Afro=describe(unlist(admixture.cors.hispanic[,1])), #descp. 
stats afri x achv\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Amer=describe(unlist(admixture.cors.hispanic[,2])), #amer x achv\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Euro=describe(unlist(admixture.cors.hispanic[,3]))) #euro x achv\r\n\r\n##Examine hispanics by scatterplots\r\n#Reading\r\nscatterplot(Read2013H ~ LatinoAfricanAncestry, data,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 smoother=FALSE, id.n=nrow(data))\r\nscatterplot(Read2013H ~ LatinoEuropeanAncestry, data,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 smoother=FALSE, id.n=nrow(data))\r\nscatterplot(Read2013H ~ LatinoAmericanAncestry, data,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 smoother=FALSE, id.n=nrow(data))\r\n#Math\r\nscatterplot(Math2013H ~ LatinoAfricanAncestry, data,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 smoother=FALSE, id.n=nrow(data))\r\nscatterplot(Math2013H ~ LatinoEuropeanAncestry, data,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 smoother=FALSE,id.n=nrow(data))\r\nscatterplot(Math2013H ~ LatinoAmericanAncestry, data,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 smoother=FALSE,id.n=nrow(data))\r\n#General factor\r\nscatterplot(hispanic.ach.factor ~ LatinoAfricanAncestry, data,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 smoother=FALSE, id.n=nrow(data))\r\nscatterplot(hispanic.ach.factor ~ LatinoEuropeanAncestry, data,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 smoother=FALSE,id.n=nrow(data))\r\nscatterplot(hispanic.ach.factor ~ LatinoAmericanAncestry, data,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 smoother=FALSE,id.n=nrow(data))\r\n\r\n##Imputed and aggregated data\r\n#Hispanics\r\nhispanic.ach.data = data[26:39] #subset hispanic ach data\r\nhispanic.ach.data = 
hispanic.ach.data[miss.case(hispanic.ach.data)&lt;ncol(hispanic.ach.data),] #remove empty cases\r\nmiss.table(hispanic.ach.data) #examine missing data\r\nhispanic.ach.data = irmi(hispanic.ach.data, noise.factor = 0) #impute the rest\r\n#factor analysis\r\nfact.hispanic = fa(hispanic.ach.data) #get common ach factor\r\nfact.scores = fact.hispanic$scores; colnames(fact.scores) = \"hispanic.ach.factor\"\r\ndata = merge.datasets(data,fact.scores,1) #merge it back into data\r\ncors[7:9,\"hispanic.ach.factor\"] #results for general factor\r\n\r\n#Blacks\r\nblack.ach.data = data[10:23] #subset black ach data\r\nblack.ach.data = black.ach.data[miss.case(black.ach.data)&lt;ncol(black.ach.data),] #remove empty cases\r\nblack.ach.data = irmi(black.ach.data, noise.factor = 0) #impute the rest\r\n#factor analysis\r\nfact.black = fa(black.ach.data) #get common ach factor\r\nfact.scores = fact.black$scores; colnames(fact.scores) = \"black.ach.factor\"\r\ndata = merge.datasets(data,fact.scores,1) #merge it back into data\r\ncors[1:3,\"black.ach.factor\"] #results for general factor\r\n\r\n##Admixture totals\r\n#Hispanic\r\nHispanic.admixture = subset(data, select=c(\"LatinoAfricanAncestry\",\"LatinoAmericanAncestry\",\"LatinoEuropeanAncestry\"))\r\nHispanic.admixture = Hispanic.admixture[miss.case(Hispanic.admixture)==0,] #complete cases\r\nHispanic.admixture.sum = data.frame(apply(Hispanic.admixture, 1, sum))\r\ncolnames(Hispanic.admixture.sum)=\"Hispanic.admixture.sum\" #fix name\r\ndescribe(Hispanic.admixture.sum) #stats\r\n\r\n#add data back to dataframe\r\nLatinoRemainderAncestry = 1-Hispanic.admixture.sum #get remainder\r\ncolnames(LatinoRemainderAncestry) = \"LatinoRemainderAncestry\" #rename\r\ndata = merge.datasets(LatinoRemainderAncestry,data,2) #merge back\r\n\r\n#plot it\r\nLatinoRemainderAncestry = LatinoRemainderAncestry[order(LatinoRemainderAncestry,decreasing=FALSE),,drop=FALSE] #reorder\r\ndotchart(as.matrix(LatinoRemainderAncestry),cex=.7) #plot, with smaller 
text\r\n\r\n#Black\r\nBlack.admixture = subset(data, select=c(\"BlackAfricanAncestry\",\"BlackAmericanAncestry\",\"BlackEuropeanAncestry\"))\r\nBlack.admixture = Black.admixture[miss.case(Black.admixture)==0,] #complete cases\r\nBlack.admixture.sum = data.frame(apply(Black.admixture, 1, sum))\r\ncolnames(Black.admixture.sum)=\"Black.admixture.sum\" #fix name\r\ndescribe(Black.admixture.sum) #stats\r\n\r\n#add data back to dataframe\r\nBlackRemainderAncestry = 1-Black.admixture.sum #get remainder\r\ncolnames(BlackRemainderAncestry) = \"BlackRemainderAncestry\" #rename\r\ndata = merge.datasets(BlackRemainderAncestry,data,2) #merge back\r\n\r\n#plot it\r\nBlackRemainderAncestry = BlackRemainderAncestry[order(BlackRemainderAncestry,decreasing=FALSE),,drop=FALSE] #reorder\r\ndotchart(as.matrix(BlackRemainderAncestry),cex=.7) #plot, with smaller text\r\n\r\n#simple stats for both\r\nrbind(describe(BlackRemainderAncestry),describe(LatinoRemainderAncestry))\r\n\r\n#make subset with remainder data and achievement\r\nremainders = subset(data, select=c(\"black.ach.factor\",\"BlackRemainderAncestry\",\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \"hispanic.ach.factor\",\"LatinoRemainderAncestry\"))\r\nView(rcorr(as.matrix(remainders))$r) #correlations?\r\n\r\n#Partial correlations\r\npartial.r(data, c(7:9,40), c(43))[4,] #partial out remainder for Hispanics\r\npartial.r(data, c(1:3,41), c(42))[4,] #partial out remainder for Blacks<\/pre>\n<p><strong>References<\/strong><\/p>\n<p>Bryc, K., Durand, E. Y., Macpherson, J. M., Reich, D., &amp; Mountain, J. L. (2014). 
<a href=\"http:\/\/biorxiv.org\/content\/early\/2014\/09\/26\/009340\">The Genetic Ancestry of African Americans, Latinos, and European Americans across the United States.<\/a><span class=\"Apple-converted-space\">\u00a0<\/span><i>The American Journal of Human Genetics<\/i>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some time ago a new paper came out from the 23andme people reporting admixture among US ethnoracial groups (Bryc et al, 2014). Per our still on-going admixture project (current draft here), one could see if admixture predicts academic achievement (or IQ, if such were available). We (that is, John did) put together achievement data (reading [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1690,1653],"tags":[2076,1992],"class_list":["post-4648","post","type-post","status-publish","format-standard","hentry","category-genetics","category-psychology","tag-academic-achievement","tag-admixture","entry"],"_links":{"self":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/4648","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/comments?post=4648"}],"version-history":[{"count":2,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/4648\/revisions"}],"predecessor-version":[{"id":4667,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/posts\/4648\/revisions\/4667"}],"wp:attachment":[{"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/media?parent=4648"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/emilkir
kegaard.dk\/en\/wp-json\/wp\/v2\/categories?post=4648"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/emilkirkegaard.dk\/en\/wp-json\/wp\/v2\/tags?post=4648"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}