Review: Dataclysm: Who We Are (When We Think No One’s Looking) (Christian Rudder)

www.goodreads.com/book/show/21480734-dataclysm

gen.lib.rus.ec/book/index.php?md5=9d2c0744b6bcce6ec9e67625125244a8

This book is based on the popular but discontinued OkTrends blog, now apparently active again because of the book release. There is some information in the book that cannot be found on the blog, but overall there is much more on the blog. The book is short (300 pp.) and written in a non-academic style with no statistical jargon. Read it if you think big data about humans is interesting. The author is generally negative about it, so if you are skeptical about it, you may like this book.

Review: G Is for Genes: The Impact of Genetics on Education and Achievement (Kathryn Asbury, Robert Plomin)

www.goodreads.com/book/show/17015094-g-is-for-genes

gen.lib.rus.ec/book/index.php?md5=97ac0ec914522d3c888679e9c02291c6

I kept finding references to this book in papers, so I decided to read it. It is a quick read introducing behavior genetics and its results to lay readers and perhaps policy makers. The book is overly long (200 pp.) for its content; it could easily have been cut by 30 pages. The book itself contains little that is new to people familiar with the field (i.e., me); however, there are some references that were interesting and unknown to me. The expert may do better to simply skim the reference lists for each chapter and read those papers instead.

The main thrust of the book is which policies we should implement because of our 'new' behavioral genetic knowledge. Basically, the authors think that we need to add more choice to schools because everybody is different, and that we should use gene-environment correlations to improve results. It is hard to disagree with this. They go on about how labeling is bad, but obviously labeling is useful for talking about things.

If one is interested in school policy, then reading this book may be worth it, especially if one is a layman. If one is interested in learning behavior genetics, read something else (e.g., Plomin's 2012 textbook).

Review: Moral Tribes: Emotion, Reason, and the Gap Between Us and Them (Joshua D. Greene)

www.goodreads.com/book/show/17707599-moral-tribes

gen.lib.rus.ec/book/index.php?md5=08a6526a7a706c4ae5f5aa7543ffc702

Years ago, when I studied philosophy, I came across Joshua's website. On the site I found his PhD thesis, which I read. It is probably the best meta-ethics writing I have come across. He seems to have removed it from the site ("available by request"); however, I still have it: Greene, J. D. (2002). The Terrible, Horrible, No Good, Very Bad Truth About Morality and What To Do About It. Anyway, this thesis is what apparently turned into the book. The book is clearly written for a mass market, so it has only a few notes and is very light on statistics. I think it is basically sound. The later chapters were somewhat annoying to read due to excessive repetition and unclear language. I suppose he added this to appeal more to laymen and confused people.

In the introduction, he is so kind as to lay out the book:

In part 1 (“Moral Problems”), we’ll distinguish between the two major kinds of moral problems. The first kind is more basic. It’s the problem of Me versus Us: selfishness versus concern for others. This is the problem that our moral brains were designed to solve. The second kind of moral problem is distinctively modern. It’s Us versus Them: our interests and values versus theirs. This is the Tragedy of Commonsense Morality, illustrated by this book’s first organizing metaphor, the Parable of the New Pastures. (Of course, Us versus Them is a very old problem. But historically it’s been a tactical problem rather than a moral one.) This is the larger problem behind the moral controversies that divide us. In part 1, we’ll see how the moral machinery in our brains solves the first problem (chapter 2) and creates the second problem (chapter 3).

In part 2 (“Morality Fast and Slow”), we’ll dig deeper into the moral brain and introduce this book’s second organizing metaphor: The moral brain is like a dual-mode camera with both automatic settings (such as “portrait” or “landscape”) and a manual mode. Automatic settings are efficient but inflexible. Manual mode is flexible but inefficient. The moral brain’s automatic settings are the moral emotions we’ll meet in part 1, the gut-level instincts that enable cooperation within personal relationships and small groups. Manual mode, in contrast, is a general capacity for practical reasoning that can be used to solve moral problems, as well as other practical problems. In part 2, we’ll see how moral thinking is shaped by both emotion and reason (chapter 4) and how this “dual-process” morality reflects the general structure of the human mind (chapter 5).

In part 3, we’ll introduce our third and final organizing metaphor: Common Currency. Here we’ll begin our search for a metamorality, a global moral philosophy that can adjudicate among competing tribal moralities, just as a tribe’s morality adjudicates among the competing interests of its members. A metamorality’s job is to make trade-offs among competing tribal values, and making trade-offs requires a common currency, a unified system for weighing values. In chapter 6, we’ll introduce a candidate metamorality, a solution to the Tragedy of Commonsense Morality. In chapter 7, we’ll consider other ways of establishing a common currency, and find them lacking. In chapter 8, we’ll take a closer look at the metamorality introduced in chapter 6, a philosophy known (rather unfortunately) as utilitarianism. We’ll see how utilitarianism is built out of values and reasoning processes that are universally accessible and, thus, how it gives us the common currency that we need.

Over the years, philosophers have made some intuitively compelling arguments against utilitarianism. In part 4 (“Moral Convictions”), we’ll reconsider these arguments in light of our new understanding of moral cognition. We’ll see how utilitarianism becomes more attractive the better we understand our dual-process moral brains (chapters 9 and 10).

Finally, in part 5 (“Moral Solutions”), we return to the new pastures and the real-world moral problems that motivate this book. Having defended utilitarianism against its critics, it’s time to apply it, and to give it a better name. A more apt name for utilitarianism is deep pragmatism (chapter 11). Utilitarianism is pragmatic in the good and familiar sense: flexible, realistic, and open to compromise. But it’s also a deep philosophy, not just a philosophy of expediency. Deep pragmatism is about making principled compromises. It’s about resolving our differences by appeal to shared values: common currency.

So, TL;DR: morality is an evolved mechanism to facilitate cooperation. It does this well, but not always. Typical moral disagreements are confused because they rely on rights-talk, which is fundamentally useless, even counterproductive, for resolving conflicts. Utilitarianism (aka cost-benefit analysis in moral language) is the only game in town, so even if it is not technically true, it is still the most useful approach to moralizing.

The hereditarian hypothesis is almost certainly true

And I’m not even referring to the latest results from any Pifferian methods (e.g. this).

Currently there are many large GWA studies of IQ/educational attainment/other g proxies with data gathered from US citizens. Some of the subjects are African Americans (AAs). AAs are actually of mixed ancestry, with about 23% European ancestry on average. The simplest genetic model (without complications from assortative mating or outbreeding) predicts a linear relationship between the amount of European ancestry in AAs and their IQ/g proxy scores. This is easy to check; so easy that quite a lot of science bloggers could do it in one day if only the data were available to them. The researchers who did the GWA studies are surely aware of this method.
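To make the check concrete, here is a minimal sketch in R using entirely simulated data (the ancestry distribution, the assumed slope of 10 IQ points, and the noise level are all invented for illustration; no real admixture data is involved):

```r
# Simulated illustration only. The point is that, given individual
# ancestry estimates and IQ scores, the test is a one-line regression.
set.seed(1)
n <- 1000
euro <- rbeta(n, 2, 6)                      # hypothetical European ancestry share
iq   <- 85 + 10 * euro + rnorm(n, sd = 14)  # assumed linear ancestry effect
summary(lm(iq ~ euro))                      # slope tests the linear prediction
```

Under the simple genetic model the slope comes out positive; under a null (non-genetic) model it would be near zero.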

Now, we have not seen any such study published, either positive or negative. Surely there are huge benefits to being the first author to publish a study that almost conclusively disproves the genetic model for AAs. Researchers with access to the data have a strong incentive to publish such a study, and if the data support that view, they have all the necessary means too. Since they haven't published it, we can infer that the data do not support the politically favorable non-genetic conclusion, but instead the genetic model. Probably the authors with access do not want to go down in history as the ones who finally proved 'racism'. So, instead of settling the issue, they just sit on the data.

In any case, useful data for testing this will leak sooner or later; it can't take many years. An admixture study of diabetes and African ancestry came very close to proving the genetic model, since it included socioeconomic status (SES) measures as a control. Its S2 table shows a clear relationship between higher SES and less African ancestry. Of course, a stubborn person will regard this as simply being in line with a discrimination model based on visual cues.

Costs and benefits of publishing in legacy journals vs. new journals

I recently published a paper in Open Differential Psychology. After it was published, I decided to tell some colleagues about it so that they would not miss it, because it is not published in either of the two primary journals in the field: Intell or PAID (Intelligence, Personality and Individual Differences). My email was this:

Dear colleagues,

I wish to inform you about my paper which has just been published in Open Differential Psychology.

Abstract
Many studies have examined the correlations between national IQs and various country-level indexes of well-being. The analyses have been unsystematic and not gathered in one single analysis or dataset. In this paper I gather a large sample of country-level indexes and show that there is a strong general socioeconomic factor (S factor) which is highly correlated (.86-.87) with national cognitive ability using either Lynn and Vanhanen’s dataset or Altinok’s. Furthermore, the method of correlated vectors shows that the correlations between variable loadings on the S factor and cognitive measurements are .99 in both datasets using both cognitive measurements, indicating that it is the S factor that drives the relationship with national cognitive measurements, not the remaining variance.

You can read the full paper at the journal website: openpsych.net/ODP/2014/09/the-international-general-socioeconomic-factor-factor-analyzing-international-rankings/

Regards,
Emil

One researcher responded with:

Dear Emil,
Thanks for your paper.
Why not publishing in standard well established well recognized journals listed in Scopus and Web of Science benefiting from review and
increasing your reputation after publishing there?
Go this way!
Best,
NAME

This concerns the decision of where to publish. I discussed this in a blog post back in March, before setting up OpenPsych. To be brief, the benefits of publishing in legacy journals are: 1) recognition, 2) indexing in proprietary indexes (SCOPUS, WoS, etc.), 3) perhaps better peer review, 4) perhaps a fancier appearance of the final paper. The first is very important if one is an up-and-coming researcher (like me), because one needs recognition from university people to get hired.

I nevertheless decided NOT to publish (much) in legacy journals. In fact, the reason I got into publishing studies so late is that I dislike the legacy journals in this field (and most other fields). Why? I made an overview here, but to sum up: 1) either not open access or extremely pricey, 2) no data sharing, 3) a non-transparent peer review system, 4) very slow peer review (~200 days on average in the case of Intell and PAID), 5) you're supporting companies that add little value to science and charge insane amounts of money for it (for Elsevier, see e.g. Wikipedia; TechDirt has a large number of posts concerning that company alone).

As a person who strongly believes in open science (data, code, review, access), there is no way I can defend a decision to publish in Elsevier journals. Their practices are clearly antithetical to science. I also signed The Cost of Knowledge petition not to publish or review for them. Elsevier has a strong economic interest in keeping up their practices and I’m sure they will. The only way to change science for the better is to publish in other journals.

Non-Elsevier journals

Aside from Elsevier journals, one could publish in PLoS or Frontiers journals. They are open access, right? Yes, and that's a good improvement. However, they are also predatory in that they charge exorbitant fees to publish: €1600 (Frontiers), US$1350 (PLoS). One might as well publish in Elsevier as open access, for which they charge US$1800.

So are there any open access journals without publication fees in this field? There is only one as far as I know, the newly established Journal of Intelligence. However, the journal site states that the lack of a publication fee is a temporary state of affairs, so there seems to be no reason to help them get established by publishing in their journal. After realizing this, I began work on starting a new journal. I knew that there was a lot of talent in the blogosphere with a similar mindset to me who could probably be convinced to review for and publish in the new journal.

Indexing

But what about indexing? Web of Science and SCOPUS are both proprietary, not freely available to anyone with an internet connection. But there is a fast-growing alternative: Google Scholar. Scholar is improving rapidly compared to the legacy indexers and is arguably already better, since it indexes a host of grey-literature sources that the legacy indexers don't cover. A recent article compared Scholar to WoS. I quote:

Abstract Web of Science (WoS) and Google Scholar (GS) are prominent citation services with distinct indexing mechanisms. Comprehensive knowledge about the growth patterns of these two citation services is lacking. We analyzed the development of citation counts in WoS and GS for two classic articles and 56 articles from diverse research fields, making a distinction between retroactive growth (i.e., the relative difference between citation counts up to mid-2005 measured in mid-2005 and citation counts up to mid-2005 measured in April 2013) and actual growth (i.e., the relative difference between citation counts up to mid-2005 measured in April 2013 and citation counts up to April 2013 measured in April 2013). One of the classic articles was used for a citation-by-citation analysis. Results showed that GS has substantially grown in a retroactive manner (median of 170 % across articles), especially for articles that initially had low citations counts in GS as compared to WoS. Retroactive growth of WoS was small, with a median of 2 % across articles. Actual growth percentages were moderately higher for GS than for WoS (medians of 54 vs. 41 %). The citation-by-citation analysis showed that the percentage of citations being unique in WoS was lower for more recent citations (6.8 % for citations from 1995 and later vs. 41 % for citations from before 1995), whereas the opposite was noted for GS (57 vs. 33 %). It is concluded that, since its inception, GS has shown substantial expansion, and that the majority of recent works indexed in WoS are now also retrievable via GS. A discussion is provided on quantity versus quality of citations, threats for WoS, weaknesses of GS, and implications for literature research and research evaluation.

A second threat for WoS is that in the future, GS may cover all works covered by WoS. We found that for the period 1995–2013, 6.8 % of the citations to Garfield (1955) were unique in WoS, indicating that a very large share of works indexed in WoS is now also retrievable by GS. In line with this observation, based on an analysis of 29 systematic reviews in the medical domain, Gehanno et al. (2013) recently concluded that: ‘‘The coverage of GS for the studies included in the systematic reviews is 100 %. If the authors of the 29 systematic reviews had used only GS, no reference would have been missed’’. GS’s coverage of WoS could in principle become complete in which case WoS could become a subset of GS that could be selected via a GS option ‘‘Select WoS-indexed journals and conferences only’’. 2 Together with its full-text search and its searching of the grey literature, it is possible that GS becomes the primary literature source for meta-analyses and systematic reviews. [source]

In other words, Scholar already covers almost all the articles that WoS covers and is quickly catching up on the older studies too. In a few years, Scholar will cover close to 100% of the articles in the legacy indexers, and they will be nearly obsolete.

Getting noticed

One thing related to the above is getting noticed by other researchers. Since many researchers read legacy journals, simply being published in them is likely sufficient to get some attention (and citations!). It is however not the only way. The internet has changed the situation completely: there are now lots of different ways to get noticed: 1) Twitter, 2) ResearchGate, 3) Facebook/Google+, 4) Reddit, 5) Google Scholar, which will inform you about any new research by anyone you have cited previously, 6) blogs (one's own or others'), and 7) emails to colleagues (as above).

Peer review

Peer review in OpenPsych is innovative in two ways: 1) it is forum-style instead of email-based, which is better suited for communication among more than two persons; 2) it is openly visible, which works against biased reviewing. Aside from this, it is also much faster, currently averaging 20 days in review.

Reputation and career

There is clearly a drawback here for publishing in OpenPsych journals compared with legacy journals. Any new journal is likely to be viewed as not serious by many researchers. Most people, academics included (perhaps especially?), dislike change. Publishing there will not improve one's chances of getting hired as much as publishing in the primary journals will. So one must weigh what is most important: science or career?

The g factor in autistic persons?

Check www.ncbi.nlm.nih.gov/pubmed/19572193

Eyeballing their figure suggests that the g factor is much weaker in these children. A quick search on Scholar didn't reveal any studies investigating this idea.

If someone can obtain subtest data from autism samples, that would be useful. The methods I used in my recent paper (section 12) can estimate the strength of the general factor in a sample. If g is weaker in autistic samples, this should be reflected in these measures.
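This is not the specific set of measures from the paper, but a rough sketch of the idea with simulated data: compute the share of variance captured by the first component of the subtest correlation matrix, which indexes the strength of the general factor.

```r
# Simulated subtest battery (the loadings are assumed); the autism question
# would be whether this share comes out lower in autistic samples.
set.seed(1)
n <- 500
g <- rnorm(n)                                   # latent general ability
subtests <- sapply(rep(0.7, 8),                 # eight subtests, loading 0.7
                   function(l) l * g + rnorm(n, sd = sqrt(1 - l^2)))
R <- cor(subtests)
eigen(R)$values[1] / ncol(R)                    # share of variance in component 1
```

With these assumed loadings the share comes out a bit above 0.5; a weaker general factor would push it toward 1/8, the value expected for uncorrelated subtests.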

I will write to some authors to see if they will let me have the subtest data.

New paper out: The international general socioeconomic factor: Factor analyzing international rankings

openpsych.net/ODP/2014/09/the-international-general-socioeconomic-factor-factor-analyzing-international-rankings/

Abstract
Many studies have examined the correlations between national IQs and various country-level indexes of well-being. The analyses have been unsystematic and not gathered in one single analysis or dataset. In this paper I gather a large sample of country-level indexes and show that there is a strong general socioeconomic factor (S factor) which is highly correlated (.86-.87) with national cognitive ability using either Lynn and Vanhanen’s dataset or Altinok’s. Furthermore, the method of correlated vectors shows that the correlations between variable loadings on the S factor and cognitive measurements are .99 in both datasets using both cognitive measurements, indicating that it is the S factor that drives the relationship with national cognitive measurements, not the remaining variance.

This one took a while to do. I had to learn a lot of programming (R), do lots of analyses, and spend 50 days in peer review. Perhaps my most important paper so far.
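For readers unfamiliar with the method of correlated vectors used in the paper, here is a toy version in R with simulated data (the loadings and the criterion are invented; this is not the paper's code): each indicator's loading on the general factor is correlated with that indicator's correlation with an external criterion.

```r
# Toy method of correlated vectors (MCV) on simulated data.
set.seed(1)
n <- 200; k <- 12
S <- rnorm(n)                                    # latent general factor
lambda <- runif(k, 0.3, 0.9)                     # assumed indicator loadings
X <- sapply(lambda, function(l) l * S + rnorm(n, sd = sqrt(1 - l^2)))
crit <- 0.8 * S + rnorm(n, sd = 0.6)             # external criterion
load_est <- abs(prcomp(X, scale. = TRUE)$rotation[, 1])  # first-PC loadings
crit_cors <- cor(X, crit)                        # indicator-criterion correlations
cor(load_est, crit_cors)                         # the MCV correlation
```

A high MCV correlation indicates that the general factor, rather than the indicator-specific variance, drives the relationship with the criterion.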

 

Comments on Learning Statistics with R

So I found a textbook for learning both elementary statistics (much of which I knew, but had not read a textbook about) and R.

The book is legally free: health.adelaide.edu.au/psychology/ccs/teaching/lsr/

www.goodreads.com/book/show/18142866-learning-statistics-with-r

Numbers refer to page numbers in the book. The book is in an early version (“0.4”), so many of these are small errors I stumbled upon while running virtually all the commands in the book in my own R session.

 

120:

The modeOf() and maxFreq() calls do not work. This is because afl.finalists is a factor and the functions expect a vector. One can use as.vector() to make them work.
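A base-R sketch of the same computation, using toy data in place of the book's afl.finalists (the as.vector() coercion sidesteps the factor issue, and table() makes the mode and its frequency explicit):

```r
# Toy factor standing in for the book's afl.finalists (invented values):
finalists <- factor(c("Geelong", "Geelong", "Carlton", "Hawthorn"))
tab <- table(as.vector(finalists))  # coerce to a plain vector, then tabulate
names(tab)[which.max(tab)]          # the mode: the most frequent value
max(tab)                            # its frequency
```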

 

131:

Worth noting that summary() gives the same numbers as quantile(), except that it also includes the mean.
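A quick illustration with a toy vector (the base function is quantile()):

```r
x <- c(1, 2, 3, 4, 100)
quantile(x)  # minimum, the three quartiles, maximum
summary(x)   # the same five numbers, plus the mean
```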

 

151:

Actually, the output of describe() does not tell us the number of NAs. It is only because the author assumes that there are 100 total cases that he can compute 100 - n to get the number of NAs for each variable.

 

220:

The cakes.Rdata is already transposed.

 

240:

as.logical() also converts numeric 0 and 1 to FALSE and TRUE. However, oddly, it does not understand "0" and "1" (they yield NA).
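A quick demonstration of the coercion rules:

```r
as.logical(c(0, 1))         # numeric 0/1 convert: FALSE TRUE
as.logical(c("0", "1"))     # numeric-looking strings do not: NA NA
as.logical(c("T", "true"))  # these string forms are recognized: TRUE TRUE
```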

 

271:

Actually, probability 0 is not equivalent to impossible. See: en.wikipedia.org/wiki/Almost_surely

 

278:

Actually, 100 simulations with N = 20 will generally not produce a histogram like the one shown. Perhaps it would be better to change the command to use 1000 simulations. And why not add hist() to it, so it can be compared visually to the theoretical one?

 

> hist(rbinom(n = 1000, size = 20, prob = 1/6))

298:

It would be nice if the code for making these simulations was shown.

 

299:

“This is just bizarre: σ̂² is and unbiased estimate of the population variance”

 

Typo.

 

327:

Typo in Figure 11.6 text. “Notice that when θ actually is equal to .05 (plotted as a black dot)”

 

344:

Typo.

“That is, what values of X2 would lead is to reject the null hypothesis.”

 

379:

It is most annoying that the author doesn't include the code for reproducing his plots. I spent 15 minutes trying to find a function to create histograms by group.
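For what it's worth, one readily available option is the lattice package, which ships with R and draws conditioned histograms directly; here demonstrated on the built-in iris data rather than the book's:

```r
library(lattice)  # part of the standard R distribution
# One histogram panel per group, via the conditioning formula:
histogram(~ Sepal.Length | Species, data = iris)
```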

 

385:

Typo.

 

“It works for t-tests, but it wouldn’t be meaningful for chi-square testsm F -tests or indeed for most of the tests I talk about in this book.”

 

391:

“we see that it is 95% certain that the true (population-wide) average improvement would lie between 0.95% and 1.86%.”

 

This wording is dangerous because the percent sign has two readings. On the relative reading, the claim is wrong; the author means absolute percentage points.

 

400:

The code has +’s in it, which means it cannot just be copied and run. This usually isn’t the case, but it happens a few times in the book.
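A hypothetical illustration of the problem (the mean() call is an arbitrary example, not taken from the book): R prints "+" as a continuation prompt, and copying it along with the code breaks the command.

```r
# Copied verbatim with prompts, lines like these fail to parse:
#   > mean( c(1, 2, 3),
#   +       trim = 0.1 )
# Strip the leading "> " and "+ " prompts before running:
mean(c(1, 2, 3), trim = 0.1)
```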

 

408+410:

In the description of the test, we are told to tick when the values are larger than the reference value. However, in the one-sample version, the author ticks when the value is equal to it. I guess this means that we tick when it is equal to or larger than.

 

442:

This command doesn’t work, because the data frame isn’t attached as the author assumes:

> mood.gain <- list( placebo, joyzepam, anxifree)

 

457:

First the author says he wants to use the non-adjusted R^2, but then in the text he uses the adjusted value.

 

464:

Typo with “Unless” capitalized.

 

493:

“(3.45 for drug and 0.92 for therapy),”

He must mean .47 for therapy; .92 is the number for the residuals.

 

497:

In the alternative hypothesis, the author uses “u_ij” instead of the “u_rc” used in the null hypothesis. I’m guessing the null hypothesis is right.

 

514:

As earlier, it is ambiguous when the author talks about increases in percent: it could be relative or absolute. Again, in this case it is absolute. The author should write “%-point” or similar to avoid confusion.

 

538:

Quoting

 

“I find it amusing to note that the default in R is Type I and the default in SPSS is Type III (with Helmert contrasts). Neither of these appeals to me all that much. Relatedly, I find it depressing that almost nobody in the psychological literature ever bothers to report which Type of tests they ran, much less the order of variables (for Type I) or the contrasts used (for Type III). Often they don’t report what software they used either. The only way I can ever make any sense of what people typically report is to try to guess from auxiliary cues which software they were using, and to assume that they never changed the default settings. Please don’t do this… now that you know about these issues, make sure you indicate what software you used, and if you’re reporting ANOVA results for unbalanced data, then specify what Type of tests you ran, specify order information if you’ve done Type I tests and specify contrasts if you’ve done Type III tests. Or, even better, do hypotheses tests that correspond to things you really care about, and then report those!”

 

An example of the necessity of open methods along with open data. Science must be reproducible. The best approach is to simply share the exact source code for the analyses in a paper.