There are already a few studies on this. Because serious fraud is a rare event concentrated in a few people, sampling error is large and problematic (cf. the discussion in the esports paper). Setting that aside, a quick literature survey allows some general conclusions.
Oransky, I. (2018). Volunteer watchdogs pushed a small country up the rankings. Science.
This is based on data from the Retraction Watch (RW) database. It is a somewhat wrong approach, because a lot of the fraud committed in Western countries is done by foreign researchers. One needs to do the analysis by name, inferring each person’s ancestry with a machine learning model built on data from e.g. behindthename.com (see my prior study on first names). As I noted on Twitter, the most famous faker in Denmark is Milena Penkowa. That doesn’t sound very Danish, and she is in fact half Bulgarian.
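A minimal sketch of what such a name-based model could look like, assuming a character-trigram naive Bayes classifier. This is a stand-in technique, not the model from the prior study. The class name `NameOriginClassifier` is hypothetical, and the eight training names are hand-typed toy examples, not data from behindthename.com.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(name, n=3):
    """Character n-grams of a lowercased name, padded with start/end markers."""
    padded = "^" + name.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NameOriginClassifier:
    """Multinomial naive Bayes over character n-grams with add-one smoothing."""

    def __init__(self, n=3):
        self.n = n
        self.label_counts = Counter()             # number of training names per label
        self.ngram_counts = defaultdict(Counter)  # label -> n-gram -> count
        self.vocab = set()                        # all n-grams seen in training

    def fit(self, names, labels):
        for name, label in zip(names, labels):
            self.label_counts[label] += 1
            for gram in char_ngrams(name, self.n):
                self.ngram_counts[label][gram] += 1
                self.vocab.add(gram)
        return self

    def log_scores(self, name):
        """Unnormalised log posterior for each label."""
        total = sum(self.label_counts.values())
        scores = {}
        for label, count in self.label_counts.items():
            denom = sum(self.ngram_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)
            for gram in char_ngrams(name, self.n):
                score += math.log((self.ngram_counts[label][gram] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, name):
        scores = self.log_scores(name)
        return max(scores, key=scores.get)


# Toy training set (hypothetical, for illustration only).
TOY_NAMES = ["Lars", "Mette", "Niels", "Jens", "Milena", "Dimitar", "Ivan", "Boyko"]
TOY_LABELS = ["Danish"] * 4 + ["Bulgarian"] * 4

model = NameOriginClassifier().fit(TOY_NAMES, TOY_LABELS)
print(model.predict("Milena"))
```

With eight names this only memorises its training set. A usable model would need thousands of labelled names per group and held-out validation, and it would still misclassify people whose names do not match their ancestry.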
Here is the top list of fraudsters in the RW database:
|Name||Retractions||Ancestry||European||East Asian||South Asian||Other|
|Chen-Yuan Peter Chen||43||Chinese||0||1||0||0|
|Jan Hendrik Schön||32||German||1||0||0||0|
|A Salar Elahi||27||Persian||0||0||0||1|
|Richard L E Barnett||26||European||1||0||0||0|
|Prashant K Sharma||26||Indian||0||0||1||0|
|Thomas M Rosica||23||European||1||0||0||0|
|Anil K Jaiswal||22||Indian||0||0||1||0|
I coded these manually by googling each person and, where that didn’t help, by relying on the name. I don’t know of any ancestry- or name-based count of scientific productivity, but we can use the Nature Index that Anatoly Karlin wrote about recently and do a crude count (meaning I allocate Euro-dominant countries, including Brazil and Israel, entirely to European):
|European||East Asian||South Asian|
Europeans produced ~72% of ‘good science’ in 2012-2018, East Asians ~23% and South Asians ~2%, with a remainder category of ~4%. Relative to the top list, Europeans account for a larger share of science output than of top science cheaters, and vice versa for the other groups.
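A rough sketch of this comparison, using only the seven names listed above and the approximate output shares just quoted. Two assumptions are baked in: "Persian" is counted under "Other", and the ratio is defined as share of the list divided by share of output. The Wilson interval is added to show how wide the uncertainty is with so few names, so figures from this short list need not match a tally over a longer one.

```python
import math
from collections import Counter

# (name, retractions, group) for the seven names listed above.
# Assumption: "Persian" is counted under "Other".
TOP_LIST = [
    ("Chen-Yuan Peter Chen", 43, "East Asian"),
    ("Jan Hendrik Schön", 32, "European"),
    ("A Salar Elahi", 27, "Other"),
    ("Richard L E Barnett", 26, "European"),
    ("Prashant K Sharma", 26, "South Asian"),
    ("Thomas M Rosica", 23, "European"),
    ("Anil K Jaiswal", 22, "South Asian"),
]

# Approximate shares of output quoted in the text (they sum to ~101% due to rounding).
OUTPUT_SHARE = {"European": 0.72, "East Asian": 0.23, "South Asian": 0.02, "Other": 0.04}


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def representation(top_list, output_share):
    """For each group: count on the list, list share, Wilson CI, and list share / output share."""
    counts = Counter(group for _, _, group in top_list)
    n = len(top_list)
    rows = {}
    for group, share in output_share.items():
        k = counts.get(group, 0)
        rows[group] = {
            "count": k,
            "list_share": k / n,
            "ci": wilson_interval(k, n),
            "ratio": (k / n) / share,
        }
    return rows


for group, row in representation(TOP_LIST, OUTPUT_SHARE).items():
    low, high = row["ci"]
    print(f"{group:12s} list share {row['list_share']:.2f} "
          f"(95% CI {low:.2f}-{high:.2f}), ratio {row['ratio']:.1f}")
```

A ratio above 1 means a group is over-represented on the list relative to its output share. With n = 7 every interval spans a large part of the 0-1 range, which is the sampling-error problem noted at the start.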
Ataie-Ashtiani, B. (2018). World map of scientific misconduct. Science and engineering ethics, 24(5), 1653-1656.
A comparative world map of scientific misconduct reveals that countries with the most rapid growth in scientific publications also have the highest retraction rate. To avoid polluting the scientific record further, these nations must urgently commit to enforcing research integrity among their academic communities.
The paper gives a nice table, and I did the same kind of crude calculation (grouping Latin Americans with Europeans, on the assumption that the people who do science there are mostly of European ancestry):
|Country||Publications||Misconducts||Ratio||Ratio of European|
|Region||Publications||Misconducts||Ratio||Ratio of European|
A staggering ratio for East Asians.
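A minimal sketch of how a "Ratio of European" column can be computed, assuming it means each region's misconduct cases per publication divided by the same rate for the European group. That reading is an assumption. The counts below are made-up placeholders chosen only to make the arithmetic visible; they are not the figures from Ataie-Ashtiani (2018).

```python
def relative_misconduct_rates(table, baseline="European"):
    """table maps region -> (publications, misconduct cases).

    Returns region -> (cases per publication, that rate divided by the baseline rate).
    """
    base_pubs, base_cases = table[baseline]
    base_rate = base_cases / base_pubs
    result = {}
    for region, (pubs, cases) in table.items():
        rate = cases / pubs
        result[region] = (rate, rate / base_rate)
    return result


# Placeholder counts, NOT real data: (publications, misconduct cases).
PLACEHOLDER = {
    "European": (1000, 10),
    "Region B": (500, 20),
}

for region, (rate, ratio) in relative_misconduct_rates(PLACEHOLDER).items():
    print(f"{region}: {rate:.3f} cases per publication, {ratio:.1f}x the European rate")
```

Dividing by the baseline rate rather than comparing raw counts keeps regions with very different publication volumes on the same scale.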
Fanelli, D., Costas, R., Fang, F. C., Casadevall, A., & Bik, E. M. (2019). Testing hypotheses on risk factors for scientific misconduct via matched-control analysis of papers containing problematic image duplications. Science and engineering ethics, 25(3), 771-789.
It is commonly hypothesized that scientists are more likely to engage in data falsification and fabrication when they are subject to pressures to publish, when they are not restrained by forms of social control, when they work in countries lacking policies to tackle scientific misconduct, and when they are male. Evidence to test these hypotheses, however, is inconclusive due to the difficulties of obtaining unbiased data. Here we report a pre-registered test of these four hypotheses, conducted on papers that were identified in a previous study as containing problematic image duplications through a systematic screening of the journal PLoS ONE. Image duplications were classified into three categories based on their complexity, with category 1 being most likely to reflect unintentional error and category 3 being most likely to reflect intentional fabrication. We tested multiple parameters connected to the hypotheses above with a matched-control paradigm, by collecting two controls for each paper containing duplications. Category 1 duplications were mostly not associated with any of the parameters tested, as was predicted based on the assumption that these duplications were mostly not due to misconduct. Categories 2 and 3, however, exhibited numerous statistically significant associations. Results of univariable and multivariable analyses support the hypotheses that academic culture, peer control, cash-based publication incentives and national misconduct policies might affect scientific integrity. No clear support was found for the “pressures to publish” hypothesis. Female authors were found to be equally likely to publish duplicated images compared to males. Country-level parameters generally exhibited stronger effects than individual-level parameters, because developing countries were significantly more likely to produce problematic image duplications. 
This suggests that promoting good research practices in all countries should be a priority for the international research integrity agenda.
The paper includes this (low-quality) figure:
The data are pretty noisy, but among the countries with p < .05, Germany and Japan fall below the USA, while Argentina, Belgium, India, China and others fall above it. I don’t know what is up with Belgium here, but otherwise the results aren’t so surprising.
All in all, the Hajnal pattern applies: everybody cheats, but people outside the Hajnal line cheat a lot more. There are more data out there, much of it private. A friend of mine works with applications from foreigners who want to study in the UK (scholarships). Applicants send in essays and the like, along with English test scores (TOEFL etc.). The agencies screen the essays for plagiarism, so one gets a per capita rate of plagiarism. Applicants who pass the screening are given an interview in English. Often, people with perfect essays can’t speak English very well, generally because they hired someone to write their essays for them. That is not detectable by plagiarism testing, but it results in massive discrepancies between written and spoken English ability. One can also look up various cheating scandals, as summarized by Those Who Can See.
Thai heist movie, except the goal is to cheat in tests, not steal money from banks. www.imdb.com/title/tt6788942/