Goodreads. Libgen.

This book is background material for CGPGrey’s great short film.

So, if you saw that and are more curious, perhaps this book is for you. If the film above is not interesting to you, the book will be useless. Generally, the film conveys the topic better than the book, but the book of course contains more information.

The main flaw of the book is that the authors speculate on various economic and educational changes apparently without knowing about differential psychology and behavior genetics. For instance, they note that the median income has been falling, but they do not seem to consider that this may be partly due to the changing composition of the US population (relatively fewer Europeans, more Hispanics). Another example: they compare the average income over time of people with only a high-school education against that of college graduates. They do not realize that, due to the increased uptake of college education, the mean GMA (general mental ability) of people with only high school has been falling steadily, so the widening gap does not necessarily have anything to do with educational attainment itself, as they think.

The most interesting section was this:

What This Problem Needs Are More Eyeballs and Bigger Computers

If this response is at least somewhat accurate—if it captures something about how innovation and economic growth work in the real world—then the best way to accelerate progress is to increase our capacity to test out new combinations of ideas. One excellent way to do this is to involve more people in this testing process, and digital technologies are making it possible for ever more people to participate. We’re interlinked by global ICT [Information and Communication Technology], and we have affordable access to masses of data and vast computing power. Today’s digital environment, in short, is a playground for large-scale recombination. The open source software advocate Eric Raymond has an optimistic observation: “Given enough eyeballs, all bugs are shallow.”20 The innovation equivalent to this might be, “With more eyeballs, more powerful combinations will be found.”

NASA experienced this effect as it was trying to improve its ability to forecast solar flares, or eruptions on the sun’s surface. Accuracy and plenty of advance warning are both important here, since solar particle events (or SPEs, as flares are properly known) can bring harmful levels of radiation to unshielded gear and people in space. Despite thirty-five years of research and data on SPEs, however, NASA acknowledged that it had “no method available to predict the onset, intensity or duration of a solar particle event.”21

The agency eventually posted its data and a description of the challenge of predicting SPEs on Innocentive, an online clearinghouse for scientific problems. Innocentive is ‘non-credentialist’; people don’t have to be PhDs or work in labs in order to browse the problems, download data, or upload a solution. Anyone can work on problems from any discipline; physicists, for example, are not excluded from digging in on biology problems.

As it turned out, the person with the insight and expertise needed to improve SPE prediction was not part of any recognizable astrophysics community. He was Bruce Cragin, a retired radio frequency engineer living in a small town in New Hampshire. Cragin said that, “Though I hadn’t worked in the area of solar physics as such, I had thought a lot about the theory of magnetic reconnection.”22 This was evidently the right theory for the job, because Cragin’s approach enabled prediction of SPEs eight hours in advance with 85 percent accuracy, and twenty-four hours in advance with 75 percent accuracy. His recombination of theory and data earned him a thirty-thousand-dollar reward from the space agency.

In recent years, many organizations have adopted NASA’s strategy of using technology to open up their innovation challenges and opportunities to more eyeballs. This phenomenon goes by several names, including ‘open innovation’ and ‘crowdsourcing,’ and it can be remarkably effective. The innovation scholars Lars Bo Jeppesen and Karim Lakhani studied 166 scientific problems posted to Innocentive, all of which had stumped their home organizations. They found that the crowd assembled around Innocentive was able to solve forty-nine of them, for a success rate of nearly 30 percent. They also found that people whose expertise was far away from the apparent domain of the problem were more likely to submit winning solutions. In other words, it seemed to actually help a solver to be ‘marginal’—to have education, training, and experience that were not obviously relevant for the problem. Jeppesen and Lakhani provide vivid examples of this:

[There were] different winning solutions to the same scientific challenge of identifying a food-grade polymer delivery system by an aerospace physicist, a small agribusiness owner, a transdermal drug delivery specialist, and an industrial scientist. . . . All four submissions successfully achieved the required challenge objectives with differing scientific mechanisms. . . .

[Another case involved] an R&D lab that, even after consulting with internal and external specialists, did not understand the toxicological significance of a particular pathology that had been observed in an ongoing research program. . . . It was eventually solved, using methods common in her field, by a scientist with a Ph.D. in protein crystallography who would not normally be exposed to toxicology problems or solve such problems on a routine basis.23

Like Innocentive, the online startup Kaggle also assembles a diverse, non-credentialist group of people from around the world to work on tough problems submitted by organizations. Instead of scientific challenges, Kaggle specializes in data-intensive ones where the goal is to arrive at a better prediction than the submitting organization’s starting baseline prediction. Here again, the results are striking in a couple of ways. For one thing, improvements over the baseline are usually substantial. In one case, Allstate submitted a dataset of vehicle characteristics and asked the Kaggle community to predict which of them would have later personal liability claims filed against them.24 The contest lasted approximately three months and drew in more than one hundred contestants. The winning prediction was more than 270 percent better than the insurance company’s baseline.
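A figure like "270 percent better than the baseline" is a relative improvement on whatever scoring metric the contest used. The excerpt does not say which metric Allstate's contest scored on, so the numbers below are made up purely to illustrate the arithmetic:

```python
# Illustrative sketch (not from the book): how a "percent better than
# baseline" figure is typically computed. The metric and scores here are
# hypothetical, since the excerpt does not state them.

def pct_improvement(score, baseline):
    """Relative improvement of a score over a baseline, in percent."""
    return (score - baseline) / baseline * 100

# Hypothetical scores on some accuracy-like metric (higher is better):
baseline = 0.10  # the insurer's in-house model
winner = 0.37    # the winning crowd submission

print(f"{pct_improvement(winner, baseline):.0f}% better than baseline")
# A winning score 3.7x the baseline corresponds to a 270% improvement.
```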

Another interesting fact is that the majority of Kaggle contests are won by people who are marginal to the domain of the challenge—who, for example, made the best prediction about hospital readmission rates despite having no experience in health care—and so would not have been consulted as part of any traditional search for solutions. In many cases, these demonstrably capable and successful data scientists acquired their expertise in new and decidedly digital ways.

Between February and September of 2012, Kaggle hosted two competitions about computer grading of student essays, which were sponsored by the Hewlett Foundation.* Kaggle and Hewlett worked with multiple education experts to set up the competitions, and as they were preparing to launch, many of these people were worried. The first contest was to consist of two rounds. Eleven established educational testing companies would compete against one another in the first round, with members of Kaggle’s community of data scientists invited to join in, individually or in teams, in the second. The experts were worried that the Kaggle crowd would simply not be competitive in the second round. After all, each of the testing companies had been working on automatic grading for some time and had devoted substantial resources to the problem. Their hundreds of person-years of accumulated experience and expertise seemed like an insurmountable advantage over a bunch of novices.

They needn’t have worried. Many of the ‘novices’ drawn to the challenge outperformed all of the testing companies in the essay competition. The surprises continued when Kaggle investigated who the top performers were. In both competitions, none of the top three finishers had any previous significant experience with either essay grading or natural language processing. And in the second competition, none of the top three finishers had any formal training in artificial intelligence beyond a free online course offered by Stanford AI faculty and open to anyone in the world who wanted to take it. People all over the world did, and evidently they learned a lot. The top three individual finishers were from, respectively, the United States, Slovenia, and Singapore.

Quirky, another Web-based startup, enlists people to participate in both phases of Weitzman’s recombinant innovation—first generating new ideas, then filtering them. It does this by harnessing the power of many eyeballs not only to come up with innovations but also to filter them and get them ready for market. Quirky seeks ideas for new consumer products from its crowd, and also relies on the crowd to vote on submissions, conduct research, suggest improvements, figure out how to name and brand the products, and drive sales. Quirky itself makes the final decisions about which products to launch and handles engineering, manufacturing, and distribution. It keeps 70 percent of all revenue made through its website and distributes the remaining 30 percent to all crowd members involved in the development effort; of this 30 percent, the person submitting the original idea gets 42 percent, those who help with pricing share 10 percent, those who contribute to naming share 5 percent, and so on. By the fall of 2012, Quirky had raised over $90 million in venture capital financing and had agreements to sell its products at several major retailers, including Target and Bed Bath & Beyond. One of its most successful products, a flexible electrical power strip called Pivot Power, sold more than 373 thousand units in less than two years and earned the crowd responsible for its development over $400,000.
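To make the revenue split above concrete, here is a quick arithmetic sketch. The percentages come from the passage; the revenue figure is a hypothetical round number, not Pivot Power's actual sales:

```python
# Sketch of the Quirky revenue split described in the quoted passage.
# Percentages are from the text; the revenue figure is hypothetical.

revenue = 1_000_000  # hypothetical total revenue through Quirky's site

quirky_share = 0.70 * revenue  # Quirky keeps 70%
crowd_pool = 0.30 * revenue    # 30% is distributed to the crowd

# Splits within the crowd's 30% pool:
idea_submitter = 0.42 * crowd_pool   # original idea: 42% of the pool
pricing_helpers = 0.10 * crowd_pool  # shared among pricing contributors
naming_helpers = 0.05 * crowd_pool   # shared among naming contributors

print(f"Quirky keeps:   ${quirky_share:,.0f}")
print(f"Crowd pool:     ${crowd_pool:,.0f}")
print(f"Idea submitter: ${idea_submitter:,.0f}")  # 42% of $300k ≈ $126,000
```

Note that the submitter's 42% is a share of the crowd's 30% pool, i.e. about 12.6% of total revenue, not 42% of it.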

I take this to mean that:

1. Polymathy/interdisciplinarity is not dead or dying at all; it is in fact very useful.
2. To make oneself very useful, one should focus on learning a bunch of unrelated methods for analyzing data, and when studying a field, one should attempt to use methods not commonly used in that field.
3. Work related to AI, machine learning, etc. is the future (until we are completely unable to compete with computers).