Open data is fundamental to making science work well. The primary reasons are:

  • Redundancy in data archiving. Most data are lost because no backups exist!
  • Easy access for third parties, whether for new analyses or for error checking previous work. Scientists are human and often refuse to share data with hostile outsiders, which prevents those outsiders from checking the scientists' work for errors.
Unfortunately, only a few behavioral datasets with family data exist, because studies generally do not collect data on multiple family members at a time. Some of the public or partially public ones are:
  • NLSYs (National Longitudinal Surveys of Youth) as described in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5604233/
  • NCPP (National Collaborative Perinatal Project), described in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2646177/ The data files are really annoying to work with (fixed-width format), but some people have released third-party versions that are easier to use: https://data.nber.org/cpp/
  • TEDS (Twins Early Development Study) is closed; see the access policy at https://www.teds.ac.uk/researchers/teds-data-access-policy
    • However, part of it used to be public at https://www.teds.ac.uk/research/collaborators-and-data/public-datasets but has since been removed from that page. I have put a copy here: https://osf.io/nmzp9/
    • Update: it has moved and is still available here: http://www.teds.ac.uk/researchers/publication-resources
  • PT (Project Talent), though I think it has not been released so far: https://www.projecttalent.org/new-studies/project-talent-twins-siblings-study/
  • More? Contact me
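
Since the NCPP entry above mentions fixed-width files, here is a minimal sketch of how such a file can be read in Python using only the standard library. The column names, offsets, and sample rows are all made up for illustration; they are not the NCPP layout, and a real layout has to be taken from the dataset's codebook.

```python
import io

# Hypothetical layout: (name, start, end), 0-based and end-exclusive.
# These offsets are assumptions for the example, not from any real codebook.
LAYOUT = [
    ("family_id", 0, 5),
    ("child_id", 5, 7),
    ("birth_weight_g", 7, 11),
]


def parse_fixed_width(lines, layout):
    """Slice each line into named integer fields; blank fields become None."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip empty lines
        record = {}
        for name, start, end in layout:
            raw = line[start:end].strip()
            record[name] = int(raw) if raw else None
        records.append(record)
    return records


# Made-up sample: two siblings in family 12, one child in family 47
# whose birth weight field is blank (missing).
SAMPLE = "00012 13200\n00012 22950\n00047 1    \n"

records = parse_fixed_width(io.StringIO(SAMPLE), LAYOUT)
for rec in records:
    print(rec)
```

The same idea is available in pandas as `pandas.read_fwf` with its `colspecs` parameter, which may be more convenient for large files.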