The vast majority of scientific data are evaporating, never to be heard from again.
The reason, based on an analysis published in the journal Current Biology, is almost boring: old email addresses and obsolete storage devices are making off with hard-earned research, according to Tim Vines, a visiting scholar at the University of British Columbia, and his colleagues.
In all, Vines and his team estimated that some 80 percent of data are lost within 20 years of the publication of their accompanying study.
"Publicly funded science generates an extraordinary amount of data each year," Vines said. "Much of these data are unique to a time and place, and is thus irreplaceable, and many other datasets are expensive to regenerate."
Today's system of leaving data with authors is simply not working, the researcher said, noting that nearly all of the information is going missing. That is a problem for anyone who later hopes to validate the original work or reuse the data for other purposes.
"I don't think anybody expects to easily obtain data from a 50-year-old paper, but to find that almost all the datasets are gone at 20 years was a bit of a surprise."
The group reached its conclusion after attempting to collect original research data from a randomly selected group of more than 500 studies published between 1991 and 2011. The effort yielded very little: the odds of obtaining the underlying data dropped by 17 percent each year, starting two years after the accompanying study was published.
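To see how quickly a 17 percent yearly drop compounds, here is a minimal sketch in Python. It is an illustration, not the team's statistical model: the function names and the starting odds of 1.0 (an even chance) are hypothetical, and it assumes the decline is a constant multiplicative factor that begins in the third year after publication. Note that the figure describes odds, not probability, so the sketch converts between the two.

```python
def odds_after(initial_odds: float, years_since_publication: int,
               yearly_drop: float = 0.17, grace_years: int = 2) -> float:
    """Odds of obtaining the data after a given number of years.

    Assumption: odds stay flat for `grace_years`, then shrink by
    `yearly_drop` (17 percent) every year after that.
    """
    decaying_years = max(0, years_since_publication - grace_years)
    return initial_odds * (1.0 - yearly_drop) ** decaying_years


def odds_to_probability(odds: float) -> float:
    """Convert odds (p / (1 - p)) back into a probability p."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    start = 1.0  # hypothetical starting odds: an even chance of getting the data
    for years in (2, 5, 10, 20):
        odds = odds_after(start, years)
        print(f"{years:>2} years: odds {odds:.3f}, "
              f"probability {odds_to_probability(odds):.1%}")
```

Under these assumptions, even data that starts with an even chance of being recovered is very unlikely to be obtainable two decades on, which is in line with the scale of loss the researchers describe.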
To counteract this trend, Vines said scientific journals should require authors to upload their data to public archives as a condition of publication. Papers with publicly accessible data are more valuable to society, he argued, and thus should receive priority for publication.
"Losing data is a waste of research funds and it limits how we can do science," Vines said. "Concerted action is needed to ensure it is saved for future research."