31 - Undocumented

The data existed but could not be re-used, despite everyone's best efforts.


When a young researcher started his PhD, he was told to work on unpublished data that had been collected three years earlier. He received various folders full of data. After going through them, he found several datasheets with duplicate names but different contents, scripts whose purpose and function nobody could explain, and column names that were unclear and ambiguous. Moreover, in some cases the exact equipment and/or settings used for the experiments were unknown. Because several years had passed, not even extensive talks with the equipment manufacturers or the data authors could make the data usable. In the end, the data could not be re-used.

This shows how essential it is for data re-use to describe and document the data collection and analysis process. Although documenting data takes quite some time, trying to make sense of poorly documented data years later costs even more time and frustration. Even though many people think they know their data, they are very likely to forget almost all the details within a few years. Therefore, documentation should always be as concise, detailed, precise and easy for third parties to understand as possible.