Managing data is one of the most challenging areas of modern chemistry. For example, synthesizing a new compound requires scientists to go through multiple rounds of experiments to find the right reaction conditions, generating massive amounts of raw data in the process.
This data has incredible value because machine-learning algorithms, just like humans, can learn a great deal from failed and partially successful experiments.
However, the current practice is to publish only the most successful experiments, since no human can meaningfully process the massive number of failed ones.
AI has changed this: processing data at that scale is exactly what machine-learning methods can do, provided the data is stored in a machine-usable format that anyone can access.
For a long time, information had to be compressed because of the limited page count in printed journal articles.
Today, many journals no longer have printed editions, yet reproducibility is still a challenge for chemists because journal articles are missing crucial details.
As a result, researchers waste time and resources replicating experiments that other authors have already found to fail, and they struggle to build on published results because unprocessed raw data is seldom published.
Volume is not the only problem; data diversity is another. Research groups use different tools, such as Electronic Lab Notebook software, which store data in proprietary formats that are sometimes incompatible with each other. This lack of standardization makes it nearly impossible for research groups to share data.
This led a group of researchers to publish a perspective in Nature Chemistry that presents an open platform for the entire chemistry workflow, from the inception of a project to its publication.