Getting credit for creating data

With Nature now taking submissions to its new data journal, Scientific Data, what does this mean for research publishing?

Well, data journals have been around for a while. One of the first was Copernicus’ Earth System Science Data, which started in 2008. More recently I was involved in a JISC project (OJIMS) to look at data publishing for the Royal Meteorological Society, and this ultimately led to its Geoscience Data Journal. Now there are quite a few, but none specifically for the research areas of built environment and infrastructure.

One ARCC project researcher in Newcastle has told me he has a conventional paper under review in which he will cite the data on which his analysis is based. Are there others out there? In this case the citation will include a DOI, linking to the dataset. I hope that, as other early adopters of data citation have found, providing a link to the data will increase citations of his article. There’s also the fundamental benefit that making data available is good research practice, because it keeps scientific analysis reproducible and verifiable.

However, that’s data citation. The purpose of Scientific Data, and other data journals, is to publish the data itself, giving credit for the creation of the data (without any analysis). A publication in a data journal consists of a description of the data, together with access to it via a data repository. I haven’t found any repositories specifically for the ARCC areas of research, but the Bodleian Libraries have listed several that take a variety of engineering and other data.

Publishing in data journals could be one way of meeting EPSRC’s data policy expectations. EPSRC expects all the institutions it funds to publish accessible metadata online describing their research data holdings, normally within 12 months of the data being generated. It also expects that access to publicly funded data should not be withheld unless there are exceptional circumstances. Embargo periods are recognised as a reasonable way to let data creators work privately on data before wider access is opened up. This would normally give the creators an unassailable lead in any race to publish analysis of the data.

I take Nature’s launch of a data journal as a sign that this relatively young form of research publication is maturing. Its scope is all data of scientific value. Earth system science, chemistry and medicine have other channels for publishing data but, as far as I am aware, engineering has not had one until now. It will be interesting to see how researchers respond.

