Samantha Andrews

Marine biologist/ecologist and a science and environmental writer. She can be found talking or writing about our Earth in all its splendour, including the people and other animals who live here, and about achieving a more sustainable future.

Sharing is caring: data papers give science a boost

June 8, 2022 | 4 minute read

With the open science movement now sweeping across disciplines, researchers are increasingly creating findable, accessible, interoperable, and reusable (FAIR) datasets.

One way data are made discoverable for reuse is with data papers: thorough descriptions of one or more datasets deposited in a publicly accessible repository.

Unlike traditional peer-reviewed papers, which focus on original research, data papers undertake no analysis or interpretation. “Data papers are essentially a means of taking detailed stock, or inventory, of data,” explain Drs. Lisa Loseto and Melissa Lafrenière, Editors-in-Chief at Arctic Science, an open access peer-reviewed journal that accepts data papers.

What can data papers do for science?

1. Data papers support scientific discovery

Opportunities to build on existing data multiply “[a]nytime people decide to archive their data and take the time to explain what the data [are],” says Dr. Janet Prevéy, an ecologist with the U.S. Geological Survey and lead author of Arctic Science’s inaugural data paper, “The tundra phenology database: more than two decades of tundra phenology responses to climate change.”

The paper describes plant phenology data collected under the International Tundra Experiment (ITEX). At the time of the paper’s publication, the database contained more than 150,000 observations made by many researchers, dating back to 1992 and covering some 278 plant species.

Figure: Total number of phenology observations per year, by phenophase type, across all study areas in the tundra phenology database.

The phenology database is housed on the online, open-access Polar Data Catalogue. This open nature is a critical component of a data paper. Data collection is a monumental effort, and when the dataset itself is misplaced or inaccessible, and consequently underused, “it is a huge loss to science,” notes Dr. Heather Lynch, professor of ecology and evolution at Stony Brook University.

Lynch has produced several data papers over the years, including a description of the Mapping Application for Penguin Populations and Projected Dynamics (MAPPPD), an online database which builds on another of Lynch’s data papers describing survey data on breeding birds in the Antarctic.

2. Data papers are collaborative initiatives

“Data papers are most useful when you are in a situation where you have multiple different datasets from a large number of researchers and groups, like with the ITEX network,” says Dr. Ingibjörg Svala Jónsdóttir, professor of ecology at the University of Iceland, member of the Arctic Science editorial board, and co-author of the phenology data paper.

Creating the database requires substantial work. “Everyone has different hypotheses and experimental setups, so people collect data slightly differently at all of their sites,” Prevéy explains. Prevéy and the team had to collect, validate, clean, and organize the data in the same format so observations could be compared and used together. “This was one of those really successful collaborative efforts.”


3. Data papers spark collaborations

Data papers make it easier to contact researchers when you have a question about their work. “The data owners, data paper creators, and data users can be easily found and contacted” by way of a data paper, explains Jónsdóttir. “This is important, even though the databases described in the data papers are open access.”

The benefits are twofold. First, users who need additional information can easily track down the relevant people. “No amount of documentation replaces a real live human being willing to field questions about a dataset,” says Lynch. Second, that contact can lead to collaboration between scientists. Jónsdóttir, for example, has been invited to be a co-author on a new paper using the tundra phenology database.

4. Data papers make citing data straightforward

One benefit data papers offer is that their associated data can be easily cited.

“We knew that people would want some way to cite the database/website, so we wrote a data paper describing the database, and this is what people now cite if they want to reference the MAPPPD database,” Lynch explains.

Prevéy also notes that data paper citations could help support funding applications because they indicate the broader importance of data collection, help researchers discover what questions have already been explored, and see who else is asking similar or complementary questions.

Figure: The plant phenology database includes observations from open top chambers (photo by Zoe Panchen).

What makes a good data paper?

Good data papers have several key components: a summary of the project(s) and publications associated with the data, a description of the data collection and of the dataset itself (e.g., time or geographic scales), and any other details researchers need to reproduce the research methods and reuse the data (e.g., code, software).

“There has to be enough information there for someone to sensibly re-use the data,” Lynch says. “It’s important that the paper describes the data and guides those who would like to use it on the possibilities and limitations,” Jónsdóttir explains.

Jónsdóttir argues that data papers need to go beyond overly simplistic descriptions. “When you are writing your paper on your own data, you understand the context of the data,” says Jónsdóttir. “It’s very important that a data paper gives a good description of the context, so others can interpret the data correctly and understand how the context surrounding the data can affect their results.”

Who uses data papers?

The science community is arguably the primary beneficiary of data papers and their associated open access databases; however, freely accessible datasets and their descriptions can be useful for anyone wanting to make an evidence-informed decision. And in the context of community-engaged research, data papers and datasets that are freely accessible and accessibly written can support equitable data sharing with community partners.

In the context of research in the Arctic, for example, “data papers are a way to address our responsibility to the Indigenous Peoples of the Arctic by making the data collected in their homeland available and accessible to them,” Loseto and Lafrenière explain.
