The Gift of Data: Medical Journals Back Public Data Disclosure as Prerequisite for Publishing


If you’ve ever struggled to learn a second language you’ll have experienced the tiny thrill that comes from connecting a word you don’t know with a definition buried deep in your mother tongue. I’ve encountered this regularly in my on-again, off-again efforts to master French, most recently with the very patient and good-humored Philippe Généreux, MD, of Hôpital du Sacré-Coeur de Montréal (Montréal, Canada). He was telling me his picks for the biggest news of 2015, in French, and in the course of our mostly one-sided conversation he referred several times—confusingly—to les données. Literal translation: “the given.”   


It was several sentences before my galloping brain caught up, pleased as punch, realizing “given” in French is also “data.” 

That moment of recognition came back to me this week with the news that the International Committee of Medical Journal Editors (ICMJE) had announced a draft proposal to require clinical trialists to share their data publicly as a prerequisite for publishing trial results in their member journals. In other words: raw data, freely given.

Announcement of the proposal was published simultaneously in 14 medical journals internationally including the Annals of Internal Medicine, BMJ, JAMA, Lancet, and New England Journal of Medicine. Darren Taichman, MD, executive deputy editor of the Annals of Internal Medicine, is the lead author.

Under the proposal, de-identified patient-level data would be made public according to an agreed-upon plan within 6 months of the publication of each trial’s main results. The plan for data sharing would be a component of clinical trial registration. ClinicalTrials.gov has already added an element to its registration platform to collect data-sharing plans, Taichman et al note.

A long-time proponent of “open science” and data sharing is Joseph Ross, MD, of Yale University School of Medicine (New Haven, CT). The ICMJE’s announcement this week, he told me, “has tremendous significance.” He pointed out to TCTMD that it wasn’t until 2005, when the ICMJE threw its weight behind clinical trial registration—actually introduced by the FDA Modernization Act back in 1997—that trialists and sponsors actually started doing it consistently. “It was at that time that clinical trial registration really took off and became an effective mechanism for understanding the research base and improving the overall integrity of the published literature,” Ross said.

Enter phase 2. A number of different entities have already started data sharing. The National Heart, Lung, and Blood Institute was an early leader in this space, Ross noted, while several academic institutions, international trialists, and industry-led initiatives have required public-access trial data. The ICMJE’s involvement, however, turns publication itself into the carrot: in order to be published in one of the world’s most esteemed scientific publications, you must commit, in advance, to turning over your data once the study is in print.

The Gift That Keeps on Giving

As a journalist covering cardiology and medicine, I’ve regularly heard complaints about data access, both from researchers demanding access to other investigators’ raw numbers, and from researchers themselves, denied full access to data by the sponsors paying for the studies.

Some of the biggest controversies in cardiology over the last 15 years have stemmed from data-access debacles: the rofecoxib (Vioxx; Merck) scandal in the early 2000s, the risk of MI with rosiglitazone (Avandia; GlaxoSmithKline) in RECORD, and more recently, a barrage of complaints over the PLATO trial of ticagrelor (Brilinta; AstraZeneca), led by non-trial researchers demanding access to raw trial data. All of these have been fun stories for reporters to sink their teeth into, but, as the ICMJE points out, they are a gross disservice to the patients who put themselves at risk by participating in the studies.

Ross, however, pointed out that the ICMJE announcement is less about scrutinizing published trial results and more about leveraging data “for other research questions and other scientific inquiries” that the original investigators might never have contemplated.

“In the world today, we’re increasingly seeing ways that people can use data that no one had ever thought of before, but in clinical research, if you are responsible for a clinical trial and you collect the data then you, up until now, have authority and no one else can even look at the data. In other words, it’s ‘your’ data,” he observed. “This is in fact turning that around and saying science should be an open process [and] there should be many other individuals taking advantage of it. That further honors the participants and volunteerism of the patients who signed up.”

Open Questions for Open Science

This is unlikely to happen seamlessly. Ross pointed to 2 questions yet to be answered: what is the mechanism by which data will be shared, and how will research initiatives of this type get funded?

“In the past, the entire research infrastructure was about going and collecting your own data,” he said. “Now we’ll need to be thoughtful and smart about supporting investigators who can make use of already collected data and advance science without the much greater cost of going out and collecting their own data.”

Ross predicts that there will also be some pushback, not from government funding agencies or industry, but from academic institutions that, with some exceptions, have been slow to embrace open science.

“What we’ve seen in the data-sharing world in the past 3 to 4 years has been remarkable, in part driven by the forward-looking policies of [the pharmaceutical industry trade groups] PhRMA and EFPIA, who said: our member companies need to have data-sharing plans in place,” Ross said. His own institution’s Yale Open Data Access project (YODA) announced a partnership with Johnson & Johnson in 2014, while GlaxoSmithKline announced an open science partnership with the Francis Crick Institute in London last summer.

“So I think [industry sponsors] are going to be ahead of the curve and for the most part are going to be supportive,” Ross said. “But academic investigators who have never had to be in the position of sharing data this way, I think it’s going to be a little bit thornier as they figure it out.”

Pascal Meier, MD, of University College London in the United Kingdom, is the editor at Open Heart, one of the first journals to emphasize “open data” by offering authors a 25% discount on publishing fees if they agree to also release their data. Meier told me that, on the one hand, as an editor, he’s supportive of the ICMJE proposal and believes it could even go one step further, since in its current form the public data requirement applies only to interventional studies, not to retrospective analyses and observational studies. Data repositories for those studies could help third parties check studies for accuracy and bias, he said.

As an author, however, Meier anticipates problems, pointing out that most researchers have no experience with public data and research projects are not typically funded with money earmarked for cleaning and organizing data in a way that would be comprehensible to others. “You can’t just put an Excel spreadsheet online” and expect outside researchers to know how to use it, he said.

Gregg W. Stone, MD, of Columbia University Medical Center (New York, NY), expressed similar concerns. “Everybody is for transparency,” he said. “These are important datasets that affect patient care, especially those pivotal trials that lead to regulatory approval of new drugs and devices.” That said, Stone continued, sponsors and investigators who have typically spent years on these studies “know the datasets intimately, and there are usually a lot of sophisticated considerations and nuances that go into analyzing the data. To just put a dataset out there without having that background ... is fraught with problems and may lead to inaccurate analyses and interpretations.”

Likewise, Christopher Cannon, MD, of the Harvard Clinical Research Institute (Boston, MA), told me in an email that while the idea of public data “has merit... practical considerations make it very difficult and potentially damaging to science.”

Cannon, Stone, and Meier all noted that most investigators will have the intention of publishing multiple analyses, so making data public within 6 months leaves them little time to prepare, present, and publish those analyses. Moreover, Cannon predicted, “there would almost definitely be duplicate analyses and dueling publications on the same topic and the same database,” potentially undertaken by groups who do not have the same detailed knowledge of complicated datasets enjoyed by the original trial investigators.

Cannon also raised concerns about the rights of the sponsor that “has paid often hundreds of millions of dollars to do large studies.” To hand over data to “anyone who asks” would be to “hand over all of that investment,” he argued.

Stone raised the prospect that individuals with an axe to grind might “come out of the woodwork” with “ill will,” conducting subset analyses that make for sensational headlines “but which in the grand scheme of things mean nothing. ... I’m not at all opposed to credible investigators and research groups trying to advance medical science having access to the data, but my proposal would be to allow 1 year before the high-level dataset is public and 2 years for the complete dataset,” to allow investigators enough time to publish using their data.

Ross, for his part, acknowledged that the issue of how original investigators and sponsors could be credited for the initial hard slog of collecting the data will need to be addressed. “Just as we have citation indexes, we will have to figure out a data-use index or something of that sort that tracks how often people's data are used,” he suggested. 

Watching Over the Wide Open

Finally, there are enforcement questions. I’d like to know who will be keeping tabs on whether data gets released within the 6 months proposed by the ICMJE. Will journal editors themselves, already deluged, take on the onerous task of checking that all of the raw data for the studies they’ve published have been released as promised?

“That’s a great question,” Ross mused. “You would hope that the individual investigators are going to have sufficient professionalism that they’re going to do it as part of their professional responsibility, and to be in compliance with the ICMJE.”

Meier raised the same concerns, noting that journals and editors are not currently staffed to undertake such a task. “In an ideal world, open data is a good thing, but it’s a very difficult thing to enforce in the real world,” he said.

Perhaps the threat of a retraction or the specter of being barred from future publication will be incentive in and of itself. “How many investigators are going to jeopardize their future publication opportunities in the JAMAs and New England Journals of the world?” Ross asked. “Hopefully there will be some self-regulation and professionals will realize the value of this for the greater field.”

Cannon brought up a different kind of oversight that will also fall to the journals. “Different results may emerge from misconceptions of what data should be used for analysis of specific topics,” he said. “Have the ICMJE editors considered what they would do if they receive 2 papers on the same topic at the same journal? Or at other partner ICMJE journals? Would they oversee and prevent any duplicate publication?” Most important, he added, if there are differences in results, who will be the arbiter in deciding which analysis is the best, most accurate reflection of the data?

Those types of questions will inevitably need to be considered by the ICMJE in the coming months. The committee has asked for feedback on its proposal before April 18, 2016. Requirements to commit to public data would go into effect for trials that begin to enroll patients 1 year after the data-sharing requirements are officially adopted.
