One of the great benefits of my current job is that it gives me the excuse – sorry, the opportunity – to read a number of interesting articles, in the name of research. I follow several journals and blogs; some because they are directly relevant to the work I do, some because they inspire a little creative thinking, and others just for my own gratification – although that can still win dinner and a movie in a radio trivia contest.
It was a conversation I had recently about research and publication that really tied this story together.
I have read several articles regarding research data, and in particular the problem of how to make that data available for further research and analysis. On one hand, broad access to research data would appear to be an obvious goal. More researchers working with the same data improves the ability to verify the quality of that data and the conclusions drawn from it. Access to large pools of verified data makes it possible to identify new patterns, which might not be present in small data sets.
Yet, on the other hand, as the Cancer Biomedical Informatics Grid (caBIG) program demonstrated, it is not a simple matter to put that kind of infrastructure in place. The effort has since been assumed by the National Cancer Informatics Program (NCIP) – and upon review, only a subset of caBIG projects were recommended for transition to the new NCIP program.
Both programs were coordinated by the Center for Bioinformatics and Information Technology (CBIIT), and the decision to retire caBIG was the result of a working group study that reassessed what was needed to support the cancer research community. Since these efforts involved a single major sponsor, with a specific mission, what does this mean for the viability of a platform to support other disciplines?
Complicating matters is a study announced a couple of months ago about the ability to retrieve data referenced in published research papers. Even in an age of databases and electronic lab notebooks, some raw data becomes inaccessible within a year or two, and the situation becomes worse over time, as data is stored away along with so many other files – possibly preserved, but unable to be located.
The conversation that I mentioned above started with an anecdote about a researcher who was unable to get in contact with the author of an article, and therefore unable to validate the source data. That researcher spent weeks duplicating a process, just to create their own tables.
We discussed the philosophical problem of providing access to a researcher’s data, and who should control that access – including whether raw data should be considered a work product, and whether a researcher can be required to release it. Then a question was asked concerning access to a researcher’s publications, which are also valuable for further research and analysis.
I explained how any expression of authorship falls under copyright, and how that includes the charts and graphs that present an analysis. The researcher generally assigns their copyright to the publisher, which in turn provides the infrastructure that validates quality, along with the mechanisms for access and distribution to the research community – supported by paid subscriptions.
That business model requires access to be restricted. If researchers could obtain a publication without paying the publisher, there would be no incentive for the publisher to maintain the infrastructure. As a result, a conflict develops between the researcher and the publisher, where the researcher is restricted in the ability to disseminate their own work.
Some of the articles that I want to review are behind paywalls. While that may be frustrating, it is also an opportunity to consider the quality of the peer review and editorial practices behind them. I also remember being charged for copies of journal articles when I was working on my thesis. Not to date myself, but those articles remain available today.
So should we consider a similar business model for research data, and if so, what would be the price?