by Nancy K. Herther
In a 2016 article, Daniel Saraga of the Swiss National Science Foundation explained the goal of Open Science as “to break the shackles that fetter the individual elements of the scientific production process – from the formation of hypotheses to the publication of results. The watchwords are: sharing and inclusion, collaboration and decentralization, and transparency. By fully opening research work, it can be made useful to everyone: to scientists, industry, and members of civil society. Even computer programs will be able to draw new conclusions from old results.”
Historians have dated the emergence of the concept of Open Science to “the late sixteenth and early seventeenth centuries [when] the idea and practice of ‘open science’ represented a break from the previously dominant ethos of secrecy in the pursuit of ‘Nature’s secrets’. It was a distinctive and vital organizational aspect of the scientific revolution, from which crystallized a new set of norms, incentives and organizational structures that reinforced scientific researchers’ commitments to rapid disclosure of new knowledge.” With the publication of research results and theories in journals, ‘ordinary’ people as well as other scientists were able to follow changes and discoveries as they were reported. Making this knowledge publicly available allowed for the sharing of knowledge, more collective work on specific topics and sharing with the larger communities (of the public, funders, practitioners, etc.).
Of course this was before copyright and the high competitive value of this knowledge. “The movement for Open Science is part of this framework of tension between new forms of collaborative, interactive and shared production of information, knowledge and culture on the one hand and, on the other, the mechanisms of capture and privatization of this knowledge that is collectively and socially produced,” Brazilian writers noted in 2015. Entire powerful industries – from drug companies to commercial publishers and more – arose to acquire or create this knowledge and report it for profit. The upside of this is the development of regulated, efficient systems of production and dissemination in a for-profit environment.
As time went on, the connections between private sector commercial publishers and research institutions and their libraries became strained. Today, the concepts of Open Science, Open Access and Open Research are becoming the mantra of higher education.
OPEN SCIENCE – TURNING SCIENCE ON ITS HEAD
The Center for Open Science describes its mission as being to “drive change in the culture and incentives that drive researchers’ behavior, the infrastructure that supports their research, and the business models that dominate scholarly communication.” Professional organizations, higher education institutions and major public and private funding agencies have joined this movement. “The movement gathered force in the life sciences in the 1990s with the Human Genome Project, and spread to protein structures and then early-stage drug discovery through the Structural Genomics Consortium (SGC),” notes the University of Toronto, whose researchers have been involved in developing two exciting new models for drug research and development.
“Open science drug discovery, a global movement led by academic scientists in Toronto that puts knowledge sharing and medication affordability ahead of patents and profits,” their press release goes on to explain. “Medicines 4 Neurodegenerative Diseases (M4ND Pharma) will pursue promising new genetic drug targets for these intractable nervous system disorders, thanks to $1.5 million from the Krembil Foundation. It will be the world’s second drug discovery company committed to open science after Medicines 4 Kids (M4K Pharma), which launched in 2017 to develop a novel drug for an uncommon but fatal childhood brain cancer.”
Open Science projects are developing quickly across the academy and across the globe. Defining Open Science as “the future of science and science for the future,” UNESCO is actively pursuing efforts to promote and advance Open Science as one of the UN’s Sustainable Development Goals. The possibilities are truly mind-boggling. So much so that many of these changes are coming from a generation that has grown up with technology, with easy communication through the web and with new perspectives on their work that are guided less by tradition than by possibility.
Psychologist Katie Corker’s closing remarks from the Society for the Improvement of Psychological Science 2018 meeting are instructive: “I want us to think hard about classifying Open Science as a behavior. Not as an identity. Not as a value. It is a set of practices that you do in order to make your work transparent to others, checkable and scrutinizable by others in the community.”
Corker went on to explain: “If Open Science is a behavior that means it is not necessarily excessively stable – practices can obviously vary from project to project. We’ll have to resist the very strong urge to heuristically classify researchers as Open or Not Open and the desire to award quality points accordingly. We’ll have to judge each project, each study, on its own merits, including the presence and implementation of various open practices. It might seem simple – evaluate the work and not the person who did it – but if there’s anything this movement has done, it is to show us how challenging this actually is.”
OPEN SCIENCE TAKING CENTER STAGE
Formal structures and alliances are being brought together to promote, encourage and connect researchers in this new frontier for science. The Center for Open Science is a “mission-driven non-profit” that provides a free, open source service, the Open Science Framework (OSF), and is committed to “aligning scientific practices with scientific values by improving openness, integrity and reproducibility of research.” This isn’t the only organization established in this effort, but it is an excellent example of the breadth, depth and commitment that proponents bring to the goal of re-conceiving the entire scholarly research enterprise.
All transitions and new technologies create change and new opportunities, as well as potential shifts in power and procedure. This movement has actually been years in the making, driven by the expansion of communication capabilities that came with the rise of the internet, the global growth of research that accompanied the explosion of higher education and commercial ventures, and the troubled relationship between commercial and scholarly publication venues. The OSF solution is focused on creating “a scholarly commons to connect the entire research cycle.”
These efforts aren’t limited to western institutions. The Open and Collaborative Science in Development Network (OCSDNet) is an international effort “composed of twelve research-practitioner teams from the Global South interested in understanding the role of openness and collaboration in science as a transformative tool for development thinking and practice.” This effort uses a collaborative design that pairs these research teams with a team of External Advisors and a Network Coordination Team and is funded by research institutions from Canada and the United Kingdom.
Crowdsourced efforts, from Arcbazar to Zooniverse, are creating freely available research tools, reference sources and data collections that would never exist without the efforts of volunteers, funders and dedicated scientists. Last year Lettie Y. Conrad provided a good overview of Open Science Tools for the Scholarly Kitchen, detailing and categorizing available tools across the board.
“NATIVE TO THE WEB – JUST LIKE OUR CURRENT GENERATION OF RESEARCHERS!”
Today science is being redefined, largely through the rise of new web-based tools and the work of a new generation of researchers. As Matteo Cantiello remarks on the development of Authorea, “we envisioned all of the things that researchers want to do as they write a paper—collaborate with co-authors, pull citations from web libraries, insert data, submit to journals—and then built it into a native web-app. Native to the web—just like our current generation of researchers.”
This new generation of innovators has grown up with sophisticated technology and complex communication methods. They are facile with technology and have been using it all of their lives. It was in 1990 – almost thirty years ago – that the World Wide Web was born and Adobe Photoshop was released. And innovations and change have been happening by leaps and bounds ever since.
At the open house celebrating the 50th anniversary of one of our University of Minnesota libraries, a manual typewriter was brought out. Students looked at it with amazement, with many saying they had never seen one before. Today’s new generation of researchers is used to the continuous change and development of web-based products, and has learned to expect rapid change and ever more sophisticated options in all aspects of life.
In fact, it is this new generation that is busy creating the state-of-the-art solutions to the full range of processes in the research and publication cycle. In the next part of this series we will take a closer look at one of the new entrepreneurs dedicated to remaking the research and publishing cycle.
AN X-SCITE-ING LOOK INTO THE FUTURE OF TRULY OPEN DATA
None of these new and important ventures would be possible without the development of more and better machine learning algorithms that are able to pull key information from the growing stores of online repositories that hold the Open Access contents of journal articles and other scholarly works. This corpus is allowing for the development of far more sophisticated analysis and access to complex queries into the contents of scholarly works, citation analysis and information discovery. And, with the increasing requirements for OA as well as the increasing numbers of quality OA journals, we can expect to see even more development of these free, open databases of information in the future.
One new product that I am particularly intrigued with is Scite, which has been in development only since June 2018 and provides an amazing look into the future promise of research information. The company describes it not as a product or database, but as “a platform that uses deep learning, natural language processing, and a network of experts to identify and promote reliable research by evaluating the veracity of scientific claims.”
Details on the system design are not readily available, but as of July 1st, 305,972,156 total citations had been extracted from “millions of scientific articles.” The database uses eleven machine learning models with 20-30 features each to extract citation information. Once the citations are extracted, Scite’s deep learning model is used to classify each citation as supporting, contradicting or just mentioning the cited article. Deep learning is a branch of machine learning in which layered neural networks learn patterns from large sets of examples rather than from hand-written rules. Because the model can be retrained as new data and corrections are added, the system is able to self-correct more easily – although the Scite model also uses hundreds of volunteers who double check each entry.
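To make the three-way classification concrete, here is a minimal sketch in Python of how classified citations might be tallied for each cited paper. It is an illustration only, not Scite’s code: the function name `tally_citations`, the paper identifiers and the sample data are all hypothetical, and only the three category names come from the description above.

```python
from collections import Counter

# The three categories named in the description of Scite above.
LABELS = ("supporting", "contradicting", "mentioning")


def tally_citations(classified):
    """Count, for each cited paper, how many citations fall into each category.

    `classified` is an iterable of (cited_paper, label) pairs. This input
    shape is an assumption made for illustration.
    """
    tallies = {}
    for cited_paper, label in classified:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        tallies.setdefault(cited_paper, Counter())[label] += 1
    return tallies


# Made-up example data, not real Scite output.
example = [
    ("paper-A", "supporting"),
    ("paper-A", "mentioning"),
    ("paper-A", "contradicting"),
    ("paper-A", "supporting"),
    ("paper-B", "mentioning"),
]

tallies = tally_citations(example)
for paper, counts in sorted(tallies.items()):
    print(paper, {label: counts[label] for label in LABELS})
```

A per-paper summary of this kind is what would let a reader see at a glance whether a paper has mostly been supported, contradicted or simply mentioned.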
“When looking at a scientific paper in general,” Scite.ai’s Joshua Nicholson reports, “we look at who the authors are, where they are from, where they published the work, and various proxies of impact (citations, views, and social media shares). None of these actually consider the quality of the article. We’re trying to change that at scite.ai and are happy to release an improvement on our search results page that allows you, with just a glance, to tell if a scientific paper has been supported or contradicted. We’ll soon be releasing a plugin so this information is present anywhere you’re reading an article on the web!”
Recently, in answer to a Facebook query, Nicholson reported that “citation statements are classified using a deep learning model that has been trained on ~40k manually annotated snippets. Extraction of the citations and statements requires 10 different machine learning models as well,” Nicholson continued. “Accuracy varies across classifications and is measured in a few ways but in short is ~.8 for precision (contradicting and supporting) and .98 for mentioning. We’re ingesting 300k PDFs a day too so coverage should get a lot better in the coming days/weeks.” An amazing production level for such a young start-up!
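For readers unfamiliar with the term, precision for a category is the share of items the model assigned to that category that truly belong there. The short Python sketch below shows the standard calculation on made-up labels. The function name and the sample data are hypothetical, and nothing here reflects how Scite actually runs its evaluation.

```python
def precision(predicted, actual, label):
    """Fraction of items predicted as `label` whose true label is also `label`.

    Returns None when the model never predicted `label`, since precision
    is undefined in that case.
    """
    truths_for_label = [a for p, a in zip(predicted, actual) if p == label]
    if not truths_for_label:
        return None
    correct = sum(1 for a in truths_for_label if a == label)
    return correct / len(truths_for_label)


# Made-up labels for eight citation statements (not real Scite data).
predicted = ["supporting"] * 5 + ["mentioning"] * 3
actual = ["supporting"] * 4 + ["mentioning"] * 4

supporting_precision = precision(predicted, actual, "supporting")
mentioning_precision = precision(predicted, actual, "mentioning")
print(supporting_precision, mentioning_precision)
```

In this toy data, four of the five statements labeled “supporting” really are supporting, so precision for that category is 0.8, the same order of magnitude Nicholson quotes for the supporting and contradicting classes.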
The screen shot below of one of the records in Scite shows what an amazing advance over existing citation databases Scite truly represents.
NICHOLSON: A SERIAL INVENTOR
Scite is not Nicholson’s first venture. While still a doctoral student at Virginia Tech, he developed an Open Access online scholarly publishing platform called Winnower, which one reviewer called “an essential tool for open research. It has elevated open notebook science to unprecedented levels. The Winnower is a game changer in academic publication.” After just two years in existence, Winnower was acquired by Authorea, another significant collaborative reading/writing/publishing platform. Nicholson went along with the purchase, becoming Authorea’s Chief Research Officer. Not bad for a biologist just two years post-grad school!
Clearly we are seeing a new generation of engaged researchers and entrepreneurs who are working to change the process of research forever. In Part 2 of this series, we focus on the example of Joshua Nicholson, a talented biologist who exemplifies this new generation that grew up with the internet, cares deeply about the future of research and is creatively changing the world of research.
Nancy K. Herther is Anthropology/Sociology Librarian at the University of Minnesota, Twin Cities campus.