The growing amount of digitally available research data and publications enables researchers to search and analyse these sources with specialised software, a practice known as text and data mining (TDM). TDM techniques are already of great importance in some research fields (such as bio-genetics and linguistics), and interest in these technologies is growing rapidly.
Authors are usually obliged to transfer their copyrights before publication, and as a result the scientific community also relinquishes control over the way in which its publications are used. Thus far it has not been possible to freely mine legally accessed content made available by academic publishers. This obstructs science itself, including the distribution of scientific knowledge beyond the scientific community. It also impedes the use of TDM by private parties, especially SMEs, depriving them of the ability to explore new market possibilities. This ultimately hinders innovation.
- Reform copyright legislation to allow text and data mining for academic purposes and preferably also for societal and commercial purposes for users who already have legal access to content.
- Encourage researchers not to transfer the copyright on their research outputs before publication.
- European Commission: put forward proposals for copyright reform during 2016, so as to facilitate the use of TDM for academic purposes and preferably also for societal and commercial purposes.
- National authorities, national parliaments, European Commission, Council and European Parliament: adopt and implement rules and legislation that make TDM easier for academic purposes and preferably also for societal and commercial purposes.
- Research funders and Research Performing Organisations: actively stimulate authors to retain control over their research output, including articles and books. This can be achieved by setting preconditions for funding and by introducing licensing systems.
- Publishers: allow text and data mining for users of their content who already have legal access, and expose content in a structured and machine-actionable way.
Expected positive effects
- Broader uptake of new analysis techniques, especially in the area of big data;
- Reduced costs for scientific work in the area of TDM/big data;
- Advancement of science, new solutions for societal challenges and more innovation.