


The problem

The growing amount of digitally available research data and publications enables researchers to search and analyse these sources with the help of special software (text and data mining, TDM). The use of TDM techniques is already of great importance in some research fields (such as bio-genetics and linguistics) and interest in these technologies is growing rapidly.

Usually, authors are obliged to transfer their copyrights before publication, as a result of which the scientific community also relinquishes control over the way in which its publications are used. It has not been possible thus far to mine freely in legally accessed content made available by academic publishers. This obstructs science itself, including the distribution of scientific knowledge beyond the scientific community, and also impedes the use of TDM by private parties, especially SMEs, depriving them of the ability to explore new market possibilities. This ultimately hinders innovation.


The solution

  • Reform copyright legislation to allow text and data mining for academic purposes and preferably also for societal and commercial purposes for users who already have legal access to content. 
  • Encourage researchers not to transfer the copyright on their research outputs before publication. 


Concrete actions

  • European Commission: put forward proposals for copyright reform during 2016, so as to facilitate the use of TDM for academic purposes and preferably also for societal and commercial purposes.
  • National authorities, national parliaments, European Commission, Council and European Parliament: adopt and implement rules and legislation that make TDM easier for academic purposes and preferably also for societal and commercial purposes.
  • Research funders and Research Performing Organisations: actively stimulate authors to retain control over their research output, including articles and books. This can be achieved by setting preconditions for funding and by introducing licensing systems. 
  • Publishers: allow text and data mining for users of their content who already have legal access, and expose content in a structured and machine actionable way. 


Expected positive effects

  • Broader uptake of new analysis techniques, especially in the area of big data;
  • Reduced costs for scientific work in the area of TDM/big data;
  • Advancement of science, new solutions for societal challenges and more innovation.

11 Comments

  1. Anonymous

    The participants of the innovation track of the Presidency Conference on Open Science provided their feedback on the Amsterdam Call for Action on Open Science via https://trello.com/b/gjLeuMuM/amsterdam-call-for-action-on-open-science-draft. On Action item 2, the following remarks were made: 

    • TDM also refers to data, web and social media content.
    • Take 'preferably also for societal and commercial purposes' out
    • Add libraries as stakeholders: Libraries: raise awareness, support
    • Change "Make TDM of content easier" to "legalise TDM for all purposes". 
    • Change "for academic purposes and preferably also for societal + comm. purposes" to "for all purposes". 
    • After "made available by academic publishers" add "copyright barriers to TDM exist for all legally accessed content not just that made available by publishers. Transaction costs of licensing are prohibitive, therefore an EU-wide copyright exception for TDM for all purposes is needed".  
    • Change "proposals for copyright reform" to "A proposal for a mandatory, harmonised exception for TDM for all purposes"    
    • Clarify TDM is not copyright protected usage 
    • We can't promote citizen science if we limit TDM legalization to academics/institutions
    • A problem with action 2 from citizen science and communications science perspective: It assumes that all TDM is performed on content "made available by academic publishers". This is only a part of the problem. If e.g. I want to mine social media or blogs in a linguistic study, I have the same copyright issues.

    On behalf of our participants, Marina Noordegraaf

     

  2. Anonymous

    LIBER /Kristiina Hormia-Poutanen

    The Solution

    Reform copyright legislation to allow text and data mining for academic purposes and for societal and commercial purposes for users who already have legal access to content. A mandatory and non-overridable copyright exception for text and data mining (TDM) for both commercial and non-commercial activity is needed. By modernising European copyright laws to support TDM, researchers will be enabled to make new discoveries and, in turn, to help drive science, competitiveness and innovation.

    • Encourage researchers not to transfer the copyright on their research outputs before publication. 
    • TDM can be made easier by putting local support services in place and providing public infrastructure for TDM.

    Concrete actions

    LIBER and research libraries: Advocate for copyright reform at the European Commission and Parliament and at national level. Raise awareness of TDM among researchers, students and librarians, and run EU projects to show examples of the benefits of TDM, including in public-private collaborative research.

  3. Anonymous

    Re: 2nd solution:

    If this is a solution, it is not only a solution to the TDM issue.

    Putting this recommendation here looks odd!

     

    Georg Botz

  4. Anonymous

    Re: expected positive effects (2nd item):

    This seems to imply that currently costs are the main barrier to TDM.

     

    Georg Botz

  5. What is ignored here is the issue of third-party copyrights held by non-profit organisations (such as museums and archives, publishers of literary works) that rely partly (and in some cases quite substantially) on the income generated by these rights.

    Opening up scientific outputs that contain this kind of copyrighted material for text and data mining creates a problem; combining such an opening up with open access to the publications would deprive these institutions of a source of income they rely on.

    So, any reform of copyright legislation that does not acknowledge this reality would be detrimental for these organisations; a proper solution would require their funders (national, regional and local authorities) to make up for the loss of income.

     

    Martin Stokhof, on behalf of the Open Access Working Group of the European Research Council

     

    Proposed changes:

     

    The problem

    The growing amount of digitally available research data and publications enables researchers to search and analyse these sources with the help of special software (text and data mining, TDM). The use of TDM techniques is already of great importance in some research fields (such as bio-genetics and linguistics) and interest in these technologies is growing rapidly.

    Usually, authors are obliged to transfer their copyrights before publication, as a result of which the scientific community also relinquishes control over the way in which its publications are used. It has not been possible thus far to mine freely in legally accessed content made available by academic publishers. This obstructs science itself, including the distribution of scientific knowledge beyond the scientific community, and also impedes the use of TDM by private parties, especially SMEs, depriving them of the ability to explore new market possibilities. This ultimately hinders innovation.

    Another problem that needs to be solved is that of copyrights held by third parties that are non-profit organisations (such as museums and archives) or small businesses (such as literary publishers) that rely partly, and in some cases substantially, on the income generated by such rights. Opening up scientific outputs that contain this kind of copyrighted material for text and data mining creates a problem; combining such an opening up with open access to publications would destroy a source of income these institutions rely on.


    The solution

    • Reform copyright legislation to allow text and data mining for academic purposes and preferably also for societal and commercial purposes for users who already have legal access to content. 
    • Ensure that any negative effects on non-profit organisations are offset by appropriate measures regarding their funding structures.
    • Take appropriate measures to offset the negative effects for small businesses.
    • Encourage researchers not to transfer the copyright on their research outputs before publication. 

    Concrete actions

    • European Commission: put forward proposals for copyright reform during 2016, so as to facilitate the use of TDM for academic purposes and preferably also for societal and commercial purposes.
    • National authorities, national parliaments, European Commission, Council and European Parliament: adopt and implement rules and legislation that make TDM easier for academic purposes and preferably also for societal and commercial purposes. Take appropriate measures to offset the negative effects for non-profit organisations and small businesses.
    • Research funders and Research Performing Organisations: actively stimulate authors to retain control over their research output, including articles and books. This can be achieved by setting preconditions for funding and by introducing licensing systems. 
    • Publishers: allow text and data mining for users of their content who already have legal access, and expose content in a structured and machine actionable way. 

    Expected positive effects

    • Broader uptake of new analysis techniques, especially in the area of big data;
    • Reduced costs for scientific work in the area of TDM/big data;
    • Advancement of science, new solutions for societal challenges and more innovation.

     

     

  6. Anonymous

    I'd like to suggest that the concrete action be made clearer, as follows:

    Concrete actions

    European Commission: put forward proposals for copyright reform during 2016, so as to enable anyone who has lawful access to content to text and data mine that content for all purposes, including commercial use.

    Robert Kiley, Wellcome Trust

  7. Anonymous

    Response on behalf of Creative Commons Europe : 

    We know that several organisations responding to the consultation on the review of the EU copyright rules have called for the adoption of a specific exception for text and data mining.

    Another view holds that text and data mining activities should be considered outside the purview of copyright altogether, therefore the introduction of an exception would not be required to engage in TDM. In other words, TDM should be considered as an extension of the right to read (“the right to read is the right to mine”). However, as others have pointed out, the fact that the InfoSoc and Database directives have not been implemented uniformly across all Member States indicates a need to adopt a pan-European exception in order to provide clarity to those wishing to conduct TDM.

    If this group decides to recommend an exception for TDM, such an exception should cover any purpose, not just “for academic purposes.” A TDM exception should explicitly permit commercial activity. In addition, terms of use, contractual obligations, and/or TPM (Technological Protection Measures)  imposed by scholarly journal publishers can be a barrier to text and data mining. These mechanisms that attempt to prohibit the lawful right to conduct TDM on content accessed legitimately should be forbidden. The exception adopted in the UK explicitly states that TDM cannot be limited by contractual terms.


    Gwen Franck, Regional Coordinator Europe
  8. Anonymous

    Text and data mining should be possible for all legally accessible data, for all purposes.

    Inge Van Nieuwerburgh for VLIR, the Flemish Interuniversity Council

  9. Anonymous

    Michael Matlosz, President Science Europe:

    The problem has been perfectly identified and described, and the solution clearly has to allow content mining (including text and data mining, TDM) for commercial and non-commercial scientific research purposes. This requires changes to outdated copyright rules.

    Researchers should never have to give up copyright of their research outputs, neither before nor after publication. I strongly suggest rewording the description of the solution to reflect this.

    A tangible action for public research organisations is to make it a minimum standard requirement that authors retain copyright of their work, in line with the Science Europe principles on Open Access (http://www.scienceeurope.org/uploads/PublicDocumentsAndSpeeches/WGs_docs/SE_POA_Pos_Statement_WEB_FINAL_20150617.pdf).

    As legal certainty is a key issue with regard to content mining, I recommend being very precise in the wording and removing words such as ‘preferably’ that may lead to ambiguous interpretation.

    Finally, the FAIR principles should be explicitly mentioned here and a reference made to section 5  where they are discussed in detail.

  10. Anonymous

    Natalia Manola on behalf of OpenAIRE.

    Discovering new knowledge from publicly funded research is one of the spearheads of Open Science. TDM technologies, being ahead of the organizational and legal barriers, must be freely shared and applied by all.

    Some additional points on the proposed actions and the community comments (which already address the copyright reform, but do not address some of the technical aspects):

    Solution:

    • TDM must be enabled for all, not only for research purposes. If an exception is a first step towards lifting imposed barriers, then make it a pan-European one, with clear rules among member states, and among all different types of stakeholders.
    • Lower technological and access barriers. TDM often requires big data technologies and specialized know-how, even if content becomes freely available. Encourage publicly governed hubs of TDM scientific content and services to be accessed freely by everyone. 

    Concrete actions:

    • Replace "publishers" by "data providers" as other data providers should be subject to this rule.
    • Reinforce content hubs for TDM on the national and European levels to lower the access barriers: build on existing repository e-Infrastructures (e.g., OpenAIRE or CORE for repositories, PMC Europe for OA publishers, emerging ones for data) to expose content in a structured and machine-actionable way.
    • Explore, assess and utilize TDM service-oriented e-Infrastructures (e.g., H2020 OpenMinted - www.openminted.eu) to lower technological barriers for all users.
  11. Anonymous

    We note that suitable open file formats should be considered in order to facilitate text and data mining. A means of citing mined data in scholarly work should also be developed, along with, where possible, a way of preserving a dynamic dataset in its mined state to aid reproducibility.

    Jisc commissioned work on text mining in 2012 (https://www.jisc.ac.uk/sites/default/files/value-text-mining.pdf) which identified the benefits of this form of research and informed the development of the text and data mining exemption in UK copyright law.

    We strongly support the actions in this area and note that the utility of the exception in the UK will be furthered by a cross-European exception. Some of the practical implementation issues are set out in this short briefing: https://www.jisc.ac.uk/guides/text-and-data-mining-copyright-exception

    David Kernohan, Jisc, UK