Dr. Becky Upton, president of the Pistoia Alliance
The life sciences industry is rapidly embracing AI: a recent survey from the Pistoia Alliance found that 83% of experts use generative AI in their workflows. This surge isn’t surprising given AI’s potential to accelerate processes across pharma and clinical research. From resource-intensive R&D processes such as lead identification and optimization to powering the analysis of complex genetic and real-world patient datasets, AI is increasingly recognized as a powerful tool to enhance human-driven work.
However, as AI adoption grows, so do regulatory and compliance concerns. A majority (85%) of respondents cited AI laws and regulations as the main challenges to adoption and cross-border collaboration. In particular, copyright rules are often overlooked or misunderstood. The accuracy and reliability of an AI model depend on it being trained with legally compliant data, responsibly sourced from a combination of sources. However, the licenses for these datasets do not always allow them to be used for AI training, which can create confusion and potential legal risk.
To ensure that AI is used safely, securely and compliantly, pharma and clinical research institutions need to work together to address the challenges of copyright and data sharing. To explore these challenges, we gathered experts from the Alliance’s membership to highlight key risks for responsible AI adoption and outline best practices.
Why copyright compliance is important
AI models are trained on a variety of scientific data sources. Babis Marmanis, PhD, Chief Technology Officer of the Copyright Clearance Center, explains: “Training and fine-tuning AI models in pharma R&D involves valuable scientific literature, and this use includes creating and storing copies of those documents. Often, with the appropriate prompt, part or all of an original work can be recalled by the end user, or derivative or very similar output can be created. Without a proper license, this poses a risk of copyright infringement.”
Our experts divided these risks into two broad categories: legal and reputational. Chris McSpiritt, vice president of Life Sciences Strategy at Domino Data Lab, warns that in addition to copyright infringement, patent infringement is a major concern. Drugs are patented based on their molecular structure. If an AI model proposes treatment candidates that competitors have already patented, because that information was unintentionally included in a third-party dataset, this could lead to inadvertent patent infringement. Without safeguards, companies may unknowingly invest in R&D for products that cannot legally be developed. McSpiritt points out that many organizations already have a process to check for patent conflicts, and that the introduction of AI reinforces the importance of proficiency in this area.
McSpiritt also cites reputational risks, noting that pharma and clinical organizations rely on the trust of industry partners and the public. If a company is found to have used copyrighted, patented, or confidential material without permission, it could face backlash, lose opportunities for future collaboration, and undermine public confidence in its treatments.
Overall, ensuring compliance and using high-quality, curated datasets from the start not only prevents legal and reputational challenges but also improves AI-driven research findings, as Marmanis points out.
AI and Copyright: Three Key Issues
The Pistoia Alliance study identified three copyright-related hurdles that need to be addressed to ensure that AI is used in life sciences in compliant and responsible ways.
First, there is significant regulatory complexity, as AI copyright laws continue to evolve and vary widely across regions. For example, the EU AI Act introduced a copyright transparency obligation, which requires AI model providers to publish a detailed summary of the content used to train their models. These regulations aim to increase transparency, but they add to the already complex web of compliance requirements for pharmaceutical and clinical research institutions working across borders. Furthermore, many countries do not have a legal framework that explicitly defines whether AI-generated output can be copyrighted. The US Copyright Office, for example, has stated that works created solely by AI without human authorship are not eligible for copyright protection. This creates even more uncertainty: how can companies protect AI-driven work if AI-generated content cannot be protected by copyright?
Second, there is confusion about when copyright applies. Of the life sciences experts polled, 42% do not consider copyright before sharing or using third-party information with AI tools, a gap that can lead to a risk of breach. Graeme King, VP and Head of Data & AI Governance at Amino Data, added that many organizations also do not consider the confidentiality rules related to clinical trial reports and drug performance data.
Third, and finally, a lack of expertise remains a barrier, with only 40% of organizations reporting that they have a dedicated team or expert focused on AI copyright compliance. While expert knowledge of data licensing, text mining rights, copyright and IP law is becoming essential, these skills are difficult to acquire and competition to hire such professionals is fierce.
Navigating Copyright Compliance: 4 Tips from Experts
Experts shared the following advice for organizations to navigate the complexities of evolving copyright and AI regulations:
Know the copyright rules for your data and data sources: “It’s important that organizations use well-licensed AI models and data sources,” says Will Fitzhugh, senior consultant at BPRescient. King said examples of pharma data that could present copyright challenges include literature and patent data, published research papers, conference proceedings, and post-market surveillance data. In clinical research, his examples include electronic health records, imaging libraries, clinical trial results, and real-world evidence.
Invest in expertise: Fitzhugh also points out the importance of consulting with legal teams throughout the development and deployment of AI models. He said, “Good practices should include ensuring that the data used for training is not restricted from that use and that the base model is properly licensed.”
Establish a framework: McSpiritt notes the importance of mitigating risk through governance frameworks and audits of source datasets and models. He says these should be added on top of the processes that organizations already have in place for legal and copyright issues.
Secure and test AI models: Fitzhugh argues that fine-tuning of all AI models should be carried out in a secure local or cloud environment. Rigorous testing helps to validate the output, evaluate the performance of the model, and ensure reliability.
Promote innovation through responsible AI
AI and copyright compliance are complex and evolving challenges that require interdisciplinary collaboration between data scientists, pharmaceutical and clinical professionals, and regulatory and legal experts. To ensure researchers can use AI responsibly and effectively, the industry must work together to develop best practice guides, harmonize data standards, and invest in training on copyright compliance. Those interested in navigating regulatory uncertainty, reducing copyright risks, and unlocking the full potential of AI are invited to join the Pistoia Alliance AI community.
About the author:
Dr. Becky Upton was appointed the first female president of the Pistoia Alliance in June 2022. She is a longtime advocate for pre-competitive collaboration in life science and healthcare research and development, and is passionate about diversity in STEM, as it plays an important role in advancing science. Becky is responsible for leading the Pistoia Alliance strategy and defining its future in the areas most important to the industry, including data standards, emerging technologies, diversity and inclusion, sustainability, and precision medicine. Becky holds a PhD in Biochemistry from Imperial College and an MBA from Cranfield University.