Fostering Trust in Clinical Trials: The Power of Voluntary Data Sharing

In the ever-evolving landscape of medical research, a quiet revolution is transforming how we approach clinical trials. Voluntary data sharing is emerging as a powerful strategy that promises to increase transparency in medical science, make research more efficient, and ultimately improve patient outcomes.

Clinical Data Sharing: Regulatory Required vs Voluntary

Regulatory clinical data sharing, mandated by policies like EMA Policy 0070 and Health Canada PRCI, requires pharmaceutical companies to anonymize and redact clinical data in documents before public disclosure, ensuring patient privacy while promoting transparency. These regulations establish strict guidelines for data protection, submission timelines, and compliance measures, making them a non-negotiable aspect of regulatory approval processes. In contrast, voluntary clinical data-sharing initiatives led by clinical trial sponsors are driven by commitments to scientific collaboration, innovation, and trust-building.

Unlike regulatory mandates, voluntary sharing allows sponsors greater flexibility in determining what data to share, with whom, and under what conditions. While both approaches aim to advance medical research and enhance transparency, regulatory policies impose standardized, enforceable frameworks, whereas voluntary initiatives have historically relied on industry best practices and ethical considerations.

The Benefits of Voluntary Clinical Data Sharing & Kudos to Those Leading the Way

Voluntary clinical data sharing is a powerful force driving medical innovation, scientific collaboration, and patient trust. Unlike mandated disclosures, voluntary initiatives allow sponsors to proactively share insights, fostering transparency and accelerating the development of new treatments. By enabling independent researchers to analyze trial data, these efforts can lead to new discoveries, validate findings, and even uncover potential safety signals earlier. This open exchange strengthens public confidence in clinical research, demonstrating a commitment to ethical responsibility beyond regulatory requirements.

Kudos to the forward-thinking organizations and sponsors who embrace voluntary data sharing! Their leadership not only enhances scientific progress but also sets a gold standard for integrity and trust in the industry. By choosing to go beyond compliance and put patient-centered research first, they are shaping the future of healthcare for the better.

The Untapped Potential of Clinical Trial Data

Every clinical trial represents a significant investment of time, resources, and human participation. Traditionally, these studies were viewed as isolated research efforts, with data typically used to answer a single primary research question. However, this approach leaves tremendous potential unexplored. Each clinical trial contains a wealth of information that could provide insights far beyond its original scope.

The motivations for data sharing are multifaceted and compelling:

Advancing Scientific Knowledge

Individual participant-level data can be a goldmine for researchers. By making this data available, scientists can:

Conduct comprehensive meta-analyses
Develop more sophisticated statistical methods
Design more targeted future clinical trials
Explore research questions not originally anticipated in the initial study design

Ethical Considerations and Participant Contributions

Most clinical trial participants volunteer with a profound hope: to contribute to medical knowledge and potentially help future patients. When data remains siloed, this noble intention is only partially realized. Data sharing ensures that each participant's contribution has the maximum possible impact.

Research Efficiency and Innovation

Data sharing eliminates redundant research efforts. Instead of repeatedly conducting similar studies, researchers can build upon existing knowledge, accelerating scientific discovery and reducing unnecessary resource expenditure.

Increasing Transparency and Trust

In an era of increasing skepticism towards scientific research, data sharing represents a powerful tool for rebuilding public trust. By opening up research processes, the scientific community demonstrates commitment to accountability and transparency.

Key Considerations for Implementing a Data Sharing Program

Organizations looking to develop robust data sharing initiatives should consider:

Policy Development

Clearly define which studies will be shared
Establish transparent criteria for data access
Identify potential exceptions (e.g., language barriers, anonymization challenges)

Data Package Components

A comprehensive data sharing package typically includes:

Final study protocol with amendments
Detailed data dictionary
Statistical analysis plan
Clinical study report summary
Anonymized individual patient-level data
Anonymization report

Practical Challenges and Solutions

While data sharing offers immense potential, it's not without challenges:

Protecting patient privacy
Managing complex legal agreements
Ensuring data quality and consistency
Developing robust technological infrastructure

Sharing platforms like Vivli and solution providers like Real Life Sciences have emerged to address these challenges. Together, they provide an end-to-end solution for clinical trial sponsors looking to share their data.

The Broader Impact

Voluntary data sharing is more than a technical process—it's a cultural shift in medical research. By embracing this approach, we:

Maximize the value of every research dollar
Respect and maximize clinical trial participants' contributions
Accelerate medical innovation
Build public confidence in scientific research

Looking Ahead

As technology advances and collaborative research models become more sophisticated, data sharing will become the norm rather than the exception. Emerging technologies like advanced anonymization techniques and secure data platforms will continue to lower barriers to meaningful data exchange. The power of voluntary data sharing extends far beyond individual studies. It represents a fundamental reimagining of how medical research can create value—not just for individual researchers or institutions, but for global health and human understanding.

By breaking down silos, promoting transparency, and treating each clinical trial participant's contribution with the utmost respect, we can unlock unprecedented potential in medical research. The future of clinical trials is collaborative, transparent, and driven by a shared commitment to advancing human health. To learn more about technologies to safely share data, visit rlsciences.com.

Unlocking the Power of Quantitative Anonymization for Clinical Trial Data

In the ever-evolving landscape of clinical research, the need for transparency and data sharing has become paramount. As regulatory bodies like Health Canada and the European Medicines Agency (EMA) continue to emphasize the disclosure of clinical trial data through regulation and policy, sponsors are faced with the critical challenge of anonymizing information while preserving its utility. This delicate balance is at the heart of any research team’s decision process between qualitative and quantitative approaches to data anonymization.

The Limitations of Qualitative Anonymization

Traditionally, the qualitative approach has been the go-to method for clinical trial data transparency. This approach relies on the application of static and subjective rules to redact personal data found within documents such as Clinical Study Reports (CSRs), Protocols and Statistical Analysis Plans. Although this method appears straightforward, it may not fully meet the increasing demands for transparency and data utility. In addition, the risk of re-identifying participants is unknown and cannot be measured.

The qualitative approach is inherently subjective, with decisions made based on the contextual review and judgment of the individuals involved. This can lead to inconsistencies and a lack of measurable outcomes, making it challenging to satisfy the requirements of regulatory bodies. Moreover, the heavy reliance on redaction in the qualitative methodology can result in significant information loss, limiting the value and usability of the anonymized data.

The Rise of Quantitative Anonymization

In contrast, the quantitative approach to clinical trial data anonymization offers a more sophisticated and data-driven solution. This empirical methodology leverages statistical analysis and privacy models to anonymize data while preserving as much utility as possible.

At the heart of the quantitative approach is the definition of a risk threshold, which serves as a measurable target of acceptable risk of re-identification for the anonymization process. By applying privacy models like k-anonymity, the quantitative method groups participants based on similar characteristics, ensuring any one individual is not distinguishable from others within a dataset.
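To make the idea concrete, here is a minimal sketch of how equivalence classes and a risk threshold relate under k-anonymity. It assumes one common convention: the worst-case risk of re-identification is 1 divided by the size of the smallest equivalence class. The function names, the field names, the toy records, and the threshold value are illustrative assumptions, not taken from any particular tool or regulation.

```python
from collections import Counter

def equivalence_classes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def max_reidentification_risk(records, quasi_identifiers):
    """Worst-case risk (assumed convention): 1 / size of the smallest class."""
    classes = equivalence_classes(records, quasi_identifiers)
    return 1 / min(classes.values())

# Made-up participants for illustration only (not real trial data).
participants = [
    {"age_band": "40-49", "sex": "F", "country": "CA"},
    {"age_band": "40-49", "sex": "F", "country": "CA"},
    {"age_band": "40-49", "sex": "F", "country": "CA"},
    {"age_band": "50-59", "sex": "M", "country": "CA"},
    {"age_band": "50-59", "sex": "M", "country": "CA"},
]

RISK_THRESHOLD = 0.5  # illustrative value chosen to suit this tiny dataset
quasi_identifiers = ["age_band", "sex", "country"]

risk = max_reidentification_risk(participants, quasi_identifiers)
meets_threshold = risk <= RISK_THRESHOLD
print(f"max risk = {risk:.2f}, meets threshold: {meets_threshold}")
```

In this toy dataset the smallest class holds two participants, so the worst-case risk is 1/2. A real project would use a far stricter threshold and a much larger dataset.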

The advantages of this approach are manifold. Firstly, the quantitative methodology provides a clear and measurable risk of re-identification, a crucial requirement for health authorities that are increasingly favoring this more empirical approach. This level of transparency and accountability resonates with regulatory bodies and demonstrates the sponsor's commitment to patient privacy.

Secondly, the quantitative approach aims to strike a delicate balance between data utility and privacy protection. By leveraging advanced anonymization techniques, such as pseudonymization, generalization, and categorical suppression, the quantitative method can transform the data in a way that preserves its analytical value while still safeguarding individual confidentiality.

Managers within clinical trial sponsors often prefer the quantitative methodology because its benefits are empirical and measurable.

The 6-Step Quantitative Anonymization Process

1. Defining the Privacy Model and Risk Threshold: The first step involves establishing the framework for the anonymization, including the selection of a privacy model (e.g., k-anonymity) and the definition of a risk threshold (e.g., 9% risk of re-identification).
2. Determining the Reference Population: Sponsors must decide whether to use the study population or a larger, similar reference population to enhance the anonymization process. The reference population can help reduce the equivalence class size required within the study data, allowing for more granular data transformations while still adhering to the risk threshold.
3. Applying Anonymization Techniques: The quantitative approach tailors the anonymization techniques to the specific data types. This may include pseudonymizing subject IDs, generalizing age into hierarchical bands, and applying categorical suppression for variables like country.
4. Evaluating Anonymization Rules and Data Utility: The sponsor must prioritize the preservation of data utility while ensuring that the anonymization rules adhere to the defined risk threshold. This may involve filtering anonymization options based on information loss or applying suppression limits to balance data utility and privacy protection.
5. Analyzing Adverse Events: Adverse events are a critical component of clinical trials, and the quantitative approach recognizes their importance. A specialized process should be implemented to ensure the retention of clinically relevant adverse events, even if they do not meet the strict statistical criteria.
6. Assessing Final Residual Risk: The final step involves analyzing the total residual risk and ensuring that the results meet the required metrics for the anonymization report. This comprehensive assessment provides a clear understanding of the remaining risk, allowing sponsors to make informed decisions and satisfy regulatory requirements.
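The techniques named in the "Applying Anonymization Techniques" step can be sketched in a few lines. This is a simplified illustration under stated assumptions: the keyed hash used for pseudonymization, the ten-year age bands, the minimum count for suppression, and all function and field names are hypothetical choices, not a description of how any specific product works.

```python
import hashlib
import hmac
from collections import Counter

def pseudonymize_id(subject_id: str, secret: bytes) -> str:
    """Replace a subject ID with a stable code so records remain linkable."""
    digest = hmac.new(secret, subject_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "SUBJ-" + digest[:8].upper()

def generalize_age(age: int, band_width: int = 10) -> str:
    """Generalize an exact age into a band such as '40-49'."""
    low = (age // band_width) * band_width
    return f"{low}-{low + band_width - 1}"

def suppress_rare_categories(values, min_count, placeholder="SUPPRESSED"):
    """Suppress categorical values that occur fewer than min_count times."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else placeholder for v in values]

def anonymize_records(records, secret, min_country_count=2):
    """Apply all three techniques to a list of participant records."""
    countries = suppress_rare_categories(
        [r["country"] for r in records], min_country_count
    )
    return [
        {
            "subject": pseudonymize_id(r["subject_id"], secret),
            "age_band": generalize_age(r["age"]),
            "country": country,
        }
        for r, country in zip(records, countries)
    ]

# Made-up records for illustration only.
records = [
    {"subject_id": "1001-001", "age": 47, "country": "CA"},
    {"subject_id": "1001-002", "age": 42, "country": "CA"},
    {"subject_id": "1001-003", "age": 58, "country": "FR"},
]
anonymized = anonymize_records(records, secret=b"demo-key-not-for-production")
for row in anonymized:
    print(row)
```

After a transformation like this, the residual risk would be measured again on the transformed quasi-identifiers and compared with the threshold defined in the first step.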

The Role of Technology and Automation

A key advantage of the quantitative approach is its reliance on technology and automation. Rather than manually applying redaction rules, sponsors can leverage specialized software like RLS Protect to perform the complex statistical analysis, configure anonymization scenarios, apply the anonymization techniques throughout the clinical documents and generate the required anonymization reports as expected by the health authority.

This level of automation not only streamlines the process but also enhances its repeatability and scalability, which are crucial considerations as sponsors navigate an increasing number of transparency-related projects in support of their R&D pipelines. By offloading the heavy lifting of data transformation and risk assessment to purpose-built software, sponsors can focus on the strategic aspects of the anonymization process and ensure that the final results meet regulatory requirements while preserving as much data utility as possible. It also frees their internal teams to focus on critical path activities.

The integration of technology also introduces an element of consistency and objectivity that can be challenging to achieve with a purely manual, qualitative approach. The automated tools apply the defined anonymization techniques and risk thresholds systematically, reducing the potential for human error or subjective decision-making that can undermine the integrity of the anonymized data.

A Comprehensive Guide to Applying Qualitative Methodology in Clinical Trials Data Anonymization

In today’s data-driven landscape, the demand for transparency and the exchange of clinical trial data has grown exponentially. While this shift opens doors to more robust research and collaboration, it also presents unique challenges in safeguarding the privacy and confidentiality of trial participants and commercially confidential data. Balancing the need to protect individual privacy while retaining the clinical value of shared data is critical. One approach to navigate this challenge is through data anonymization, specifically using the qualitative methodology.

In this in-depth guide, we will explore the nuances of qualitative anonymization in clinical trials, covering key principles, best practices, and critical considerations to help you apply it effectively. The goal is to help researchers strike the delicate balance between patient re-identification risks and retaining the utility of clinical trial data.

What is Qualitative Anonymization?

Before diving into the application of qualitative anonymization, it’s essential to understand what it entails. Unlike quantitative anonymization, which relies on measurable statistical analysis to ensure data anonymity and preservation of data utility, qualitative anonymization is based on a combination of a set of rules, judgment, expert knowledge, and a case-by-case review of sensitive information. This method introduces subjectivity, meaning researchers must apply a flexible and context-driven approach to protect participant data.

The goal of data anonymization is twofold:

To minimize the risk of participant re-identification.
To preserve the utility of the data for meaningful clinical insights.

The qualitative anonymization process involves defining rules for handling personally identifiable information (PII) and other sensitive data points within clinical trial documents. Given that no statistical models are used in the qualitative approach, the effectiveness largely depends on human expertise, manual review, and contextual understanding.

Key Considerations When Applying Qualitative Anonymization

A well-executed qualitative anonymization process begins with a firm understanding of several core considerations. These guiding principles ensure that data is anonymized appropriately while still retaining its clinical value. Below are the five key considerations to keep in mind:

1. Contextual Judgment

In qualitative anonymization, contextual judgment is critical. Unlike quantitative methods, which rely on automated algorithms or statistical models, qualitative anonymization involves subjectivity. This means researchers must make informed decisions on what data to anonymize, retain, or generalize based on the context of the trial.

Each clinical trial is unique. The identifiers in one study may not pose the same risks as in another. For example, a trial focused on a rare disease could make even minor personal details highly identifying, whereas the same information might pose less risk in a more common disease setting.

Researchers must ensure that the anonymization rules they apply are tailored to each trial, identifying sensitive data and making informed decisions about how to handle it. Contextual judgment helps protect participant privacy while retaining relevant data that contributes to the study’s overall integrity.

2. Manual Review

One of the hallmarks of qualitative anonymization is the reliance on manual review. While automated systems can help identify and classify personal data, the ultimate decision whether to redact or retain potentially sensitive information will always be a manual process.

Manual review is particularly important for high-focus sections of clinical trial documents, such as patient narratives, aggregate-level data, or personal contact information. These sections often contain intricate details that may inadvertently lead to re-identification if not properly anonymized. Conducting a detailed review ensures that identifiers are not overlooked and that any retained data is purposefully kept, rather than being missed.

3. Expert Knowledge Redaction

Subject matter experts (SMEs) play a crucial role in qualitative anonymization. These individuals must have a deep understanding of the clinical trial, the study design, and the data in question. Their knowledge allows them to make well-informed decisions about what data to redact, retain, or transform.

SMEs are responsible for ensuring that sensitive data points are handled correctly and that the anonymization process is both effective and compliant with regulatory guidelines. They also help identify high-priority areas that require special attention, such as adverse events or unique medical histories that might pose a higher re-identification risk.

4. Redaction vs. Transformation

A critical decision in the anonymization process is determining when to redact data and when to transform it. Redaction involves completely removing identifiable information, while transformation refers to replacing it with more generalized or abstract categories.

For example, instead of removing all geographical information, researchers might transform "United States" into the broader category of "North America." Similarly, for gender-specific trials, "Female" might be retained in the dataset for clarity.

These decisions are made based on trial-specific factors, such as whether the information has already been publicly disclosed (e.g., on ClinicalTrials.gov), whether it is a single-race or single-gender study, and how critical the data is for the study’s integrity. The choice between redaction and transformation has a significant impact on the balance between protecting participant privacy and preserving the utility of the data.

Further, the process of anonymizing the data is more complex than straight redaction. Purpose-built software solutions may be needed to accomplish this, especially for large projects that involve anonymizing hundreds, and commonly thousands, of pages of sensitive participant information.

5. Iteration and Validation

Given the subjectivity and human element involved in qualitative anonymization, it’s vital to approach the process iteratively. This means applying multiple rounds of review and validation to ensure that the anonymization rules are consistently applied and that no sensitive data has been overlooked.

Iteration allows researchers to revisit the rules they initially defined and adjust them based on findings from the manual review process. This ongoing validation ensures that anonymization is effective, while also ensuring consistency across different datasets and study documents.

Defining Anonymization Rules

Once the key considerations are understood, the next step is to define specific rules for anonymization. These rules are not static and may evolve as the trial progresses or as new data becomes available. Researchers often revisit and refine these rules periodically to ensure they remain relevant and effective.

Below is an example of how anonymization rules are applied to specific data categories:

Participant ID: Direct identifiers like participant IDs are pseudonymized. Pseudonymization replaces a real identifier with a code, which allows for linkage across study documents without revealing the participant’s identity. This retains the utility of the data while preventing re-identification.
Contact Details: Personal contact information is typically redacted to ensure that participants cannot be re-identified through their contact details.
Gender: Depending on the study, gender information may either be retained (e.g., in single-gender studies) or redacted in cases where it poses a re-identification risk.
Dates: Rather than redacting dates, suppression techniques may be used. This means replacing specific dates with more general terms like "Date," which maintains context for the reader while protecting the participant’s privacy.
Medical History: Medical history is generally redacted unless it is directly related to the study indication or the adverse event profile. This allows for the retention of clinically relevant data while ensuring participant confidentiality.

Each research team or organization will need to decide what anonymization or redaction rules to apply.
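As an illustration, the example rules above could be written down as a small lookup table and applied to one record. This is only a sketch: the field names, the action names, the "[REDACTED]" marker, the default of redacting unknown fields, and the idea of passing in a set of study-relevant terms are all assumptions made for the example. In practice these decisions come from reviewers and subject matter experts, not from code alone.

```python
# Hypothetical rule table mirroring the example categories above.
RULES = {
    "participant_id": "pseudonymize",
    "contact_details": "redact",
    "gender": "retain",  # e.g., a single-gender study
    "date": "date_placeholder",
    "medical_history": "redact_unless_relevant",
}

REDACTED = "[REDACTED]"

def apply_rules(record, rules, pseudonyms, relevant_terms):
    """Apply one rule per field; fields without a rule are redacted (assumption)."""
    result = {}
    for field, value in record.items():
        action = rules.get(field, "redact")
        if action == "retain":
            result[field] = value
        elif action == "pseudonymize":
            result[field] = pseudonyms[value]
        elif action == "date_placeholder":
            result[field] = "Date"
        elif action == "redact_unless_relevant":
            keep = value.lower() in relevant_terms
            result[field] = value if keep else REDACTED
        else:
            result[field] = REDACTED
    return result

# Made-up record for illustration only.
record = {
    "participant_id": "1001-004",
    "contact_details": "555-0100",
    "gender": "Female",
    "date": "2021-03-14",
    "medical_history": "Type 2 diabetes",
}
anonymized = apply_rules(
    record,
    RULES,
    pseudonyms={"1001-004": "SUBJ-0001"},
    relevant_terms={"type 2 diabetes"},  # judged relevant to the study indication
)
print(anonymized)
```

Keeping the rules in one table makes it easier to revisit and refine them between review rounds, which matters because the rules are expected to evolve.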

Anonymization of Adverse Events: A High-Priority Consideration

One of the most critical elements in qualitative anonymization is the disclosure and protection of adverse event data. Adverse event data is often prioritized by regulatory bodies, meaning that even in heavily redacted or suppressed datasets, adverse events should be disclosed wherever possible.

Regulatory agencies emphasize the importance of adverse event retention because of its impact on understanding the safety profile of a drug. However, qualitative methodologies must strike a careful balance to avoid inadvertently exposing sensitive participant information.

There are two main strategies for dealing with adverse events:

Selective Retention: Researchers can identify rare, sensitive and observable adverse events and review them within the context of the study. If these events are relevant to the drug’s safety profile or the trial indication, they may be retained. Otherwise, they may be anonymized or generalized to a higher-level group term.
Complete Retention: In certain cases, all adverse events are retained, despite the potential risk of re-identification. This approach requires careful consideration, as retaining all adverse event data is likely to increase the risk of participant re-identification.

Contextual Review in Anonymization of Adverse Events

Contextual review is a key component of qualitative anonymization, particularly when it comes to assessing adverse events. The context in which a term appears can determine whether it is retained, generalized, or redacted.

For example, in a diabetes study, an adverse event like "amputation of the left foot" may be retained because it is relevant to the disease being studied. In contrast, in a non-psychiatric trial, a term like "schizophrenia" might be generalized to "psych disorder" if it is unrelated to the study drug or trial indication.

Contextual review allows researchers to make more informed decisions about how to handle specific data points, ensuring that the data remains useful without compromising participant privacy.
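The two examples above can be expressed as a small decision helper. This is a sketch only: whether a term is relevant to the study is a human judgment, so the code assumes reviewers have already recorded that judgment as a set of relevant terms. The mapping table, the function name, and the fallback to redaction are hypothetical.

```python
# Hypothetical mapping from a specific term to a higher-level term.
GENERALIZATIONS = {"schizophrenia": "psych disorder"}

def contextual_decision(term, relevant_terms):
    """Return (action, text) for one adverse event term."""
    key = term.lower()
    if key in relevant_terms:
        return ("retain", term)
    if key in GENERALIZATIONS:
        return ("generalize", GENERALIZATIONS[key])
    return ("redact", "[REDACTED]")  # assumed fallback when no mapping exists

# Diabetes study: the amputation is relevant to the disease being studied.
diabetes_relevant = {"amputation of the left foot"}
print(contextual_decision("Amputation of the left foot", diabetes_relevant))

# Non-psychiatric trial: schizophrenia is unrelated, so it is generalized.
print(contextual_decision("Schizophrenia", diabetes_relevant))
```

The same term can therefore produce different outcomes in different trials, which is the point of contextual review.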

Best Practices for Successful Qualitative Anonymization

To ensure the success of a qualitative anonymization strategy, the following best practices should be followed:

Frequent Iteration: Because qualitative anonymization is subjective, multiple rounds of review are essential. This allows researchers to revisit their rules and refine them as needed to ensure consistency and effectiveness.
Expert Involvement: SMEs are crucial to the success of qualitative anonymization. Their knowledge of the trial and its data ensures that anonymization is applied correctly and in compliance with regulatory requirements.
Balancing Redaction and Data Utility: Over-redaction can strip a dataset of its clinical value, while under-redaction can expose participants to re-identification risks. Researchers must carefully balance these competing priorities to ensure that the data remains both secure and useful.
Regulatory Compliance: It’s critical to adhere to regulatory guidelines when applying qualitative anonymization. This includes understanding the requirements of agencies like Health Canada, the FDA, and the European Medicines Agency (EMA), all of which have specific standards for data anonymization.

Conclusion

Qualitative anonymization offers a flexible and adaptable approach to data protection in clinical trials. While it requires more manual effort and subjective judgment than quantitative methods, its flexibility allows researchers to tailor anonymization practices to the unique characteristics of each trial, should they choose to do so.

By following best practices—such as thorough manual reviews, leveraging subject matter expertise, and applying a context-specific approach—researchers can minimize the risk of participant re-identification. The iterative nature of qualitative anonymization ensures that any sensitive information is adequately protected while allowing for adjustments and improvements in the anonymization strategy over time. This is especially important in high-stakes areas like adverse event data, where careful balance is needed between maintaining data integrity and ensuring privacy.

Additionally, a successful qualitative anonymization process must maintain compliance with global regulatory standards, such as those set by the FDA, EMA, or Health Canada. Regular audits, validations, and updates to anonymization protocols help ensure the data remains both compliant and usable for ongoing research efforts.

Qualitative anonymization can support compliance with data protection requirements. However, it often comes with a challenge: preserving data utility while remaining within acceptable risk thresholds. This balance has been known to lead to excessive redaction. Understanding the true risk of re-identification is also difficult, if not impossible, because the resulting anonymized data is not statistically assessed. In addition, neither the information loss nor the resulting data utility is analyzed, which makes the value of the anonymized data unknowable. Quantitative anonymization, on the other hand, provides clear and measurable criteria for achieving a defined risk threshold while preserving the highest possible level of data utility. This highlights the significant differences between the two methodologies.

Ultimately, qualitative anonymization can empower researchers to share clinical trial data, contribute to the advancement of science and protect the privacy of participants. By applying thoughtful, context-driven anonymization techniques, clinical trial data can be disseminated more widely, fostering collaboration and driving innovation in medical research without compromising individual confidentiality.

Before selecting an anonymization approach for your clinical data, we recommend understanding the similarities and differences between qualitative and quantitative methodologies so that you can make an informed choice. Real Life Sciences provides comprehensive services and software solutions for both qualitative and quantitative anonymization. For inquiries or to discuss potential projects, please reach out to us at inquiry@rlsciences.com.

Navigating Adverse Event and Medical History Clinical Data Anonymization

In the ever-evolving landscape of clinical trials, managing data responsibly while adhering to regulatory and company-defined disclosure requirements is crucial. This post will explore a best-in-class methodology to assess and anonymize Adverse Event (AE) and Medical History (MH) data.

Understanding Adverse Event Assessment

The Objective

At the heart of Adverse Event assessment is the goal to accurately evaluate and report AEs while minimizing the risk of patient re-identification. Achieving this requires a systematic process that incorporates both quantitative and qualitative methods.

A Hybrid Approach

At RLS, we employ a hybrid approach to navigate the complexities of AE assessment. This method integrates quantitative models—which provide metrics, automation, and efficiency—with qualitative insights that derive from domain knowledge and the specific context of the trial and the participant narrative itself. The quantitative component primarily focuses on risk-based anonymization assessments applicable to participant quasi-identifiers, which help us define equivalence classes for the data.

Evaluating similar trials allows a broader participant population to be used as the reference, which results in equivalence classes smaller than traditional classes. This allows us to confidently retain more data while protecting participant privacy. Another lever is the re-identification risk threshold, for example 9%, chosen according to how the data will be shared. By combining reference populations and risk thresholds with a qualitative review of the participant narratives found in clinical documents, we can make informed decisions about which AE data to retain.

The Assessment Process

The RLS assessment process unfolds in three key steps:

Automated AE Processing: The first step involves filtering AEs through the RLS internal Rare Sensitive and Observable database. This flags AEs that require further scrutiny, allowing us to focus on those that are truly rare and sensitive, which are the most important to evaluate closely.
Quantitative Assessment: In the second step, we apply the quantitative assessment results to retain AEs occurring 11 or more times, or occurring in classes or groups suggested by the k-sample. The k-sample value results from the analysis of quasi-identifiers, which creates smaller groupings that still adhere to the risk threshold.
Contextual Review: For AEs that appear fewer than four times, we conduct a detailed contextual review. This step assesses each remaining AE against the specific characteristics of the trial, such as the trial indication, the drug safety profile, and whether the event is observable, knowable or replicable and therefore re-identifying. This ensures that we make informed decisions about anonymization or retention based on clinical relevance.

Through this systematic process, we are able to maximize data utility while ensuring that the necessary safeguards and the risk of re-identification are fully understood before the data is submitted for regulatory and third-party review.
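The three steps can be pictured as a simple triage routine. The sketch below rests on several assumptions that should be read as such: the internal Rare Sensitive and Observable database is stood in for by a plain set called rso_terms, a single configurable count cut-off replaces the separate figures mentioned in the steps, the k-sample grouping is not modeled, and AEs not flagged in the first step are assumed to be retained. All names are hypothetical.

```python
from collections import Counter

def triage_adverse_events(ae_terms, rso_terms, retain_count=11):
    """Sort each distinct AE term into 'retain' or 'contextual_review'."""
    counts = Counter(term.lower() for term in ae_terms)
    decisions = {}
    for term, occurrences in counts.items():
        if term not in rso_terms:
            # Step 1: not flagged as rare, sensitive or observable.
            decisions[term] = "retain"
        elif occurrences >= retain_count:
            # Step 2: flagged, but frequent enough to retain.
            decisions[term] = "retain"
        else:
            # Step 3: flagged and infrequent, so a person reviews it in context.
            decisions[term] = "contextual_review"
    return decisions

# Made-up AE listing for illustration only.
ae_terms = (
    ["Headache"] * 3
    + ["Nausea"] * 12
    + ["Schizophrenia"] * 2
    + ["Burns"] * 11
)
rso_terms = {"schizophrenia", "burns"}  # stand-in for the internal database

decisions = triage_adverse_events(ae_terms, rso_terms)
for term, decision in sorted(decisions.items()):
    print(f"{term}: {decision}")
```

Terms routed to contextual review still need a human decision to retain, generalize, or redact. The code only narrows down which ones need that attention.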

Maximizing Data Utility

Balancing Re-Identification Risk and Usefulness of the Resulting Data

The overarching goal of our approach is to maximize the data utility of AEs. After all, AEs are a critical component for understanding the trial results and how they may affect secondary research efforts. Even if an event is rare or sensitive, it may still be crucial to the context of the study. For instance, if Schizophrenia appears among two participants as an AE in a trial focused on mental health, we prefer to retain this information due to its relevance, whereas other less pertinent AEs might be generalized to higher level terms.

We often visualize the filtering process as a funnel. Starting with the broad pool of all AEs that have occurred in a trial, we progressively narrow our focus through rigorous quantitative and qualitative assessment, ultimately retaining only those AEs that provide valuable insights while keeping the risk of re-identification within the desired threshold.

Examples of Contextual Review

To illustrate the significance of contextual review, consider three hypothetical AEs that occur fewer than four times within a study:

Schizophrenia: If associated with a study on psychiatric disorders, this AE might be retained due to its relevance.
Immunodeficiency Syndrome: If linked to one participant and viewed as potentially stigmatizing, it may be generalized to a higher level term to protect the individual’s identity.
Burns: This could be generalized to a system organ class because it is sensitive and potentially permanently observable, and therefore poses a higher risk of re-identification.

These examples emphasize that our recommended approach is not a one-size-fits-all solution; context plays a significant role in finalizing the determination to retain, generalize or possibly redact.

Anonymization of Medical History Data

While managing AEs is critical, anonymizing Medical History data presents its own set of challenges. It’s essential to evaluate the information based on specific criteria to determine whether it should be retained, redacted, generalized, or suppressed.

Key Considerations

We identified four crucial considerations when anonymizing medical history data:

Entities: Who does the Medical History reference pertain to? Is it participant-level, aggregate-level, or non-participant level (e.g., family history)? The answers to these questions will likely determine the approach taken to redact or retain per entity.
Rule Level: What specific decisions and anonymization strategies apply? When should the standard/global redaction rules apply, and when should project-specific rules apply? There needs to be clear understanding and alignment that project-specific rules take precedence over standard rules. For example, Nephrolithiasis may be a generally protected term, but if it appears in a Urinary Disorder narrative it will be retained because it is relevant to the context of that patient narrative section. It is therefore critical to understand the hierarchy of these rules, especially where they conflict, while considering the trial characteristics.
Conditional Evaluations: Are there specific circumstances affecting how we treat certain Medical History terms? This ensures the approach is flexible and tailored to individual cases.
Standards: Establishing a database of common terms helps us maintain consistency. This ensures that widely recognized terms that pose minimal risk of re-identification are retained.
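
The rule hierarchy described under Rule Level can be sketched as a two-level lookup in which a project-specific rule, when present, overrides the global one. The names `GLOBAL_RULES`, `resolve_action`, and the action labels are hypothetical, and the fallback of `"review"` for terms with no rule is an assumption made for the sketch.

```python
# Hypothetical global (standard) rules: term -> default action.
GLOBAL_RULES = {"nephrolithiasis": "redact"}


def resolve_action(term, project_rules, global_rules=GLOBAL_RULES, default="review"):
    """Pick the action for a Medical History term.

    Project-specific rules take precedence over global rules; terms with
    no rule at either level fall back to `default` (an assumption here).
    """
    key = term.lower()
    if key in project_rules:
        return project_rules[key]
    return global_rules.get(key, default)


# A urinary disorder study where Nephrolithiasis is contextually relevant.
urinary_project_rules = {"nephrolithiasis": "retain"}

print(resolve_action("Nephrolithiasis", urinary_project_rules))  # project rule wins
print(resolve_action("Nephrolithiasis", {}))                     # global rule applies
```

Keeping the two rule sets separate makes the precedence explicit and leaves the global defaults untouched when a single project needs an exception.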

Practical Applications

In practice, the approach to Medical History anonymization can vary significantly based on trial characteristics. For example:

Participant-Level Information: Generally, we recommend suppressing or redacting most Medical History unless it relates to inclusion criteria or safety profiles of the study.
Aggregate Information: In larger studies, we generally recommend all aggregate Medical History data is retained, but in smaller trials, careful assessments are necessary to avoid re-identification risks.

Contextual Factors

The context of each Medical History term is critical in determining the most appropriate anonymization technique to apply. For example, if Diabetes is mentioned in a context that links it directly to an Adverse Event, it may be retained. However, if it appears without context, it might be redacted to protect against risk of re-identification.

Conclusion

In summary, navigating the complexities of anonymizing and disclosing Adverse Event and Medical History data requires a thoughtful and structured methodology. By applying a hybrid method that integrates quantitative and qualitative assessments, data utility can be maximized while safeguarding patient privacy and adhering to a defined risk threshold. This careful balance is essential in the context of clinical trials, where ethical, regulatory, and company standards and policies must be met.

As we continue to refine our processes, we aim to contribute to a culture of transparency, trust and participant privacy in clinical research, ensuring that vital data is preserved and utilized responsibly for the benefit of all stakeholders involved.

Clinical Data Anonymization: How to Approach Voluntary Data Sharing

In today's digital age, data is often hailed as the new currency. Individuals and organizations alike generate and collect vast amounts of data every day. With this influx of data comes the opportunity for valuable insights, innovations, and advancements. However, it also raises concerns about privacy, security, and ethical use.

One way to harness the power of data while respecting privacy is through voluntary data sharing. This practice involves individuals or organizations willingly sharing their data for secondary research, analysis, or other purposes. Voluntary data sharing can foster collaboration, drive innovation, and contribute to the greater good. But what if you're not familiar with anonymization techniques? How can you navigate this landscape responsibly? Let's explore some key steps:

Understand the Risks and Benefits

Before diving into voluntary data sharing, it's crucial to understand the risks and benefits involved. Sharing data without proper anonymization can compromise privacy and confidentiality. However, anonymized data can still carry risks, especially when combined with other datasets or advanced re-identification techniques. On the other hand, sharing data can also lead to valuable insights, improved services, and societal benefits. Assessing these trade-offs is essential for making informed decisions.

Educate Yourself

While you may not be familiar with detailed anonymization techniques, you can educate yourself on the key concepts. Learn about common methods and the differences between qualitative and quantitative anonymization approaches. Understand the principles behind these techniques and their implications for data privacy. Real Life Sciences provides online resources, webinars, and materials to help you grasp the fundamentals of anonymization.

Below are four basic areas of data anonymization that are beneficial to know and understand:

1. Removing Direct Identifiers

Removing direct identifiers involves the elimination of explicit personal information from datasets, such as names, addresses, social security numbers, and phone numbers. By stripping away these identifiers, the risk of re-identification is significantly reduced. However, it's essential to recognize that even seemingly innocuous pieces of information can potentially lead to re-identification when combined with other data sources. Therefore, careful consideration must be given to the context and sensitivity of the data being shared.
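
As a minimal sketch of this idea, the snippet below drops a fixed set of direct-identifier fields from a record. The field names and the `drop_direct_identifiers` helper are hypothetical; a real dataset would need its own identifier list, and removing these fields alone does not address the indirect identifiers mentioned above.

```python
# Hypothetical field names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "address", "ssn", "phone"}


def drop_direct_identifiers(record, identifiers=DIRECT_IDENTIFIERS):
    """Return a copy of the record without any direct-identifier fields."""
    return {field: value for field, value in record.items()
            if field not in identifiers}


record = {"name": "Jane Doe", "phone": "555-0100", "age": 34, "diagnosis": "Asthma"}
print(drop_direct_identifiers(record))
```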

2. Pseudonymization

Pseudonymization is a technique that replaces identifiable information with artificial identifiers or pseudonyms. Unlike anonymization, which irreversibly removes identifying information, pseudonymization allows data to be linked back to individuals using additional information held separately (e.g., a key or lookup table). While pseudonymization offers a level of privacy protection, it's crucial to implement robust security measures to safeguard the link between pseudonyms and original identities.
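
A rough sketch of pseudonymization with a separately held key might look like the following. The `pseudonymize` function and the `P-` prefix are hypothetical choices for illustration; in practice the returned lookup table would be stored under strict access controls, apart from the shared data.

```python
import secrets


def pseudonymize(subject_ids):
    """Replace each subject ID with a random pseudonym.

    Returns the pseudonymized list and the lookup table (the "key"),
    which must be stored separately and securely.
    """
    key = {}
    used = set()
    pseudonyms = []
    for sid in subject_ids:
        if sid not in key:
            candidate = "P-" + secrets.token_hex(4)
            while candidate in used:  # guard against an unlikely collision
                candidate = "P-" + secrets.token_hex(4)
            used.add(candidate)
            key[sid] = candidate
        pseudonyms.append(key[sid])
    return pseudonyms, key


pseudonyms, key = pseudonymize(["S001", "S002", "S001"])
print(pseudonyms)  # the same subject always receives the same pseudonym
```

Because the pseudonyms are random rather than derived from the original IDs, they cannot be reversed without the lookup table.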

3. Generalization

Generalization involves aggregating or summarizing data to a higher level of abstraction, thereby reducing the granularity of the information while preserving its utility for analysis. For example, instead of recording exact ages, data may be grouped into non-overlapping age brackets (e.g., 20-29, 30-39). While generalization helps protect individual privacy by minimizing the risk of re-identification, it can also introduce information loss and reduce the precision of analysis. Therefore, careful consideration must be given to the trade-off between privacy protection and data utility when applying generalization techniques.
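
The age example can be sketched in a few lines. The `age_bracket` helper and its ten-year default width are assumptions for illustration; the brackets are non-overlapping so that every age maps to exactly one group.

```python
def age_bracket(age, width=10):
    """Generalize an exact age into a non-overlapping bracket such as '30-39'."""
    if age < 0:
        raise ValueError("age must be non-negative")
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


print([age_bracket(a) for a in (23, 30, 39, 41)])
```

A wider bracket lowers re-identification risk further but discards more detail, which is the trade-off described above.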

4. Data Masking

Data Masking techniques involve altering or obscuring data values to prevent the identification of individuals while preserving statistical properties of the dataset. For example, data may be masked by replacing precise values with ranges or by adding random noise to numerical values. While these techniques help protect privacy, they can also introduce distortions that affect the accuracy and reliability of analysis. Therefore, it's essential to carefully assess the impact of masking on data quality and analytical outcomes.
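
One of the masking approaches mentioned here, adding random noise to numerical values, can be sketched as follows. The `add_noise` helper, the uniform noise distribution, and the `scale` parameter are assumptions for illustration rather than a recommended configuration.

```python
import random


def add_noise(values, scale=1.0, seed=None):
    """Perturb each value by uniform noise in [-scale, +scale].

    The noise is centered on zero, so the mean of a large dataset is roughly
    preserved, while individual values no longer match the originals.
    """
    rng = random.Random(seed)
    return [v + rng.uniform(-scale, scale) for v in values]


weights_kg = [61.2, 75.0, 88.4]
print(add_noise(weights_kg, scale=0.5, seed=42))
```

The size of `scale` controls the trade-off: larger noise gives more protection but distorts the analysis more.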

Seek Expert Guidance

If you're uncertain about anonymization, always seek expert guidance. Reach out to professionals with expertise in data privacy, security, and ethics. They can provide valuable insights, recommend best practices, and help you navigate potential pitfalls. Collaborating with experts can enhance the effectiveness and reliability of your data sharing efforts.

Prioritize Transparency and Consent

Transparency and consent are fundamental principles of ethical data sharing. Clearly communicate the purpose of data sharing, how the data will be used, and any associated risks. Obtain explicit consent from individuals before sharing their data, ensuring they understand and agree to the terms. Transparency and consent build trust, promote accountability, and uphold individuals' rights over their data.

Consider Alternatives

If anonymization seems daunting or impractical, consider alternative approaches to data sharing. For example, you could aggregate data to remove personally identifiable information while still preserving valuable insights. Alternatively, you could establish data sharing agreements that restrict the use of sensitive information and prioritize privacy protection. Exploring different options allows you to find the approach that best balances utility and privacy. However, Data Anonymization is usually the most effective and reliable way to share data.

Embrace Continuous Learning

Data privacy and anonymization are complex and evolving fields. Embrace a mindset of continuous learning and improvement. Stay updated on the latest developments, techniques, and regulations related to data privacy. Engage with the broader transparency community through forums, conferences, and networking events. By staying informed and adaptable, you can effectively navigate the challenges of voluntary data sharing.

Conclusion

Voluntary data sharing offers tremendous potential for driving innovation and societal progress. However, it also poses challenges, particularly for those unfamiliar with anonymization techniques. By understanding the risks and benefits, educating yourself, seeking expert guidance, prioritizing transparency and consent, considering alternatives, and embracing continuous learning, you can navigate voluntary data sharing responsibly and ethically. Together, we can harness the power of data while respecting privacy and empowering individuals.

Clinical Data Sharing: Why It's Important To Share Clinical Results

Clinical data sharing plays a crucial role in advancing scientific knowledge and improving patient care.

Transparency and Scientific Advancement: Data sharing allows researchers to access clinical trial data, fostering transparency and enabling scientific advancements. Pharmaceutical companies have recognized this importance and committed to sharing participant-level data, study-level data, and protocols¹.
Reducing Research Waste: By sharing data, we can avoid duplicative efforts and reduce research waste. Access to existing clinical trial data helps researchers build upon previous work, leading to more efficient resource utilization.
Patient-Centered Care: Data sharing contributes to better decision-making for patients and healthcare professionals. When data is accessible, clinicians can make informed choices based on evidence from a broader pool of studies.
Challenges and Commitment: Implementing good data sharing principles requires resources, time, and commitment. Despite challenges, enhancing data sharing remains essential for scientific collaboration and patient well-being.

This particular blog focuses on advancing research: how sharing participant-level data with qualified researchers can accelerate novel research and inform the design of future clinical trials. It also covers some examples of successful sharing initiatives and outlines common challenges associated with data sharing.

Transparency and Scientific Advancement

Transparency and Open Science:
- Clinical trial data, including participant-level data, study protocols, and results, are valuable resources. When researchers share this information openly, it promotes transparency.
- Transparency is essential for scientific advancement. By allowing others to scrutinize and build upon existing research, we accelerate progress.
- Open science practices, such as pre-registering studies and sharing raw data, enhance credibility and reproducibility.
Collaboration and Replication:
- Data sharing encourages collaboration among researchers, institutions, and disciplines. It fosters a sense of community and collective effort.
- Replication studies become feasible when data is accessible. Replicating findings ensures robustness and confirms or challenges initial results.
Pooling Data for Insights:
- Combining data from multiple studies (meta-analysis) provides statistical power. Researchers can draw more accurate conclusions about treatment efficacy, safety, and adverse events.
- For rare diseases or conditions, pooling data across trials is especially valuable.
Challenges and Solutions:
- Challenges include privacy concerns, data governance, and intellectual property. Striking a balance between openness and protecting sensitive information is crucial.
- Initiatives like the FAIR principles (Findable, Accessible, Interoperable, Reusable) guide responsible data sharing.
- The Clinical Research Data Sharing Alliance (CRDSA) is a multi-stakeholder consortium that serves the clinical data sharing ecosystem with a shared goal of accelerating the discovery and delivery of life-saving and life-changing therapies to patients by expanding the research value of the data collected through the clinical development process.

Some examples of successful clinical data sharing initiatives include:

PhRMA/EFPIA Commitment (2013):
- In 2013, the Pharmaceutical Research and Manufacturers of America (PhRMA) and the European Federation of Pharmaceutical Industries and Associations (EFPIA) endorsed a commitment to share participant-level data, study-level data, and protocols from clinical trials of US and EU registered medicines with qualified researchers. They also aimed to provide public access to clinical study reports (CSRs) or at least synopses from trials submitted to the FDA, EMA, and EU Member States².
Advancements in Drug Development:
- Successful data sharing has led to tools that optimize drug development. These initiatives help bring new therapies to patients in need by accelerating the research process and improving decision-making³.
Clinical Research Data Sharing Alliance (CRDSA):
- The CRDSA focuses on sharing patient data generated from clinical trials. By doing so, they transform the trial process itself, improve the patient experience, and deliver life-saving and life-changing therapies faster and at a lower cost to society⁴.

Data sharing comes with certain challenges that are often faced by researchers.

Cultural Shift:
- Challenge: Establishing a culture where data sharing is the norm requires overcoming traditional practices and concerns.
- Recommendation: Stakeholders should foster an environment where data sharing is expected and commit to responsible strategies⁵.
Timelines and Compliance:
- Challenge: Meeting deadlines for sharing various types of clinical trial data (e.g., full analyzable datasets, metadata, and analytic datasets) can be demanding.
- Recommendation: Sponsors and investigators should adhere to specified timelines for data sharing⁶.
Sensitive Data Risks:
- Challenge: Sharing sensitive clinical trial data while protecting privacy and managing risks.
- Recommendation: Implement operational strategies like data use agreements, independent review panels, and transparency⁷.
Organizing and Presenting Data:
- Challenge: Researchers struggle with organizing data in a useful and presentable manner.
- Recommendation: Providing clarity on copyright, licensing, repository options, and metadata standards can help⁸.

Remember, addressing these challenges is essential for advancing scientific knowledge and benefiting patients.

Real Life Sciences specializes in data anonymization for regulatory compliance and voluntary sharing.

Risk-Based Clinical Data Anonymization, EMA Policy 0070, and Health Canada’s PRCI: A Comprehensive Overview

Introduction

In the era of digital health, the importance of clinical data cannot be overstated. However, the need for data privacy and protection is equally paramount. This has led to the development of risk-based clinical data anonymization strategies, regulatory policies like the European Medicines Agency’s (EMA) Clinical Data Publication (CDP) Policy 0070, and initiatives like Health Canada’s Public Release of Clinical Information (PRCI).

Risk-Based Clinical Data Anonymization

Risk-based clinical data anonymization is a strategy that measures the probability of re-identifying individuals (in this case, subjects that have participated in a clinical trial) through indirectly-identifying pieces of information. This probability is then reduced through various data transformations, such as offsetting dates, generalizing disease classifications or demographic values, or removing outlier values. The goal is to balance the need for data utility and the requirement for privacy.

Why EMA and Health Canada Prefer Risk-Based Anonymization

Balance: Risk-based anonymization allows for a more nuanced approach that balances privacy protection with the need for transparency and research. Traditional methods often remove too much data, hindering its usefulness for research purposes.

Minimizing Data Loss: By assessing the risk of re-identification for each data attribute, risk-based approaches can retain more valuable information while still protecting privacy. This allows for more comprehensive analysis and better insights.

Adaptability: The risk of re-identification can vary depending on the context and available information. Risk-based methods can adapt to these changing factors, ensuring appropriate protection in different scenarios.

Compared to other techniques, risk-based anonymization provides a more sophisticated and balanced approach to protecting privacy while enabling valuable research and data sharing in the life sciences industry. This aligns with the goals of regulators like EMA and Health Canada to promote public health and transparency while upholding individual privacy rights.

Both EMA and Health Canada have specific guidelines and regulations outlining their expectations for risk-based anonymization. This ensures consistency and accountability.

Key Considerations of Risk Based Anonymization

Risk Threshold

The risk threshold in the context of clinical data anonymization is defined as the minimum amount of de-identification that must be applied to a dataset for it to be considered de-identified.

In more practical terms, it refers to the probability of correctly assigning an identity to a participant (or clinical trial subject) described in the clinical reports. This is also referred to as the probability of re-identification.

For instance, both the European Medicines Agency (EMA) and Health Canada have set an acceptable probability threshold at 0.09. This means that the likelihood of re-identifying an individual from the anonymized data should be less than 9 in 100 for the data to be considered sufficiently anonymized.

The number of data attributes in the dataset requiring anonymization depends on the dataset’s risk score. Higher risk scores mean more fields must be anonymized. The goal is to ensure that the probability of re-identification is very small, thereby protecting the privacy of individuals while still allowing the data to be useful for research purposes.
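
To make the threshold concrete, the sketch below measures risk as 1 divided by the size of the smallest group of records sharing the same quasi-identifier values. Equating risk with this quantity is a common simplification and an assumption here, not a method the text prescribes; the function names and field names are hypothetical.

```python
from collections import Counter


def max_reidentification_risk(records, quasi_identifiers):
    """Worst-case risk: 1 / size of the smallest group of matching records."""
    if not records:
        raise ValueError("records must not be empty")
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1 / min(groups.values())


def meets_threshold(records, quasi_identifiers, threshold=0.09):
    """True when the worst-case risk is below the threshold."""
    return max_reidentification_risk(records, quasi_identifiers) < threshold


cohort = [{"age_band": "30-39", "sex": "F"} for _ in range(12)]
print(meets_threshold(cohort, ["age_band", "sex"]))   # 1/12 is below 0.09

cohort.append({"age_band": "80-89", "sex": "M"})      # a unique participant
print(meets_threshold(cohort, ["age_band", "sex"]))   # risk is now 1.0
```

A single unique combination of attributes is enough to push the worst-case risk to 1.0, which is why outliers are typically generalized or suppressed.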

The risk threshold in clinical data anonymization is determined based on several factors:

Data Disclosure Precedents and Industry Benchmarks: The risk threshold is often set based on historical data disclosure precedents and industry benchmarks.
Regulatory Guidance: Regulatory authorities such as the European Medicines Agency (EMA) and Health Canada provide guidance on acceptable risk thresholds.
Risk Assessment: Anonymization requires a risk assessment to a predetermined threshold (often 0.09) to determine the probability of re-identification of a clinical trial subject.
Dataset Characteristics: The number of data attributes in the dataset requiring anonymization depends on the dataset’s risk score. Higher risk scores mean more attributes must be anonymized.
Sources of Re-identification Risk: Factors such as the number of participants, whether the trial is in a rare disease, subjective assessment of potential socioeconomic harm to patients if there is re-identification, and the perceived re-identification risk of certain pieces of information (whether they would be knowable by potential adversaries) are considered.

Determining the risk threshold is a complex process that involves considering various factors, including industry standards, regulatory guidance, and the specific characteristics and risks associated with the dataset.

Clinical Data Utility

Data utility in the context of clinical data anonymization refers to the usefulness of the data after it has been anonymized. The goal of risk-based anonymization is to protect the privacy of individuals in a quantifiable manner, but it’s equally important to ensure that the anonymized data remains useful for research purposes.

Preserving data utility during the anonymization process involves quantitative measurements at the document/data level and a well-defined and precise implementation of the selected rules to prevent over-redaction or over-anonymization.

For instance, pseudonymization, which replaces identifiers with a pseudonym, retains more data utility than anonymization, which may involve redacting or masking identifiers. This is because pseudonymization allows for meaningful secondary analyses and follow-on research while maintaining patient confidentiality.

In summary, data utility is a critical aspect of data anonymization. It ensures that the anonymized data can still provide valuable insights and contribute to scientific research, public health, and other secondary purposes.

Methods of Risk Based Clinical Data Anonymization

Clinical data anonymization involves various techniques to ensure the privacy of individuals while maintaining the utility of the data for research purposes. Here are some commonly used methods:

Generalization: Specific values are categorized into groups or ranges. For example, exact ages might be replaced with age groups, and countries might be grouped into continents.
Suppression or Redaction: This involves removing or redacting sensitive attributes entirely.
Masking: Parts of the data are replaced with symbols like (*, $, #).
Date Offsetting: This involves altering an identifiable date related to an individual and applying an alternative or random date throughout the data or document(s). To maintain usefulness of the data, offset dates maintain the same duration between events as compared to the original dates.
Recoding: Categories of a variable are recoded into broader categories.
Local Suppression: Specific values of a variable are suppressed.

These techniques can be used individually or in combination, depending on the specific requirements of the data set and the level of anonymization required.
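
Date offsetting, as described in the list above, can be sketched by shifting all of a participant's dates by the same random number of days so that the intervals between events are unchanged. The `offset_dates` helper, the backward shift direction, and the 365-day maximum are assumptions for illustration.

```python
import random
from datetime import date, timedelta


def offset_dates(dates, max_shift_days=365, seed=None):
    """Shift every date for one participant by the same random offset.

    Using a single offset per participant preserves the duration
    between events while hiding the true calendar dates.
    """
    rng = random.Random(seed)
    shift = timedelta(days=rng.randint(1, max_shift_days))
    return [d - shift for d in dates]


visits = [date(2020, 1, 10), date(2020, 3, 1), date(2020, 6, 15)]
shifted = offset_dates(visits, seed=7)
print(shifted[1] - shifted[0] == visits[1] - visits[0])  # intervals are preserved
```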

Evaluating and Selecting a Specialized Risk Based Anonymization Partner

Choosing the right company for risk-based anonymization of clinical data is crucial, as it requires balancing utility with robust privacy protection. These are the principles Real Life Sciences is built upon. Here are some key considerations:

Expertise and experience:

Specific experience with clinical data: Look for companies with proven experience in handling sensitive clinical information and understanding its complexities. Familiarity with relevant regulations such as CDP/Policy 0070 and PRCI is essential.
Track record of anonymization methods: Evaluate their expertise in applying various anonymization techniques and their suitability for your specific data and goals.
Understanding of risk assessment: Ensure they have a solid understanding of risk-based approaches and can tailor the anonymization strategy to your specific risk tolerance and data utility needs.

Technology and infrastructure:

Security and compliance: Verify their security measures meet industry standards and regulatory requirements. Look for certifications like ISO 27001 for information security management and a QMS for comprehensive quality processes.
Data anonymization tools and algorithms: Assess the robustness and effectiveness of their chosen anonymization tools and algorithms.
Data handling and processing: Understand their data storage, access controls, and destruction procedures. Ensure they align with your security and privacy policies.

Company reputation and ethics:

Customer testimonials and references: Seek feedback from past customers, especially those in the life sciences industry about their experience and satisfaction.
Transparency and communication: Evaluate their willingness to discuss their approach, answer questions, and address your concerns openly and honestly.
Ethical considerations: Confirm their commitment to ethical data handling practices and alignment with data privacy principles.

Conclusion

Risk-based clinical data anonymization, EMA Policy 0070, and Health Canada’s PRCI are all significant strides towards a future where clinical data is both accessible and secure. These initiatives not only foster transparency and trust but also pave the way for innovation and advancement in clinical research.

While these initiatives are a step in the right direction, it is crucial to continue refining these strategies to ensure the balance between data accessibility and privacy is maintained. As we move forward, the focus should be on developing robust, scalable, and efficient methods for data anonymization and public release, keeping in mind the ever-evolving landscape of digital health and data privacy regulations.

When implementing a risk-based anonymization approach, engage with experts such as Real Life Sciences for assistance. This will accelerate your project and increase the likelihood that it is delivered on time and to a high standard of quality.

Unlocking the Secrets: Navigating Company Confidential Information (CCI) in Clinical Trial Transparency

In the realm of clinical trials, transparency is not just a buzzword but a fundamental principle guiding the advancement of medical science. However, amidst the push for transparency across regulators such as Health Canada, the FDA and EMA, one significant challenge looms large: handling company confidential information (known also as confidential business information (CBI) when referring to Health Canada).

What exactly constitutes company confidential information in the context of clinical trials, why is it so challenging to manage, and what potential solutions exist to strike a balance between transparency and protecting sensitive data?

Defining Company Confidential Information

Before delving into the complexities, it's crucial to establish what falls under the umbrella of company confidential information in clinical trials. Simply put, it encompasses proprietary data that clinical trial sponsors consider integral to maintaining a competitive edge. This could include:

Intellectual Property: Details of proprietary technologies, formulations, or processes integral to drug development.
Financial Information: Data pertaining to budgets, expenditures, and revenue projections.
Clinical Data: Preliminary findings that are yet to be disclosed to regulatory bodies or the public.
Trade Secrets: Any information giving a company a competitive advantage and not publicly known.
Commercial Strategies: Marketing plans, distribution channels, or pricing strategies crucial for market positioning.

For further clarification, EMA defines CCI as:

“...any information contained in the clinical trial information submitted to the CTIS which is not in the public domain, or publicly available, and where disclosure may undermine the legitimate economic interest or competitive position of the owner of the information”.

Examples of What is NOT Considered Confidential Information:

Information already in the public domain, including the sponsor website, registries, FDA, and/or scientific literature
General information such as a unit of measure, where the value may be CCI but the unit is not (e.g., 3.2mL → mL)
Countries where the study was conducted

The Tightrope Walk: Why It's Challenging

Balancing the imperative of transparency with the need to protect company confidential information is akin to walking a tightrope. Several factors contribute to the complexity of this task:

Regulatory Compliance: Regulatory bodies like Health Canada, FDA and EMA advocate for increased transparency in clinical trials. However, they also recognize the importance of safeguarding proprietary information.
Competitive Edge: Trial Sponsors invest billions in research and development. Disclosing sensitive data prematurely could undercut their competitive advantage and impede innovation.
Legal Ramifications: Unauthorized disclosure of confidential information could lead to legal repercussions, including breaches of contract or intellectual property disputes.
Data Security Concerns: With the rise in cyber threats, ensuring the security of sensitive data presents a formidable challenge.
Public Trust: Striking a balance between transparency and confidentiality is crucial for maintaining public trust. Overemphasis on secrecy can breed skepticism, while excessive transparency may compromise commercial interests.

Charting a Course: Potential Solutions

While navigating the labyrinth of confidentiality in clinical trials may seem daunting, several strategies can help reconcile competing interests:

Clear Guidelines: Establishing clear guidelines delineating what constitutes confidential information and how it should be handled fosters transparency while mitigating risks.
Transparency Protocols: Implementing robust transparency protocols ensures that only non-confidential data is disclosed to the public, protecting sensitive information.
Anonymization Techniques: Employing anonymization techniques such as risk-based anonymization of patient data allows for the release of trial results without compromising confidentiality.
Secure Data Infrastructure: Investing in secure data infrastructure fortified with encryption, access controls, and regular audits bolsters protection against unauthorized access.
Collaborative Approach: Encouraging collaboration between internal stakeholders, partners and vendors, and advocacy groups facilitates dialogue on transparency standards while addressing concerns about confidentiality.

Practical CCI Strategies & Considerations

Things to Know

Before redacting, Sponsors should be aware of the information already available in the public domain for their product’s development
The extent of the redaction should be limited only to the word(s), figure(s) and pieces of text that can be considered CCI
For CTR projects, EMA has indicated you may mark CCI in non-public version documents for purposes of Member State awareness (e.g., highlight the CCI text or apply a red box around its perimeter)
For EMA Policy 0070 projects, preparing a robust justification for your CCI redactions is critical to avoid unnecessary rejections. While justifications are not required for CTR public version documents, we recommend you create an internal justification log and store it in case it is needed in response to a Request for Information (RFI).

Limiting Company Confidential Information

Limit unnecessary sharing of confidential information during the authoring process, a practice often referred to as “minimization”
Involve Real Life Sciences, which brings relevant, practical experience and scientific and technical skills to the CCI identification process
Follow a consistent decision-making process and tag references to CCI during authoring

A Suggested Three Step Process for Handling CCI

Rule out information in the public domain
Confirm the information is innovative and could undermine the economic interest of the business
As an exception, determine if the information is not deemed to be innovative but could still undermine the economic interest or competitive position of the business
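
The three steps above can be read as a simple decision sequence. The sketch below is only an illustration of that ordering: the `Candidate` fields and the `assess_cci` function are hypothetical names, and each flag stands in for a judgment that a reviewer would make, not something software can decide on its own.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A piece of text being considered for CCI redaction (hypothetical model)."""
    text: str
    in_public_domain: bool
    is_innovative: bool
    could_undermine_economic_interest: bool


def assess_cci(c: Candidate) -> tuple[bool, str]:
    """Apply the three steps in order and return (is_cci, reason)."""
    # Step 1: information already in the public domain is ruled out.
    if c.in_public_domain:
        return False, "step 1: already in the public domain"
    # Step 2: innovative information that could undermine economic interest.
    if c.is_innovative and c.could_undermine_economic_interest:
        return True, "step 2: innovative and economically sensitive"
    # Step 3 (exception): not innovative, but still economically sensitive.
    if c.could_undermine_economic_interest:
        return True, "step 3: exception, not innovative but economically sensitive"
    return False, "not CCI: no economic harm identified"


if __name__ == "__main__":
    example = Candidate("assay validation detail", False, False, True)
    print(assess_cci(example))
```

Recording the reason alongside the decision is a deliberate choice here, since it gives a starting point for the justification log.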

Use Technology to Streamline your CCI Identification and Redaction Process

A purpose-built, enterprise-ready system, such as RLS Protect, is designed to support your clinical document redaction process from end to end. Example capabilities include, but are not limited to:

Efficiency and Quality

Single system to track the work tasks and redact the documents
Automated searching for key entities (e.g., dates, IDs, names, email addresses, CCI, BMI, race, gender…)
Customized and configured document- and project-specific searching, with the ability to re-use saved searches across projects and documents
Capture CCI decisions to apply to future documents
Share visibility to project and document status
Version control
Redaction reporting per document
Review, comment and revision
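
To give a rough sense of what automated searching for key entities involves, the following minimal sketch scans text with regular expressions. It does not describe how RLS Protect works. The patterns, the `SUBJ-###-####` subject ID format, and the `find_entities` function are all assumptions made for illustration.

```python
import re

# Hypothetical patterns; real documents need far broader coverage.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),          # ISO dates only
    "subject_id": re.compile(r"\bSUBJ-\d{3}-\d{4}\b"),     # assumed ID format
}


def find_entities(text: str) -> list[tuple[str, str, int, int]]:
    """Return (label, matched_text, start, end) tuples in document order."""
    hits = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((label, m.group(), m.start(), m.end()))
    return sorted(hits, key=lambda h: h[2])


if __name__ == "__main__":
    sample = ("Subject SUBJ-101-0007 was enrolled on 2023-09-14; "
              "contact jane.doe@example.com.")
    for label, value, start, end in find_entities(sample):
        print(f"{label:<11} {value!r} at {start}-{end}")
```

Keeping the patterns in a named dictionary mirrors the idea of saved searches that can be re-used across documents and projects.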

Collaboration & Visibility

Alerting and notification
Task assignment to individual user(s)
Commenting
CCI justifications
Review and approval workflow

Security

Permissions and roles
Auditing
Data encryption
Sanitize (removal of all document metadata)

Conclusion

In the quest for transparency in clinical trials, navigating the labyrinth of company confidential information is a formidable challenge. Yet, by adopting a balanced approach that safeguards sensitive data while promoting openness, stakeholders can forge a path toward greater transparency without compromising innovation or commercial interests. As the landscape evolves, maintaining a nuanced understanding of confidentiality will be paramount in unlocking the secrets to advancing medical science for the benefit of all.

The EMA Policy 0070 and Health Canada Public Release of Clinical Information (PRCI) Anonymization Report

How the New Anonymization Report Template Impacts Study Sponsors

Collaboration between Health Canada & EMA

While Health Canada PRCI and EMA Policy 0070 are distinct policies, Health Canada and EMA collaborate on harmonizing approaches to clinical trial transparency. In most circumstances, both recommend a 9% re-identification risk threshold for disclosure submissions and provide a unified Anonymization Report Template. Further, both share a similar submission scope, with a focus on ICH CTD/eCTD M2.5, M2.7 and M5.3.

What is the Anonymization Report Template?

The Anonymization Report Template is a crucial document used in these disclosure submissions. The report captures the methods you applied to anonymize the clinical data and the assessment of the risk of re-identification applicable to your submission.

Thoroughly completing and thoughtfully justifying your anonymization methods in the report is crucial for a smooth and successful EMA Policy 0070 submission. If you have any specific questions about the template or the anonymization process, consider contacting Real Life Sciences for more information.

Here's a breakdown of the key aspects of the anonymization template (key terms defined below):

Structure and Content:

The template follows a structured format with specific sections to address. Note the descriptions of the requested information for these sections are not exhaustive. Further, this template is now offered for both EMA Policy 0070 and Health Canada’s PRCI project submissions.

Methodology: Describes the specific anonymization methodology or strategy employed for the data within the submission package.
Identification of Data Variables: Indicate the direct and indirect identifiers found within the package. For each, indicate which anonymization technique was used, such as generalization, redaction, recoding, offset, retained, or other.
Risk Assessment: Indicate the initial risk of re-identification value, the risk threshold you have applied, and the residual risk value. The risk of re-identification evaluates the likelihood of individuals being re-identified based on the applied anonymization methods. This will involve a risk-based assessment of your data, as can be performed by Real Life Sciences.
Data Utility: List the variables with the highest data utility and describe how their utility was maintained.
Deviations: Indicate if there are any deviations in your approach for the CDP package as compared to the stated guidance provided by EMA.
Attestation: Confirm the accuracy of the information provided and that the anonymization techniques were applied consistently across all documents in the final package submission.
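
The anonymization techniques named in the template (generalization, offset, recoding, redaction) can be illustrated with a short sketch. This shows the general idea of each technique only. The function names, the 10-year age band, the offset size and the `P0001` pseudonym format are assumptions, not values prescribed by EMA or Health Canada.

```python
from datetime import date, timedelta


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: replace an exact age with a band, e.g. 47 -> '40-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def offset_date(d: date, days: int) -> date:
    """Offset: shift a date by a fixed number of days."""
    return d + timedelta(days=days)


def recode_id(subject_id: str, mapping: dict[str, str]) -> str:
    """Recoding: map an original ID to a stable pseudonym."""
    if subject_id not in mapping:
        mapping[subject_id] = f"P{len(mapping) + 1:04d}"
    return mapping[subject_id]


def redact(_value: str) -> str:
    """Redaction: remove the value entirely."""
    return "[REDACTED]"


if __name__ == "__main__":
    ids: dict[str, str] = {}
    print(generalize_age(47))                    # age band
    print(offset_date(date(2021, 3, 15), -10))   # shifted date
    print(recode_id("101-0007", ids))            # pseudonym
    print(redact("Dr. Example"))
```

Applying the same offset to all dates for a given subject keeps the intervals between that subject's events intact, which is one reason an offset can preserve more utility than redaction.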

Importance and Benefits:

The Anonymization Report plays a critical role in demonstrating compliance with EMA Policy 0070's data protection requirements.
It allows the EMA to assess the effectiveness of your anonymization efforts and ensure appropriate protection of participant privacy.
A well-prepared report can expedite the data publication process by addressing potential concerns upfront and minimizing the risk of delays or rejection.

Key terms

Risk of Reidentification

In the context of clinical trial transparency, the risk of re-identification refers to the probability of an adversary successfully identifying a specific individual who participated in the trial, even when their data has been anonymized or de-identified. This re-identification could potentially happen through a combination of factors, such as:

Unique data points: An individual's medical history, demographic information (e.g., age, race, gender), genetic markers, or even geographic location could be sufficiently unique to allow them to be singled out even in a large dataset.
External data linkage: Anonymized data from the clinical trial could be linked with other datasets, such as public records or social media, which contain additional information about individuals that could lead to their re-identification.
Statistical techniques: Advanced statistical techniques, such as machine learning algorithms, could be used to analyze the anonymized data and identify patterns that allow for the identification of specific individuals.
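
One common way to put a number on this risk is to group records that share the same combination of indirect identifiers and treat each record's risk as one divided by the size of its group. The sketch below follows that approach as an assumption. It is not necessarily the method used by EMA, Health Canada or Real Life Sciences, and the field names are hypothetical.

```python
from collections import Counter


def reidentification_risk(records: list[dict], quasi_identifiers: list[str]):
    """Return (maximum_risk, average_risk) across all records.

    Records sharing the same quasi-identifier values form one group;
    a record in a group of size n is assigned a risk of 1/n.
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    group_sizes = Counter(keys)
    maximum = 1 / min(group_sizes.values())
    # Averaging 1/n over every record equals (number of groups) / (number of records).
    average = len(group_sizes) / len(records)
    return maximum, average


if __name__ == "__main__":
    data = [
        {"age": 45, "sex": "F"},
        {"age": 45, "sex": "F"},
        {"age": 62, "sex": "M"},   # unique combination: risk of 1.0
        {"age": 45, "sex": "F"},
    ]
    print(reidentification_risk(data, ["age", "sex"]))
```

In this toy dataset the single 62-year-old male is unique, so the maximum risk is 1.0 even though most records sit in a group of three.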

Risk Threshold

Risk Threshold is the maximum acceptable probability of an individual being correctly re-identified from anonymized clinical trial data. It's a crucial measure for balancing data privacy and transparency.

Residual Risk

Residual risk is the remaining risk that individuals could be re-identified from a dataset, even after anonymization techniques have been applied. It's the risk that lingers even after safeguards are in place.
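
The relationship between initial risk, risk threshold and residual risk can be sketched with a toy calculation. The example assumes a threshold of 0.09 (the 9% figure), a group-size-based risk measure, and a synthetic list of ages. All names and the 20-year band width are illustrative choices, not a prescribed method.

```python
from collections import Counter

RISK_THRESHOLD = 0.09  # assumed: the 9% threshold expressed as a probability


def max_risk(values: list) -> float:
    """Maximum risk = 1 / size of the smallest group of identical values."""
    return 1 / min(Counter(values).values())


def age_band(age: int, width: int) -> str:
    """Generalize an exact age into a band of the given width."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


if __name__ == "__main__":
    ages = list(range(40, 52))            # 12 synthetic subjects, all distinct ages
    initial = max_risk(ages)              # every subject is unique before anonymization
    banded = [age_band(a, 20) for a in ages]
    residual = max_risk(banded)           # risk that remains after generalization
    print(f"initial risk:  {initial:.3f}")
    print(f"residual risk: {residual:.3f}")
    print("meets threshold:", residual <= RISK_THRESHOLD)
```

Here generalization moves all twelve subjects into a single band, so the residual risk falls to 1/12, which is below the assumed threshold. With fewer subjects per band the residual risk would stay above it and further anonymization would be needed.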

Data Utility

Data utility is the usefulness and accuracy of the anonymized data for its intended purpose. It essentially describes how well the anonymized dataset retains its value for analysis and insights, while still protecting the privacy of individuals.

Why a New Template?

In designing the new Anonymization Report Template (launched in 2023), EMA and Health Canada aimed to achieve the following benefits:

Improved predictability for Sponsors through use of a structured form
Improved writing efficiency for Sponsors
Structured drop down options provide improved quality and consistency
One consistent template applicable for EMA Policy 0070 and Health Canada PRCI submission projects
For regulators, the new template will provide additional efficiency through use of a standard and structured template
For readers, portal users and consumers of the anonymized information, the new template is more concise and will be easier to understand; it helps to clearly identify the most pertinent and useful information

Contact Real Life Sciences:

If you have questions or would like some consultation prior to starting your EMA Policy 0070 or Health Canada PRCI project, please contact us directly at inquiry@rlsciences.com.

What To Know Before Embarking On Your EMA Policy 0070 Submission Project

Making the Most of the European Medicines Agency (EMA) Policy 0070 Pre-Submission Meeting

Policy 0070 Background

EMA Policy 0070 (otherwise known as Clinical Data Publication, or CDP) is a policy implemented by the European Medicines Agency (EMA) that governs the publication of clinical data for medicinal products for human use. The policy aims to increase transparency and public access to clinical data while protecting patient confidentiality and commercially sensitive information.

History of CDP/Policy 0070

2014 Policy adopted
2015 Phase 1 implementation
2018 Policy on-hold due to Brexit and COVID (except COVID treatments)
2023 Relaunch initiates in September
2024 Anticipated relaunch scope expansion
Future Phase 2 implementation (date TBD)

Key Policy 0070 Terms

Committee for Medicinal Products for Human Use (CHMP): The scientific committee of the EMA that provides opinions on marketing authorization applications.

Marketing authorization application (MAA): The formal application submitted by a pharmaceutical company to obtain marketing authorization for a medicinal product.

Marketing authorization holder (MAH): The individual or company responsible for the marketing of a medicinal product.

Pre-submission Meeting (PSM): The meeting held between Sponsor and EMA to review the scope, strategy and timescales for the Policy 0070 submission.

See our Policy 0070 FAQ page for more background info and key terms.

What is the EMA Policy 0070 Pre-Submission meeting?

The EMA Policy 0070 pre-submission meeting is a critical opportunity for sponsors to discuss their plans for complying with the policy with the European Medicines Agency (EMA) before formally submitting their Proposal Package (PP).

The pre-submission meeting will confirm in-scope and out-of-scope content, the anonymization strategy, use of a vendor, CCI (if applicable) and the submission schedule. This meeting can help to avoid delays and ensure a smooth submission process. It is especially valuable for sponsors submitting a Policy 0070 package for the first time.

Where in the MAA process does the Pre-Submission meeting occur?

The pre-submission meeting occurs prior to submitting the Proposal Package (PP). The value of the meeting comes from aligning early with the Health Authority on the personal data anonymization strategy and approach to protection of confidential information.

Ideally, your pre-submission meeting should occur no later than when you receive the invitation letter from EMA. You can (and should) begin your Policy 0070 project work prior to the invitation letter. Please refer to the timeline illustration below for more details.

Reviewing and confirming your anonymization strategy with EMA early in the process allows you to incorporate their feedback into your Proposal Package, thereby streamlining the validation process and potentially preventing future delays or rework.

Figure 1F

The Pre-Submission Meeting

Here's what you need to know:

Purpose of the Pre-Submission meeting

Facilitate open discussion with the EMA on your intended clinical data publication (CDP) submission.
Identify potential issues or concerns early on to avoid delays or rejection.
Get feedback on your proposal document package, including your anonymization approach and strategy.
Discuss scope details and timelines.
Establish an early dialogue and relationship with the EMA.

Who should attend?

Representatives from your company familiar with the submission and anonymization process, including regulatory affairs, scientific experts, and data anonymization specialists.
Real Life Sciences, who will support you in the preparation of content for the meeting, including the recommended anonymization strategy for your project.
Representatives from the EMA's Clinical Data Publication Team (CDP Team).

What to prepare

A list of specific questions you have for the EMA.
A few slides summarizing your progress on the CDP process.
Your Anonymization strategy.
Your listing of Commercially Confidential Information.

Key Topics for Discussion

Submission scope and schedule.
Your recommended anonymization technique/strategy.
Scope and justification for Commercially Confidential Information.
Timeline and milestones for publication.

Benefits of a Pre-Submission Meeting

Increased clarity and confidence in your CDP submission.
Reduced risk of delays or rejection due to non-compliance or technical issues.
Improved communication and collaboration with the EMA.
Streamlined publication process leading to faster access to clinical data.

Additional Preparation Items Before Embarking on Your Policy 0070 Project

Assemble your team including Clinical Transparency, BioStats, Regulatory
Engage with Real Life Sciences for expert advice, preparation for and facilitation of the pre-submission Meeting and delivery of the complete proposal and final packages
Gather the in-scope Policy 0070 CTD document set - specifically ICH CTD/eCTD M2.5, M2.7 and M5
Gather the applicable trial datasets which can be used in support of an empirical analysis of subject data and later applied to the in-scope documents by RLS
Review the EMA guidance for Policy 0070
Plan an internal kickoff for your project with help from RLS

Policy 0070 Moving Forward

During the Policy 0070 relaunch in 2023, EMA and Health Canada demonstrated increased partnership, which is expected to continue moving forward. This is evidenced, for example, by the jointly revised Anonymization Report Template to be used by Sponsors when submitting their Policy 0070 (and Health Canada PRCI) Proposal Packages. This collaboration between the two Health Authorities benefits Sponsors given the similarities in the two policies.

The revised Anonymization Report Template requires trial sponsors to indicate which type of anonymization methodology they have adopted for the submission, among other important points pertinent to the project. If trial sponsors do not indicate a quantitative risk-based methodology they must include justification as to why. A quantitative anonymization methodology, otherwise referred to as “risk-based anonymization” applies empirical measures to assess the risk of reidentification of the resulting anonymized data - based on analysis of the subject data itself.

In contrast, a qualitative or “rules-based anonymization” approach applies a static set of anonymization rules to the data. The results are not quantifiable, so the risk of re-identification is not measured. As the Anonymization Report Template has evolved, including the most recent revisions published in Fall 2023, it has become increasingly clear that the Health Authorities have a strong preference for a quantitative, risk-based anonymization methodology. While RLS supports both quantitative and qualitative approaches, our experience suggests most Sponsors prefer quantitative methods because they are measurable (the risk of re-identification is known) and they support maximizing the clinical utility of the resulting data.

Given that EMA and Health Canada appear to prefer a risk-based anonymization method when evaluating Policy 0070 submissions, and because of the inherent benefits of a risk-based anonymization approach, RLS recommends sponsors adopt this methodology. That said, there may be specific situations where a rules-based or qualitative methodology is best, which underscores the importance of the pre-submission meeting. As mentioned above, the trial sponsor should discuss the preferred approach with EMA during the pre-submission meeting. Having the methodology agreed with EMA no later than the pre-submission meeting will help the trial sponsor avoid delays when delivering the Proposal and Final packages to EMA.

Contact Real Life Sciences

If you have questions or would like some consultation prior to starting your EMA Policy 0070, CTR or Health Canada PRCI project, please contact us directly at inquiry@rlsciences.com. RLS has completed dozens of CDP/Policy 0070, Health Canada PRCI and other clinical transparency projects with pharmaceutical manufacturers of all types and sizes.

For more information on Policy 0070 please visit our Policy 0070 overview page.
