In an exclusive interview with Asia Business Outlook Magazine, Nagalakshmi Shetty, Vice President & India Country Head, ICON, shares how changes in data management have transformed the way clinical research is practiced and emphasizes the role of interdisciplinary collaboration in advancing research. She brings more than twenty years of experience in the field of clinical data management.
Within clinical research, what has traditionally been referred to as data management is now more commonly referred to as clinical data science. What prompted this change?
Moving from traditional clinical data management (CDM) to clinical data science (CDS) has been a fascinating evolution for our industry. Traditionally, data management has been about collecting, reviewing and cleaning data so that it can be sent to the statistics team for analysis. However, advances in technology have enabled us to go beyond traditional data review activities to more advanced analytics and risk reviews. CDS teams can quickly visualize and interact with enormous volumes of operational, clinical, and safety data within a single interface. This enables them to identify signals using statistical methodologies and support forward-looking decisions rather than retrospectively tracking results. Early risk detection allows us to manage safety anomalies proactively and prevent their recurrence, thus safeguarding data quality.
This is critical to both patient safety and the reliability of trial data. Moving beyond traditional data management is imperative, especially with the increase in trial design complexity and other challenges associated with the five Vs of big data (volume, velocity, variety, veracity, and value of data). Using methodologies like robotic process automation (RPA), natural language processing (NLP) and machine learning (ML) has further automated repetitive and complex tasks. These processes allow for faster data review, ongoing quality assessments, and process improvements focused on the patient, ensuring optimal trial outcomes.
Integrating diverse clinical data sources (EHRs, trials, patient-generated) is complex. What are the main technical challenges in merging these for clinical research, and how can they be effectively addressed?
A key issue with the various and disparate data sources available is interoperability and having third-party systems capable of integrating with different infrastructures. Typically, a company will have data architecture that allows for automated ingestion of data. However, the company will depend on third-party systems providing data with appropriate APIs and connectors to allow this data to be effectively utilized. In addition to having appropriate connectors, it is critical that external systems give access to both the clinical data and the associated metadata. This allows for audit trail review, custom key risk indicators (KRI) creation and the development of on-the-fly analytics.
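To make the idea of pairing clinical data with its metadata concrete, here is a minimal sketch of what joining ingested records with their audit-trail events might look like. The field names (`record_id`, `modified_at`, `modified_by`) are illustrative assumptions, not any particular vendor's schema:

```python
# Hypothetical sketch: attaching audit-trail metadata to ingested clinical
# records so downstream reviews (audit trail review, KRI creation) can use
# both together. All field names are illustrative assumptions.

def join_with_audit_trail(records, audit_events):
    """Group audit events by record and attach them, oldest change first."""
    by_record = {}
    for event in audit_events:
        by_record.setdefault(event["record_id"], []).append(event)
    return [
        {**rec, "audit_trail": sorted(by_record.get(rec["record_id"], []),
                                      key=lambda e: e["modified_at"])}
        for rec in records
    ]

records = [{"record_id": "R1", "value": 120}]
audit = [
    {"record_id": "R1", "modified_at": "2024-02-01", "modified_by": "site_user"},
    {"record_id": "R1", "modified_at": "2024-01-15", "modified_by": "site_user"},
]
merged = join_with_audit_trail(records, audit)
print(merged[0]["audit_trail"][0]["modified_at"])  # earliest change listed first
```

The key point is that the metadata must arrive alongside the data: without it, the joined structure above cannot be built and audit-trail review is impossible.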
Once data is flowing in, the next challenge is curation. This is not just about collecting large amounts of data in various file formats and data structures but also about standardizing it. This enables the data to be leveraged by downstream systems in a normalized fashion, which is key for the delivery of reusable analytics and outputs. The challenge of curation is particularly relevant for wearables: in addition to handling large file sizes, teams have to determine how the raw data will be used. Wearable data is often voluminous, and the raw format is unwieldy, so planning is needed to ensure it is curated in a way that allows for true value generation. However, this also means maintaining a data “chain of custody”, showing the data’s journey from raw to curated to analysis.
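As a simple illustration of this kind of standardization, the sketch below normalizes two hypothetical raw wearable formats into one common schema so downstream analytics can consume them uniformly. The vendor field names are invented for the example:

```python
# Illustrative curation step: map heterogeneous raw wearable payloads onto
# one standard schema. Vendor formats and field names are assumptions.

def normalize(raw, vendor):
    if vendor == "vendor_a":   # hypothetical format: {"ts": ..., "hr": ...}
        return {"timestamp": raw["ts"], "heart_rate": raw["hr"], "source": vendor}
    if vendor == "vendor_b":   # hypothetical format: {"time": ..., "bpm": ...}
        return {"timestamp": raw["time"], "heart_rate": raw["bpm"], "source": vendor}
    raise ValueError(f"unknown vendor: {vendor}")

curated = [
    normalize({"ts": "2024-03-01T08:00:00Z", "hr": 72}, "vendor_a"),
    normalize({"time": "2024-03-01T08:00:05Z", "bpm": 75}, "vendor_b"),
]
print(sorted(row["heart_rate"] for row in curated))  # → [72, 75]
```

Keeping the raw payload alongside the normalized record (or logging each transformation) is one way to preserve the “chain of custody” from raw to curated data.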
Varying data quality impacts research reliability. How do these discrepancies affect data science outcomes, and how can inconsistencies be addressed in clinical research?
For the Clinical Data Science team, the primary objective is to ensure analysis-ready clinical data while upholding the highest standards of data integrity and regulatory compliance. Alongside this, the team works towards ensuring the trial is run in the most efficient manner at optimal speed and cost. ICH GCP E6 (R2) gives guidance on quality-by-design principles and risk-based quality management (RBQM).
Data quality should be built into every aspect, from the trial design to its execution. Case report form (CRF) design should capture all data required for analysis. There should be a clear plan on how the data is cleaned within the EDC via edit checks and outside the system using programmed smart listings. Additional review of primary and secondary end points can be added as required. Various functional groups like clinical, medical and biostatistics review these specifications. Data quality audits are performed, including quality reviews on sample queries and validation outputs. Analysis of quality outcomes is key to taking necessary corrective and preventive actions.
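A programmed edit check of the kind described above can be sketched very simply: flag values outside an expected range and raise a query for site follow-up. The field name, subject IDs and range below are illustrative, not from any real study:

```python
# Minimal sketch of a range-style edit check run inside or alongside an EDC:
# out-of-range or missing systolic blood pressure values generate queries.
# Field names and limits are illustrative assumptions.

def range_check(rows, field, low, high):
    """Return a query for each row whose value is missing or outside [low, high]."""
    queries = []
    for row in rows:
        value = row.get(field)
        if value is None or not (low <= value <= high):
            queries.append({
                "subject": row["subject"],
                "field": field,
                "value": value,
                "message": f"{field}={value} outside expected range {low}-{high}",
            })
    return queries

crf_rows = [
    {"subject": "001", "sbp": 118},
    {"subject": "002", "sbp": 260},   # implausible value -> query
    {"subject": "003", "sbp": None},  # missing value -> query
]
for q in range_check(crf_rows, "sbp", 70, 220):
    print(q["subject"], q["message"])
```

Real EDC systems express such checks in their own configuration languages; the logic, however, is the same: specify the rule up front, run it continuously, and route the resulting queries back to sites.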
At ICON, we have Clinical Data Risk Analysts to identify site risks and uncover data trends and patterns, thereby ensuring quality, patient safety and integrity of the clinical trial data. They perform various data analytical reviews, including KRI reviews, quality tolerance limit (QTL) reviews, and exploratory reviews using data visualization platforms. This allows early identification of unusual data patterns, inconsistencies within the data, recurring data issues or trends that may indicate safety concerns and/or data quality issues. All of this ensures that appropriate mitigations are in place to correct the issues at each affected site and prevent similar issues at other sites. Managing risk with data analysis proactively and on a real-time basis has proven to be very effective.
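One common statistical approach behind KRI reviews is to compare each site's metric against the study-wide distribution and flag outliers. The sketch below does this with a simple z-score on a hypothetical query-rate KRI; the metric, threshold and site values are illustrative assumptions, not ICON's actual methodology:

```python
# Hedged sketch of a KRI review: flag sites whose query rate deviates from
# the study mean by more than a z-score threshold. Metric, threshold and
# data are illustrative assumptions.
from statistics import mean, stdev

def flag_sites(kri_by_site, z_threshold=2.0):
    """Return the sites whose KRI value lies beyond z_threshold std devs."""
    values = list(kri_by_site.values())
    mu, sigma = mean(values), stdev(values)
    return sorted(site for site, v in kri_by_site.items()
                  if sigma > 0 and abs(v - mu) / sigma > z_threshold)

queries_per_100_datapoints = {
    "site_01": 1.8, "site_02": 1.9, "site_03": 2.0, "site_04": 2.1,
    "site_05": 1.9, "site_06": 2.0, "site_07": 2.1, "site_08": 1.8,
    "site_09": 2.0, "site_10": 6.5,   # unusually high query rate
}
print(flag_sites(queries_per_100_datapoints))  # only the outlier site is flagged
```

In practice such signals feed the review workflow described above: an analyst investigates the flagged site, a mitigation is agreed, and the same check keeps running to confirm the trend has resolved.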
Integrating advanced analytics with clinical research poses hurdles. How does this integration create challenges, and what best practices help overcome them effectively?
There is tremendous scope in the field of advanced analytics, especially using artificial intelligence (AI) and machine learning (ML). Site selection is one of the key aspects of a clinical trial’s success. ICON has vast data on sites, their capabilities and performance. Using AI, we process this data to deliver valuable insights on the most suitable investigators and sites with the highest potential for patient enrolment based on the trial requirements. This enables sound decision-making and minimizes risk to ensure the success of the trial. However, human involvement in the process is essential. For example, India as a clinical trial location is still not leveraged to its full potential. So, while India may not show up as a recommended site due to relatively minimal data on its potential to offer strong clinical trial sites, the team takes other factors into consideration before making their final recommendations regarding site choice. While AI helps us with advanced analytics, it is not meant to replace humans.
The five Vs are important considerations in enabling analytics. Devices and ePROs generate huge volumes of data, and there is a wide variety of data, including imaging and video, with differing structures, sharing frequencies and methodologies. Scalable, cloud-based technology, products and solutions help handle these dimensions and make the data consumable for analysis.
Effective interdisciplinary collaborations are crucial in clinical data science. How do these collaborations help overcome challenges, and what factors foster successful partnerships among data scientists, clinicians, and researchers?
Collaboration between interdisciplinary teams is crucial for the success of any trial. Effective collaboration leverages medical, clinical, regulatory and analytical expertise, including operational knowledge. It matters in every stage of clinical data science, from inputs for eCRF design and edit check development to meeting key project milestones. An effective communication plan established early on provides the foundation for cultivating effective collaboration. The use of systems with inbuilt workflows, trackers, reminders, etc., facilitates better outcomes.
Our Integrated Data Review Plan is a good example of effective collaboration. This centralized system enables efficient planning and coordination of data review activity across interdisciplinary teams. It captures data reviews performed across functions, removing overlap, gaps, and duplication in the process. This ensures quality and clear visibility to the client while promoting continual process improvement across studies.
Ultimately, it’s important that collaboration is baked into an organization’s processes. It is extremely satisfying when a successful database lock is achieved as the culmination of the shared vision of all the interdisciplinary teams.