Assessing Data Integrity Risks in an R&D Environment

BioPharm International, BioPharm International-10-01-2020, Volume 33, Issue 10
Pages: 40–44,49

A data-integrity risk assessment tool has been developed for use with standalone R&D data-acquisition and processing software.

From 2014 through 2016, inspections of pharmaceutical facilities revealed a pattern of repeated failures to follow the data integrity requirements established in the current good manufacturing practice (CGMP) predicate rules and regulations in Title 21 of the US Code of Federal Regulations (CFR), Part 211 (1), and in other international regulations. Global regulators took a number of enforcement actions against manufacturers. Since then, data integrity, particularly for electronic data, has become an area of increased regulatory focus in the pharmaceutical industry.

In March 2018, the UK’s Medicines and Healthcare products Regulatory Agency (MHRA) issued guidance and definitions on this topic (2), and FDA published guidance the following December in the form of questions and answers (3). These documents followed guidances issued by the World Health Organization (4) and the Pharmaceutical Inspection Convention/Pharmaceutical Inspection Co-operation Scheme (PIC/S) (5).

A working group of the International Consortium for Innovation and Quality in Pharmaceutical Development (IQ) has developed a risk-assessment tool to help professionals working in pharmaceutical research and development assess the data-integrity risks associated with standalone (i.e., non-enterprise) computerized software for analytical data acquisition and processing. The tool is meant to be used with systems that generate data kept in non-transient storage. It would not, for example, be used with instruments that hold data only transiently (e.g., non-computerized pH meters and the like). Its developers expect the tool to help harmonize the process for performing data-integrity risk assessments across the pharmaceutical R&D analytical community.

They also hope that the new risk-assessment tool will promote standardization and harmonization, allowing instrument and software vendors who sell into the pharmaceutical R&D analytical marketplace to better understand the sector’s expectations and to meet the needs of their customers. This article will describe the tool in greater depth.

Emphasis on a risk-based approach

Recent regulatory guidances, particularly those from FDA and MHRA, place considerable emphasis on the implementation of risk-based approaches to ensuring data integrity. For example, the FDA guidance explicitly notes that CGMP regulations and guidance allow for “flexible and risk-based strategies” to prevent and detect data integrity failures. In a similar spirit, the MHRA guidance describes a risk-based approach to data management that includes the assessment of “data risk, criticality, and lifecycle.”

The MHRA guidance spells out how to perform a risk assessment specific to a particular data acquisition and processing system, suggesting that all processes that produce data, or from which data are obtained, be mapped out so that each data format is identified. The objective is to identify the controls, the criticality of the data, and the inherent risks associated with each format.

Assessing data-integrity risk is seen as driving compliance and, if necessary, remediation. For example, while audit trail review is often considered an essential part of ensuring data integrity, the same guidance clarifies that a documented audit trail review should form part of routine data review where a risk assessment determines that it is needed.

Given the complexity of audit trails in any electronic system, practitioners generally prefer to implement technical controls that reduce or eliminate the need for audit trail review. Wherever it is technically feasible, the goal must be to prevent a data-integrity failure.

In cases where prevention is not technically feasible, data collection and review are required to demonstrate that data integrity has been maintained. In the rare cases in which an action can be neither prevented nor detected, a procedural control may be appropriate.
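The order of preference described above (prevent where feasible, otherwise detect through review, otherwise fall back on a procedural control) can be sketched as a small decision function. This is an illustrative sketch only, not part of the IQ tool; the names `ControlType` and `select_control` are hypothetical.

```python
from enum import Enum


class ControlType(Enum):
    """Control options, listed in the order of preference given in the text."""
    TECHNICAL_PREVENTION = "technical control that prevents the action"
    DETECTION_BY_REVIEW = "data collection and review that detects the action"
    PROCEDURAL = "procedural control (rare cases only)"


def select_control(can_prevent: bool, can_detect: bool) -> ControlType:
    """Pick the preferred control type for one data-integrity risk.

    can_prevent: the system can technically block the unwanted action.
    can_detect:  the action cannot be blocked, but it leaves a reviewable record.
    """
    if can_prevent:
        return ControlType.TECHNICAL_PREVENTION
    if can_detect:
        return ControlType.DETECTION_BY_REVIEW
    return ControlType.PROCEDURAL


# Hypothetical case: reprocessing cannot be blocked, but it is recorded.
example = select_control(can_prevent=False, can_detect=True)
```

In this hypothetical case, `example` is `ControlType.DETECTION_BY_REVIEW`, which is why review of the recorded data would be needed.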

Any step taken to mitigate a data integrity risk should be assessed to determine its appropriateness in the context of the criticality of the gap between desired and actual practice. The MHRA defines critical risks as those that would potentially allow data or metadata to be “deleted, amended or excluded without authorization.” FDA’s guidance echoes this emphasis, highlighting the importance of data integrity throughout the CGMP data life cycle, from creation, through modification, processing, maintenance, archival, and on through retrieval, transmission, and disposition after the period required for data retention ends.

Discussion of the tool

The risk assessment tool is divided into a number of sections, each of which is briefly discussed below.

System control and access
This section contains a set of questions that deal with access controls for users, administrators, and others. It describes expectations related to the implementation of different access levels and varying permissions. In situations where it is necessary to assign system administrator access to lab personnel, appropriate justification should be provided (e.g., ensuring that technical and procedural controls are in place to prevent the modification or deletion of files and settings). Time and date stamps and time zones are also discussed, as are steps that can be taken for the prevention of unauthorized access.
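As a rough illustration of the kind of check this section prompts, the sketch below models access levels as permission sets and flags lab personnel who hold administrator access without a recorded justification. The role names, permission names, and the `UserAccount` structure are all assumptions made for the example; they are not taken from the IQ tool or from any particular software product.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical permission sets per access level; real systems define their own.
ROLE_PERMISSIONS = {
    "analyst": {"acquire_data", "process_data"},
    "reviewer": {"view_data", "review_audit_trail"},
    "administrator": {"manage_users", "change_settings", "delete_files"},
}


@dataclass
class UserAccount:
    user_id: str
    role: str
    is_lab_personnel: bool
    admin_justification: str = ""  # documented rationale, if any


def is_allowed(account: UserAccount, action: str) -> bool:
    """Return True if the account's access level grants the action."""
    return action in ROLE_PERMISSIONS.get(account.role, set())


def accounts_needing_justification(accounts: List[UserAccount]) -> List[str]:
    """List lab personnel with administrator access and no documented justification."""
    return [
        a.user_id
        for a in accounts
        if a.role == "administrator"
        and a.is_lab_personnel
        and not a.admin_justification.strip()
    ]
```

A call such as `accounts_needing_justification(all_accounts)` would return the user IDs that an assessor might follow up on.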

Data protection, controls, and regulatory compliance
The data protection, controls, and compliance section focuses on restricting the ability to delete or modify data. It also defines and discusses true copies of data and encourages users to define the complete data set (raw file, process file, etc.), including its storage location, the system configuration, and the data flow.
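One common technical check that can support a claim that a copy is faithful is a comparison of cryptographic hashes of the original and copied files. The sketch below uses SHA-256 from the Python standard library. It is an assumption of this example that a hash comparison is relevant to the system at hand, and the function names are hypothetical. A matching hash shows only that file contents are byte-identical; it does not by itself show that associated metadata or context have been preserved.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def contents_match(original: Path, copy: Path) -> bool:
    """Return True if the two files have byte-identical contents."""
    return sha256_of_file(original) == sha256_of_file(copy)
```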

Audit trails, metadata, and data review
The questions in this section focus on the completeness and appropriateness of audit trails and on the required review of data and metadata. FDA defines an audit trail as a secure, computer-generated, time-stamped electronic record that allows the course of events relating to the creation, modification, or deletion of an electronic record to be reconstructed.

FDA guidance uses a high-performance liquid chromatography (HPLC) run as an example, stating that the audit trail should include user name, date/time of the run, the integration parameters used, and any reprocessing details. It also states that the system audit trail, including system administrator actions, should be periodically reviewed.
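The HPLC example lends itself to a simple completeness check. The sketch below assumes a hypothetical record structure, `HplcAuditEntry`, holding the four elements named above, and reports which of them are absent. Field names and the checking logic are illustrative assumptions; actual chromatography data systems store audit trails in their own formats.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class HplcAuditEntry:
    """Hypothetical audit trail entry for one HPLC run."""
    user_name: str
    run_timestamp: Optional[datetime]
    integration_parameters: Dict[str, float] = field(default_factory=dict)
    was_reprocessed: bool = False
    reprocessing_details: str = ""


def missing_elements(entry: HplcAuditEntry) -> List[str]:
    """Return the expected audit trail elements that the entry lacks."""
    missing = []
    if not entry.user_name.strip():
        missing.append("user name")
    if entry.run_timestamp is None:
        missing.append("date/time of the run")
    if not entry.integration_parameters:
        missing.append("integration parameters")
    # Reprocessing details are expected only when reprocessing took place.
    if entry.was_reprocessed and not entry.reprocessing_details.strip():
        missing.append("reprocessing details")
    return missing
```

An empty list from `missing_elements` means the entry carries all four elements; it says nothing about whether the recorded values are themselves appropriate, which remains a matter for review.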

Archival, retrieval, back-up, disaster recovery, and contingency plans
The questions in this section focus on ensuring that electronic data are enduring, complete, and secure from modification or loss. The electronic data (including associated metadata) should be in the original format or in a format that is compatible with the original format.

Electronic signatures
The questions in this section focus on the use (where necessary) of electronic signatures, which should ensure attributability to the specific person who signed the document electronically.

Guidance on using the template


The risk-assessment tool consists of a summary, followed by a tabular presentation as depicted in Figure 1. The entire tool is available on the IQ Consortium web site (6). The expected responses to be filled into these sections are described below. The summary section should contain an overall evaluation of the acceptability (from a data integrity standpoint) of the system under study. This evaluation should take into account any mitigations and interim actions identified in the risk assessment.

In addition, users may find it convenient to describe (in general terms) the configuration of the system, including the data storage location and any data safeguards associated with it (e.g., the data are written to a drive on which the user has no rights to modify or delete files). Other mitigations that apply to multiple lines in the assessment may also be conveniently discussed in the summary section, with the affected lines referring back to it.

Description of the columns

The first column contains a list of data-integrity requirements, each phrased as a question for which “yes” is the desired response. The response options are yes, no, and not applicable (N/A). These questions are designed to prompt a thorough investigation of the system at hand. It is important to note that the assessment should be performed against the system (including the hardware, software, and any governing procedures) as it currently exists. If the system does not meet the expectation described in the question, a response of “no” should be selected.

In rare instances, a response of N/A may be appropriate. For example, one question asks whether libraries are included in the data backup. For software that does not utilize a library search, this question would properly be answered N/A.

The Guidance column provides additional context for, and clarification of, the questions. In some cases, it also provides suggestions on potential mitigations.

Comments and mitigation of risk
The final column can be used to describe the manner in which the system meets the requirement (in the case of a “yes” answer). When the “no” response is selected, it should be used to describe the manner in which the risk therein identified is mitigated.

A “no” response may also be accompanied by a rationale for why the risk is considered acceptable. Finally, in the case of an N/A response, the context that renders the question not applicable should be explained.
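The column rules described above can be captured in a small data model. The sketch below is a hedged illustration, not the IQ template itself: the class and function names are hypothetical, and the only rules encoded are those stated in the text, namely that “no” and N/A responses call for an explanation in the comments column, while comments on a “yes” response are optional.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Response(Enum):
    YES = "yes"
    NO = "no"
    NA = "N/A"


@dataclass
class AssessmentRow:
    """One line of the tabular assessment (hypothetical structure)."""
    question: str        # phrased so that "yes" is the desired response
    response: Response
    guidance: str = ""   # context and suggested mitigations
    comments: str = ""   # how the requirement is met, mitigated, or why N/A


def rows_missing_comments(rows: List[AssessmentRow]) -> List[str]:
    """Questions answered 'no' or N/A that lack the expected explanation."""
    return [
        row.question
        for row in rows
        if row.response in (Response.NO, Response.NA) and not row.comments.strip()
    ]


def open_gaps(rows: List[AssessmentRow]) -> List[AssessmentRow]:
    """Rows answered 'no', i.e., gaps to be addressed in the summary evaluation."""
    return [row for row in rows if row.response is Response.NO]
```

For instance, a row for the question “Are libraries included in the data backup?” answered N/A with an empty comments field would be returned by `rows_missing_comments`, prompting the assessor to record that the software does not use a library search.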

Next steps, other potential uses

The IQ working group is publishing this risk assessment tool to facilitate data-integrity risk assessments and promote standardization and harmonization of practice. However, its developers believe that it may be useful in other applications as well. 

For instance, pharmaceutical firms may find it helpful to use the tool (or a subset of it) when evaluating instrument/software systems to aid in purchasing decisions. Similarly, instrument vendors may find it useful in understanding the needs and expectations of the pharmaceutical industry.

The tool may also be useful in managing third-party laboratories and vendors who service equipment, and in drafting technical quality agreements with them. Finally, it is expected that implementation of the tool will aid pharmaceutical firms in internal harmonization and benchmarking across sites within a company.


The authors wish to acknowledge the contributions of the IQ Working Group for Data Integrity in R&D Analytical Laboratories.

References


1. US Code of Federal Regulations, Title 21, Food and Drugs (Government Printing Office, Washington, DC), Part 211.

2. MHRA, GXP Data Integrity Guidance and Definitions (London, UK, March 2018).

3. FDA, Data Integrity and Compliance with Drug CGMP–Questions and Answers (Rockville, MD, December 2018).

4. World Health Organization, Guideline on Data Integrity (Geneva, Switzerland, October 2019).

5. Pharmaceutical Inspection Convention/Pharmaceutical Inspection Co-operation Scheme, Good Practices for Data Management and Integrity in Regulated GMP/GDP Environments (Geneva, Switzerland, November 2018).

6. IQ Consortium, Data-Integrity Tool,

About the authors

Julie Lippke* and Joseph Mongillo both work in analytical research and development at Pfizer Inc. (Groton, CT); Thomas Cullen works in analytical research and development and Chase Waller works in quality, both at AbbVie Inc. (North Chicago, IL); Katria Harasewych and Zahid Muhammad both work in quality at Merck & Co., Inc. (West Point, PA); and John Bennett works in quality at Boehringer Ingelheim Pharmaceuticals Inc. (Ridgefield, CT).

*To whom all correspondence should be addressed

Article details

BioPharm International
Vol. 33, No. 10
October 2020
Pages: 40–44,49


When referring to this article, please cite it as J. Lippke, et al., “Assessing Data Integrity Risks in an R&D Environment,” BioPharm International 33 (10) 2020.