Submitting Advanced Bioanalytical Test Methods for Regulatory Approval

Published in: BioPharm International, 15 September 2005, Volume 2005 Supplement, Issue 3

FDA and regulatory agencies worldwide have recently approved many advanced bioanalytical technologies. Receiving approval of advanced test methods for new biopharmaceutical products is relatively straightforward, provided clinical and process validation data are generated by the same (or at least similar) test method. However, regulatory approval becomes more difficult and time consuming when compendial test methods or test methods for already licensed biopharmaceuticals are changed.

The two most important regulatory submission aspects are the actual analytical method validation (AMV) results and protocol acceptance criteria, and evidence of comparability between the new method and the current one. This overview mainly addresses the demonstration of method comparability. The critical elements of method development, validation, and transfer activities, including upcoming regulatory expectations for standard-to-sample curve parallelism in bioassay validation, have been discussed in detail elsewhere.1-4

Current GMP expectations are that new test methods must bring an overall improvement to measuring process and product quality. From a regulatory perspective, "improved" means that new test methods should perform equal to or better than classical methods (compendial or otherwise officially recognized) or approved (licensed) methods with regard to their critical method performance characteristics. Although they are incentives for the firm, many method improvements, such as reagent cost savings and automation, do not necessarily demonstrate a sufficient improvement to the regulatory authorities. Equal or improved performance of the new (candidate) test method versus the current one should be demonstrated by method comparability studies, either concurrent with or after completion of the formal AMV studies. The method comparability studies can be included as part of the formal AMV protocol or executed under a separate protocol after the AMV is completed. From a regulatory perspective, both approaches are acceptable as long as all data are generated under a formal protocol with pre-specified acceptance criteria. Performing a separate method comparability study after AMV completion has several advantages. If the AMV results reveal that the method is not yet optimized, time and effort are saved by holding off the comparability studies until the new method is ready. In addition, potential differences between results generated by the two methods can be estimated and used to support the comparability studies before they are formally initiated. However, pre-specified method comparability acceptance criteria should not be derived from this knowledge alone. The candidate method's acceptability should be judged against the current method with respect to historical process control data in relation to the relevant release specifications. Other sources of method comparability acceptance criteria, such as an acceptable difference that is or was shown to be clinically insignificant, may also be justifiable.

Improvement Criteria

Although many new analytical methods provide faster, more accurate, and more precise results, and thus support one of the principles of Process Analytical Technology (PAT) (i.e., immediate test results), current guidelines still lack detailed guidance on how to provide evidence that these new methods are equal to or better than classical or already licensed methods. To provide this evidence, we must define what "equivalent or better" means and which of the test method performance characteristics should be compared. Once these are clearly defined, test method equivalence or superiority can be demonstrated. Per the International Conference on Harmonization (ICH) Q2A/B guidelines,5-6 the five test method categories can be grouped into two broader categories — qualitative and quantitative test methods. A qualitative test method provides qualitative results (pass/fail, yes/no, or results reported as less than some action or specification level), whereas a quantitative test method provides results reported as real numbers (Table 1). By definition, qualitative test methods need not be accurate or precise, but they must be specific for the analyte tested and often require determination of the detection limit (DL). It is critical for qualitative methods to provide high percentages of positive results for positive samples, and high percentages of negative results for negative samples. For qualitative limit tests, a low DL is clearly desirable because it increases the likelihood of observing positive results even at low analyte concentrations. To be valid and suitable for use, quantitative test methods require the generation of data for accuracy, precision, and several other characteristics, depending on the type of release specifications and the relevant validation acceptance criteria to be met (see Table 1).4

Table 1. Validation Characteristics Per ICH Q2A/B and Relevant Product Specifications

Qualitative Versus Quantitative Method Comparability Studies

Table 2 identifies the validation characteristics a method comparability study should include and suggests which statistical tests may be appropriate for each characteristic. All qualitative tests should include a comparison of hit-to-miss ratios (usually at low analyte concentrations) between the two methods; this partially characterizes the specificity of both methods. The two hit-to-miss ratios can be compared using chi-squared statistics. If the DL is a required characteristic, the two limits should also be compared. There is no easy way to statistically compare DLs unless both were established by the recommended ICH Q2B approach using linear regression statistics. If the candidate method provides a lower DL, this can simply be stated in the submission. If the DL is higher, an explanation and justification with respect to release specifications and process control data should be provided.
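As a minimal sketch, the chi-squared comparison of hit-to-miss ratios could be run as follows. The counts are illustrative placeholders, not data from an actual study, and `scipy` is assumed to be available:

```python
# Hedged sketch: comparing hit-to-miss ratios of two qualitative methods
# at a low analyte concentration using a chi-squared test on a 2x2 table.
# The counts below are illustrative, not from the article.
from scipy.stats import chi2_contingency

# rows: current method, candidate method; columns: hits, misses
table = [[45, 15],   # current method: 45 hits, 15 misses in 60 spiked samples
         [52, 8]]    # candidate method: 52 hits, 8 misses in 60 spiked samples

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi-squared = {chi2:.3f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Hit rates differ significantly at the 95% confidence level.")
else:
    print("No significant difference in hit rates was detected.")
```

Note that a non-significant result here only means no difference was detected at the chosen sample size; it is not by itself proof of comparable specificity.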

Table 2. Suggested Statistics to Assess Method Comparability for Each Required Validation Characteristic

For all quantitative methods, the performance characteristics accuracy and precision (intermediate precision) should be compared. Assuming both methods were properly validated individually, the regulatory concern is whether results can be expected to change overall, either by drifting (a change in "accuracy") or by a potential increase in day-to-day variance ("intermediate precision"). Depending on the pre-specified allowable difference, a drift in results may require a change in release specifications and would then require regulatory pre-approval before the new method can be used for release testing. Demonstrating a method's accuracy requires an evaluation of equivalence between results obtained by the two methods. A potential drift in release results can be toward lower or higher values; in most cases, neither direction is an acceptable outcome, so testing for equivalence between methods with respect to accuracy is required.


Current Regulatory Guidance Documents

Unfortunately, no clear guidance is currently available on how to demonstrate the equivalence of a candidate method. Several guidance documents contain general instructions that are helpful but do not provide sufficient detail on which characteristics to compare and how to compare them. Moreover, the scope and intention of some of these guidelines were not designed for comparing a candidate method's performance to that of the current method. Nevertheless, the following documents provide some guidance for method comparison and/or the use of statistical tests:

ISPE, Good Practice Guide on Technology Transfer, 2003.

ICH E9, Statistical Principles for Clinical Trials, London, March 1998.

CPMP, Points to Consider on Switching Between Superiority and Non-Inferiority, London, 27 July 2000.

CPMP, Points to Consider on the Choice of Non-Inferiority Margin, London, 26 February 2004.

PDA's Technical Report No. 33., Evaluation, Validation and Implementation of New Microbiological Testing Methods.

ASTM International's D4855-97, Standard Practice for Comparing Test Methods.

All of the guidance documents listed can be used to establish a firm's standard procedure for method comparability studies. These, and possibly other general guidelines, constitute a good base of references; when followed as written, they also provide a solid foundation for the regulatory submission.

Demonstrating Non-Inferiority, Equivalence, and Superiority

Of the guidance documents listed, the ICH E9 and CPMP guidelines provide the most detailed instructions on how to conduct comparison studies. Comparisons are distinguished into three categories: tests for non-inferiority, equivalence, and superiority. The three categories are described in detail and graphically illustrated in CPMP's Points to Consider on Switching Between Superiority and Non-Inferiority, and are only briefly summarized here. When using the CPMP guideline and one of its comparison categories, two important points must be discussed and justified in the regulatory submission.

  • The chosen comparison category must be explained and justified. A method comparison protocol should specify the design of the experiments to be performed in the formal study, the statistical comparison test to be used, and a pre-specified value for the allowable difference in results (described below).

  • The pre-specified maximum allowable difference (or delta) must be clearly set in the protocol, but finding and justifying an appropriate delta is the most difficult part of any comparison study. Some guidance is provided in CPMP's Points to Consider on the Choice of Non-Inferiority Margin (London, 26 February 2004); however, a clear strategy and good examples are missing. Delta should strike a balance between being too large (easy to pass, but the observed potential difference may be too large) and being too small (difficult to pass, even though the observed statistical difference may be acceptable overall). Delta should be derived much like acceptance criteria for AMV protocols; that is, release specifications and historical process control data should be related to delta.
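To make the balancing act concrete, here is a minimal sketch of how delta might be tied to release specifications and historical process control data. All numbers are hypothetical, and the half-the-gap rule is purely illustrative, not a regulatory recommendation:

```python
# Illustrative sketch (hypothetical numbers): tying delta to the gap between
# historical process performance and the release specification.
historical_mean = 3.8   # % impurity, historical release mean
historical_sd = 0.25    # % impurity, historical standard deviation
spec_limit = 6.0        # % impurity, upper release specification

# Gap between typical process output (mean + 3 SD) and the spec limit
gap = spec_limit - (historical_mean + 3 * historical_sd)

# One possible compromise: allow a drift of at most half the remaining gap,
# so a passing method change still leaves process headroom to the spec.
delta = round(gap / 2, 2)
print(f"remaining gap = {gap:.2f}%, candidate delta = {delta:.2f}%")
```

Whatever rule is chosen, the protocol should document the derivation so the pre-specified delta can be defended in the submission.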

Method Comparison Study Examples

The comparison tests of non-inferiority, superiority, and equivalence are described and justified in the following examples for test method changes.

Figure 1. Results for the Non-Inferiority Test: Candidate Method vs EP/USP Sterility

Demonstrating Non-inferiority

A faster and technologically advanced method for sterility testing was validated and compared to the compendial EP/USP sterility test. A non-inferiority comparison at the 95% confidence level (p=0.05) was chosen, with a pre-specified delta of –10% versus the compendial (current) method. The non-inferiority test with a delta of –10% was justified as follows: non-inferiority, equivalence, and superiority are all acceptable outcomes, and the increased testing frequency of daily (n=7 per week) for the new sterility test versus twice weekly (n=2 per week) for the EP/USP sterility test significantly increases the likelihood of detecting organisms with the new method. The statistical results are given in Figure 1, and a graphical presentation of the 95% confidence interval is shown in Figure 2. The 95% confidence interval in Figure 2 includes 0 (no difference) and lies entirely to the right of the pre-specified delta of –10%. The comparison results indicate that the candidate method is not inferior to the EP/USP sterility test method.
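A non-inferiority check of this kind can be sketched as follows, assuming a simple normal-approximation confidence interval for the difference of two detection proportions. The counts are illustrative placeholders, not the data behind Figure 1:

```python
# Hedged sketch of a non-inferiority test for two detection proportions.
# Counts are illustrative placeholders, not the data behind Figure 1.
import math

def noninferior(hits_new, n_new, hits_cur, n_cur, z=1.96):
    """Return (diff, lower, upper): the 95% CI for p_new - p_cur via the
    normal approximation for independent proportions."""
    p_new, p_cur = hits_new / n_new, hits_cur / n_cur
    diff = p_new - p_cur
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_cur * (1 - p_cur) / n_cur)
    return diff, diff - z * se, diff + z * se

# 60 positive-control samples per method (illustrative)
diff, lower, upper = noninferior(57, 60, 56, 60)
print(f"difference = {diff:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
# Non-inferiority holds if the entire CI lies to the right of delta = -10%
print("non-inferior" if lower > -0.10 else "non-inferiority not shown")
```

As in the article's example, a CI that includes 0 but lies entirely above the –10% margin supports non-inferiority without claiming superiority.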

Figure 2. Graphical Illustration of Non-Inferiority Results for Candidate Test vs EP/USP Sterility Test

Demonstrating Superiority

When the relative testing frequencies of our non-inferiority example, n=7 (new method) versus n=2 (EP/USP compendial method), are integrated into the comparison studies, the superiority of the new method can be demonstrated. A summary of the statistical results (at 95% confidence) using the data from Figure 1 is given in Figure 3. Superiority at the 95% confidence level could be demonstrated because the new method's 95% confidence interval (0.9997 – 1.0000) for the positive-to-fail probability (0.9999) lies entirely to the right of the 95% confidence interval (0.9205 – 0.9665) of the compendial method's positive-to-fail probability (0.9472). Therefore, the likelihood of observing potential microbial growth is significantly increased when the sampling frequency is taken into account.
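The effect of integrating testing frequency can be sketched as follows. The per-test detection probabilities and their confidence bounds are illustrative assumptions, not the Figure 1 data; the weekly interval is obtained by applying the monotone transform 1-(1-p)^n to the per-test bounds:

```python
# Hedged sketch: integrating testing frequency into a weekly detection
# probability. Per-test probabilities and CI bounds are illustrative.
def weekly_detection(p_single, tests_per_week):
    """Probability of at least one positive in a week of independent tests."""
    return 1 - (1 - p_single) ** tests_per_week

# Illustrative per-test detection bounds: (lower, point estimate, upper)
new_bounds = (0.72, 0.75, 0.78)   # candidate method, run 7x per week
cur_bounds = (0.72, 0.77, 0.82)   # compendial method, run 2x per week

# A monotone transform of the CI bounds yields a CI for the weekly probability
new_ci = [weekly_detection(p, 7) for p in new_bounds]
cur_ci = [weekly_detection(p, 2) for p in cur_bounds]
print(f"new method weekly CI:     {new_ci[0]:.4f} - {new_ci[2]:.4f}")
print(f"current method weekly CI: {cur_ci[0]:.4f} - {cur_ci[2]:.4f}")
# Superiority is claimed when the new method's interval lies entirely
# above the current method's interval.
print("superior" if new_ci[0] > cur_ci[2] else "superiority not shown")
```

Even with a similar (or slightly worse) per-test detection probability, running the test daily drives the weekly detection probability far above that of a twice-weekly schedule, which mirrors the article's superiority result.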

Figure 3. Results for the Superiority Test: New Method (7x per Week) vs. EP/USP Sterility (2x per Week)

Given the results when different testing frequencies were considered, we could easily have demonstrated the superiority of the new method rather than non-inferiority with a difficult-to-justify, pre-specified delta. In the end, the superiority test was passed with a much greater relative margin than the non-inferiority test. This is a good example of why an upfront comparison study should always be considered first when selecting and defending a strategy in a regulatory submission.

Demonstrating Equivalence

A characterized impurity of a licensed biopharmaceutical product is currently quantitated by SDS-PAGE at the final container stage. Because of anticipated supply problems for critical SDS-PAGE materials, it was decided to develop and validate a capillary zone electrophoresis (CZE) method to replace the current (licensed) method. The validation characteristics for a quantitative limit test, accuracy and intermediate precision, were compared before replacing the licensed method with the validated CZE method. The AMV results of both methods suggest similar performance for intermediate precision but also a drift in the percent impurity results (only the accuracy, or "matching a reference," comparison is shown here). From an analysis of historical release data with respect to the current release specifications (for SDS-PAGE), a delta of ±1.0% was chosen for the equivalence test between the two impurity levels. Both methods were run simultaneously (side-by-side) on each of a total of n=30 final container samples, and the results were compared by two-sided, paired t-test statistics with the pre-specified delta of ±1.0% (absolute percent, not relative percent). The paired t-test results are summarized in Figure 4.

Figure 4. Equivalence Test Results Comparing SDS-PAGE (Reference) to CZE

The 95% confidence interval of the CZE method (4.88 – 5.32) lies entirely above the SDS-PAGE mean (3.8%) plus the positive delta (3.8% + 1.0% = 4.8%). This means not only that the CZE results for this impurity are significantly higher than those of the licensed SDS-PAGE method, but also that the expected drift in results is significantly larger than the pre-specified limit, which was based on the gap between historical release results and the release specifications. The release specifications would need to be changed before the CZE method could be used for release testing. This would not only be expected by the regulatory authorities when this change in test methods is submitted; it would also be a business need, as we would otherwise significantly increase the expected number of specification-failing lots.
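An equivalence check of this kind can be sketched with a paired t-based confidence interval compared against the ±1.0% delta. The data below are simulated placeholders, not the actual n=30 release results behind Figure 4:

```python
# Hedged sketch: equivalence check on paired impurity results (% of total)
# against a pre-specified delta of +/-1.0%. Data are simulated placeholders.
import math
import random
import statistics

random.seed(0)
sds_page = [random.gauss(3.8, 0.3) for _ in range(30)]   # current method
cze = [x + random.gauss(1.3, 0.2) for x in sds_page]     # candidate, drifted

diffs = [c - s for c, s in zip(cze, sds_page)]
mean_d = statistics.mean(diffs)
se_d = statistics.stdev(diffs) / math.sqrt(len(diffs))
t_crit = 2.045  # two-sided 95% t critical value for df = 29
lower, upper = mean_d - t_crit * se_d, mean_d + t_crit * se_d
print(f"mean difference = {mean_d:.2f}%, 95% CI = ({lower:.2f}, {upper:.2f})")

delta = 1.0
if -delta < lower and upper < delta:
    print("equivalent within +/-1.0%")
else:
    print("equivalence not shown; results have drifted")
```

With the simulated drift built in, the confidence interval for the mean difference falls outside the ±1.0% margin, analogous to the article's CZE-versus-SDS-PAGE outcome.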

Submitting Validation and Comparability Results to the Regulatory Agencies

After successful completion of the AMV and comparability studies for a new method, the results can be submitted to the relevant agencies. If the particular biopharmaceutical is licensed abroad, analytical method transfer (AMT) results should also be submitted for each region that requires lot-release testing within that region. The submission categories and expected approval timelines are broadly similar in the US and Europe; regulatory agencies in other countries, such as Canada and Japan, may take significantly longer for approval. It is therefore crucial to plan properly and to submit changes strategically, keeping in mind the different timelines in different regions. The submission categories and estimated approval times for the US and Europe are illustrated in Table 3. If we were to submit our new CZE assay (see the equivalence example above), we would also submit a request for a release specification change. This would be submitted as a prior approval supplement (PAS) in the US and a Type II variation in Europe. Because of the costs, time, and effort involved, particularly with new assays, submissions of other assay changes could be bundled together or submitted with other significant changes to the production process or the product itself. However, excessive or abusive bundling of independent changes may lead to regulatory requests to submit the changes separately.

Table 3. General Guidance for Regulatory Submission Categories for US and Europe

Stephan O. Krause, Ph.D., is validation manager of QC assay support, Bayer Healthcare LLC, 800 Dwight Way, Berkeley, CA 94701-1986, 510.705.4191, fax: 510.705.5143.


1. Krause SO. Qualifying release laboratories in Europe and the United States. BioPharm International 2004; 17(3):28-36.

2. Krause SO. Development and validation of analytical methods for biopharmaceuticals, part I: development and optimization. BioPharm International 2004; 17(10):52-61.

3. Krause SO. Development and validation of analytical methods for biopharmaceuticals, part II: formal validation. BioPharm International 2004; 17(11):46-52.

4. Krause SO. Analytical method validation for biopharmaceuticals. BioPharm International, A Guide to Validation 2005; 24-32.

5. ICH. Q2A: Validation of analytical procedures. Federal Register 1995; 60.

6. ICH. Q2B: Validation of analytical procedures: methodology. Federal Register 1996; 62.