The primary difference between a conventional and a continuously available infrastructure is their approach to downtime. The
prevailing mindset in conventional corporate infrastructures focuses on recovery from errors and failures, an approach known
as return to operations (RTO). Recovery-oriented solutions assume downtime, even if it's only a few minutes during failover from
one server to another. IT configurations that depend on recovery are typically not appropriate for pharmaceutical EBR and PAT systems.
Continuous availability, however, focuses on symptom detection and error prevention, which aligns with recent risk-based
guidance issued by US and European regulators and supported by leading industry associations such as the International Society for Pharmaceutical
Engineering (10, 11, 12). Continuously available infrastructures are built with redundancy and error detection that prevent failure and support a
Six Sigma, quality management system (QMS) approach to the IT environment, one that aligns with mandates from regulatory agencies
in a cost-effective and efficient manner. True high availability requires looking at the entire infrastructure from both design
and operational perspectives.
Selecting the proper technology is a good place to start building an ultra-high reliability infrastructure. A comparison between
fault-tolerant computing and clusters illustrates the difference between recovery-oriented and prevention-oriented solutions,
and why the latter is better suited to pharmaceutical manufacturing.
FAULT-TOLERANT COMPUTING FOR ELECTRONIC RECORD SYSTEMS
Server clusters are what most often comes to mind when a company considers a high availability solution for supporting EBR
or PAT systems. In clusters, pairs of servers linked by clustering software operate as a primary and a backup. If the primary
server fails, the software shifts the processing load to the backup server.
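
The failover behavior described above can be illustrated with a minimal Python sketch. It is a conceptual simulation only, not a model of any particular clustering product: the `Server` class, the `run_cluster` function, and the tick-based timing are hypothetical names and assumptions introduced for illustration. The point it shows is that work is lost for as long as the clustering software takes to detect the failure and switch to the backup.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    healthy: bool = True


def run_cluster(total_ticks, fail_at, detect_delay):
    """Simulate a primary/backup pair (hypothetical model).

    Returns the list of ticks during which no healthy server was active,
    i.e. the downtime window caused by failure detection and failover.
    """
    primary, backup = Server("primary"), Server("backup")
    active = primary
    failed_since = None
    missed = []
    for tick in range(total_ticks):
        if tick == fail_at:
            primary.healthy = False  # assumed failure of the primary
        if not active.healthy:
            if failed_since is None:
                failed_since = tick
            # The software needs detect_delay ticks to notice and fail over.
            if tick - failed_since >= detect_delay:
                active = backup
        if not active.healthy:
            missed.append(tick)  # nothing processed the load on this tick
    return missed


if __name__ == "__main__":
    print(run_cluster(total_ticks=10, fail_at=3, detect_delay=2))
```

Under these assumed parameters the primary fails at tick 3 and the backup takes over at tick 5, so the load goes unprocessed on ticks 3 and 4. Shrinking `detect_delay` narrows the gap, but a recovery-oriented design removes it only if detection and switchover are instantaneous.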
The various flavors of clustering technology share common weaknesses: complexity, cost, and unproven reliability.
To meet pharma manufacturing's QMS requirements, each "computer system" must be properly revision controlled and validated.
Clustering software requires duplicate control records and custom scripting. This creates a need for ongoing operational
qualification (OQ) to test the cluster's functionality, in addition to the simple installation qualification and maintenance
required for typical hardware components such as a storage system or network device (13). Additionally, clusters require two servers,
increasing management demand and cost. Clusters also run on enterprise-class
versions of operating systems, not the relatively inexpensive versions. The cost of the additional compliance activities described
earlier adds up quickly. Finally, a software cluster is a configured system, not one that a company can plug in and run. Configuration
can be expensive and laborious.
As clusters' weaknesses have surfaced, pharma companies have begun exploring fault-tolerant servers, which are essentially two
servers operating in lockstep inside a single chassis. Until recently, fault-tolerant computers were available only in
the realm of "big iron": highly funded corporate data centers that have the budgets and staffs to run large legacy systems.
In the last few years, however, the market has seen the emergence of cost-effective, fault-tolerant computers that run the
commercially available operating systems used in pharma manufacturing and other lower-cost environments. Such infrastructure
components can be purchased off the shelf with built-in, factory-tested high availability features. Like their network and
storage counterparts, they require only installation qualification (IQ), significantly simplifying qualification and maintenance. Furthermore, only one
version of the operating system is required. Some lower-end models even run the less expensive server operating system configurations
(as opposed to the enterprise versions). This form of fault tolerance is well suited to the regulatory, cost, and management
demands of pharma manufacturing.
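
The lockstep idea can be sketched conceptually in Python. Real fault-tolerant servers implement lockstep in hardware, so the following is only an illustrative analogy under stated assumptions: the `Replica` and `LockstepPair` classes and the running-total workload are hypothetical and do not represent any vendor's design. The sketch shows two replicas applying every operation together, so that when one fails the other already holds identical state and there is no failover step.

```python
class Replica:
    """One half of a hypothetical lockstep pair, holding replicated state."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.total = 0  # example state: a running total

    def apply(self, value):
        self.total += value
        return self.total


class LockstepPair:
    """Applies each operation to every healthy replica at the same step."""

    def __init__(self):
        self.replicas = [Replica("a"), Replica("b")]

    def apply(self, value):
        results = [r.apply(value) for r in self.replicas if r.healthy]
        if not results:
            raise RuntimeError("both replicas failed")
        if len(set(results)) > 1:
            # Healthy replicas should never disagree; treat it as an error.
            raise RuntimeError("replicas diverged")
        return results[0]


if __name__ == "__main__":
    pair = LockstepPair()
    print(pair.apply(5))
    pair.replicas[0].healthy = False  # assumed failure of one replica
    print(pair.apply(7))  # the surviving replica answers with no gap
```

In this toy model the second call still succeeds after one replica is marked failed, because the surviving replica has processed every earlier operation. That is the contrast with a primary/backup cluster, where the backup must first detect the failure and take over the load.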
Consider the example of a major US-based pharmaceutical company that decided to overhaul its IT infrastructure by implementing
an EBR system to replace its paper-based one. This company learned first-hand that conventional reliability solutions didn't
meet the pharma manufacturing standard.
The company decided that an EBR system would streamline regulatory compliance, eliminate the overhead expense of keeping paper
records, and improve production efficiency. For the system to deliver these benefits, however, it needed to provide an uninterrupted
stream of data. During proof-of-concept tests with conventional networking technology, the company's IT staff soon discovered
the EBR system's reliability limitations. Routine issues like application and operating system crashes, reboots, and scheduled
downtime all but erased the EBR system's benefits. The tests also uncovered reliability problems in unexpected places. Server
and driver hardware weren't solid enough to resist errors and downtime. Third-party device drivers, which provided interfaces
to server peripherals and communications lines, caused the operating system to crash.