In this first of a two-part article, the authors describe the software development of an FDA-compliant Class III medical software application that was successfully ported from a Windows CE environment to an Android 4.1 tablet platform. The authors also discuss the lessons learned about how to integrate off-the-shelf components without compromising design goals.
Part 1 describes the specifics of the case study, the software architectural challenges and the automated verification framework used to complete the project. Part 2 addresses regulatory standards, development process, and outsourcing strategy utilized to complete this project within the timeline and organizational constraints.
Smartphone and tablet usage is growing at an exponential rate. The hardware capability and available sensors on these mobile devices enable use cases that were previously unimaginable. Consumers are responding accordingly and adopting new technologies faster than ever. By 2017, the number of people in the world with access to a smartphone is expected to exceed 3 billion.
This trend forces other industries to adopt these mobile technologies at an equivalent pace just to meet expectations of the users. The medical device industry is especially affected by this.
The case study – a soft real-time safety-critical medical application
Implantable defibrillators are devices that are implanted into a patient. The device continually monitors the heart’s electrical rhythm for potentially life-threatening arrhythmias. When such a rhythm is detected, the device delivers shock therapy that may save the patient’s life (Figure 1). The implanted device can be accessed and monitored through a wireless connection with a mobile programming device, typically called the Programmer.
The legacy platform in this case study was a C++ based WinCE netbook equipped with Bluetooth and a 16-bit PCMCIA custom hardware telemetry radio card. The new target platform is a tablet (Figure 2) running Android 4.1 with Java-based software and a USB-based radio card. The wireless protocol used between the Programmer and the implanted device is a custom protocol that conforms to the Medical Implant Communication Service (MICS) specification. The goal of the port was to transfer the same functionality onto the Android platform (the following areas were unchanged: the wireless communication protocol, the implanted-device-side functionality, and the business logic state machine).
The software in this case study is deemed Class III, which implies the highest level of criticality because a defect could cause patient death. Hence, compliance to a variety of external standards is required, the details of which will be discussed in the second part of this article. The conditions and constraints under which this legacy Windows project operated included (exact numbers are not given due to confidentiality reasons):
- ~5000 requirements
- ~10000 test cases
- ~40 features
- ~200k LOC of legacy Windows CE C++ code
- ~5 in-house developers
- Complete localization for five languages.
- No in-house experience in Android and Java
To fully understand the challenges of porting the design from a legacy Windows CE design and at the same time transforming it into an Android-based platform that met the requirements for Class III software, it is important to be cognizant of a few technical differences between this Class III medical Android application and other applications on Android. Two important differences to note at this point are the following:
- This application cannot be killed, suspended, or deprioritized relative to other applications or services. The application needs to be continually active so there is no room for memory leaks or anomalies that may lead to a crash. We implemented a lightweight GUI and memory manager to achieve this. The application has complete autonomy and there is no way for the user to leave the application once the tablet is turned on. The user interface (UI) needs to follow FDA standards and the legacy system that users are familiar with. Hence the UI had to be customized instead of using the native Android interface elements.
- Relating to power management, since the wireless radio interface between the Programmer and the implantable device is custom, special hardware needs to be permanently attached to the 30-pin connector on the tablet. This hardware must be continually active and hence requires specific custom power management software.
Developing the application
Since this is a soft real-time system that works with a wireless embedded system and is used globally, there are multiple requirements that need to be considered. The list below captures the various dimensions of the requirements within this system. Note that each feature may contain a mix of one or more of these requirement types:
- Functional requirements (Figure 4 below) capture the screen flow and are normally specified in terms of flow diagrams. They capture all possible scenarios of user interaction and reaction to the telemetry commands and the data to/from the implantable device.
- GUI requirements (Figure 5 below) define the user observable screen layout, text and images.
- Telemetry requirements define the sequence of commands from the Programmer and the expected reactions from the implanted device based on those commands. The exception paths are also detailed in these requirements, e.g. the screen that is presented when expected telemetry commands are not received or timed out.
- Performance and safety requirements refer to the timing requirements, critical error handling and application stability. For example, the application must be able to run continually for over 4 days without reboot.
- Printer requirements define the content and layout of the printed reports.
- Multilingual requirements define the translations of all user observable text to the five supported languages.
- Session requirements define the attributes of a session (captures the activities the user performs) that are stored and maintained on the non-volatile memory (SD card).
Figure 3: A screen from the final Android application indicating implanted device information (real-time ECG at the bottom, programmable parameters in the middle, and implantable device status indicator and feature links in the top ribbons)
The following productivity enhancement and early defect identification tools were utilized:
Git – governs the change management of all verification and development code assets.
Code Pro Analytics – a static analysis plugin within Eclipse (development environment) that analyzes the Java code against various best coding practices.
Implantable Device Simulator – an in-house developed application that simulates all implantable device states and telemetry commands and responses. It allowed all developers to test their code as if it were interacting with a real device. It is a key tool to allow for simulation of state and error conditions that were difficult or impossible to duplicate with a real device.
Emma – a utility within Eclipse that allows the developer to measure code/branch coverage of the unit tests.
Robotium – an Android library that is the basis of the test automation framework. This library provides for APIs that enable UI interaction and the execution of automated test scripts.
Bugzilla – the bug tracking system used between the in-house (IH) and outsourced (OS) teams to track all defects and their lifecycles.
Test Management Suite – a suite of in-house scripts stitched together around a Microsoft Access database and MS Excel data. All the test cases, test procedures, and traceability between requirements and test cases are captured within the database. The scripts allow the generation of various reports of interest.
Automated Build System (ant, Jenkins) – automates the nightly code build.
Given the set of requirements, the next challenge was to establish a robust architecture. Since we were moving to the Android 4.1 OS, we had an opportunity to redesign and optimize the application architecture. Below were the key technical architectural drivers:
Temporal determinism – This implies that the architecture must ensure that functionality is met per specification under all possible conditions and that temporal requirements are met.
Reliability and availability – The system must be operational for long periods of continuous usage without restarts or crashes (long MTTF). Also, upon a failure, the Programmer must reboot quickly (short MTTR).
Fail safe – Due to the safety-critical nature of the application, failures must be dealt with in a fail-safe manner, i.e. upon failure, any loss of functionality will not result in user or patient harm.
Extensibility – The ability to easily extend the design as requirements are changed or added.
Portability – The ability to move the Programmer software to another hardware platform and/or operating system version with minimum porting effort.
Reusability – The ability to reuse the core framework design in future programmer platforms.
Testability – The ability to automate all the forms of testing.
Challenges in maintaining real-time and secure behavior on Android
This system is a soft real-time system. For example, the live ECG (electrocardiogram) drawing requires data to be streamed from the implanted device and then rendered onto the display in real time. Critical commands from the tablet to the implanted device must be processed within a deterministic deadline. In addition, the application must be secure from interruption, corruption, or unintended use by the user, e.g. the user must not be able to install unauthorized applications or switch to other applications during run time. In order to guarantee these requirements, a few key architectural changes were performed:
- The removal of non-relevant Android applications that could take up processing time, and the disabling of any hardware features not required by the application. This ensures singular control of the tablet by this application upon power on. It required special authority from the tablet OEM via the use of the enterprise software development kit.
- Minimizing the garbage collection execution time.
- The default Android GUI design pattern is to use multiple activities and multiple views. However, this approach incurs significant heap and context switching overhead as each activity is a separate thread. Rather, we went with a single-activity, multiple-view design pattern.
The subsections below give more details on these three changes:
GUI Framework. The standard Android GUI framework provides a rich environment for the development of mobile apps. For example, the framework provides standard event handling and lifecycle management for how the application interacts with external events such as phone calls and other notifications. However, this framework comes at the cost of performance overhead. Specifically, each view is a separate activity, which implies a separate thread and heap. Figure 7 illustrates the differences between the standard Android framework (left) and the framework we employed (right). Since the case study application did not require handling of most of these external events, we opted for a lighter weight approach. The key concept of our framework is that all views are pre-allocated during initialization and simply activated or deactivated depending on which screen is requested.
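The pre-allocated view idea described above can be sketched in plain Java. This is an illustrative sketch only; the names (`ScreenManager`, `Screen`) are hypothetical, and on a real Android device the activate/deactivate toggle would map to something like `View.setVisibility()` inside the single activity.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the single-activity, multiple-view pattern: every screen
// is created once at startup and only toggled afterwards, so screen
// transitions allocate nothing and trigger no activity lifecycle overhead.
public class ScreenManager {
    public interface Screen {
        void setActive(boolean active);
        boolean isActive();
    }

    public static class SimpleScreen implements Screen {
        private boolean active;
        public void setActive(boolean a) { active = a; }
        public boolean isActive() { return active; }
    }

    private final Map<String, Screen> screens = new HashMap<>();
    private Screen current;

    // All views are pre-allocated during initialization and registered here.
    public void register(String name, Screen screen) {
        screen.setActive(false);
        screens.put(name, screen);
    }

    // Switching screens deactivates the old view and activates the new one;
    // no objects are created or destroyed at transition time.
    public void show(String name) {
        Screen next = screens.get(name);
        if (next == null) throw new IllegalArgumentException("unknown screen: " + name);
        if (current != null) current.setActive(false);
        next.setActive(true);
        current = next;
    }

    public static void main(String[] args) {
        ScreenManager mgr = new ScreenManager();
        mgr.register("home", new SimpleScreen());
        mgr.register("ecg", new SimpleScreen());
        mgr.show("home");
        mgr.show("ecg");
    }
}
```

The design choice here is to trade a slightly larger startup cost (all views built up front) for fully deterministic screen transitions at run time.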
Operating system control strategy. As part of developing for a Class III medical environment, it was necessary to provide a high degree of control over the operating environment outside of the primary software application. This was accomplished by replacing the OEM-supplied Android OS shell with a custom shell that disabled or removed all unnecessary software and services at boot time. The custom OS shell also managed hardware access for peripherals such as Wi-Fi, Bluetooth, and cellular networks, ensuring that no external connection could be made other than via the primary software application. Finally, the shell provided a controlled path for software updates while preventing the user from installing a custom ROM or rooting the tablet.
Figure 7: Standard multiple-view, multiple-activity Android GUI framework vs. the case study’s lightweight single-activity, multiple-view GUI framework
Memory management. The Android garbage collector is a system service that can be invoked at any point in the execution. Although in a general-purpose application this may be beneficial to avoid overall system degradation due to one application, in this case study we needed to ensure timing guarantees were met in all conditions. To do this we needed to minimize any dynamic object creation and destruction. The solution was to employ an object pool design pattern where objects that are regularly generated (e.g. GUI event objects, ECG objects, and communication system message objects) are reused through a managed object pool. This minimized the fragmentation of memory and the pileup of unreferenced objects that would require additional effort from the garbage collector.
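A minimal sketch of the object pool pattern described above, in plain Java. The class and method names are illustrative, not from the case study's code: frequently created objects are acquired from and released back to a pre-allocated pool, so steady-state operation generates no garbage for the collector to chase.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Generic object pool: instances are pre-allocated up front and recycled,
// keeping allocation (and therefore garbage collection pressure) out of the
// time-critical steady state.
public class ObjectPool<T> {
    private final Deque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    public ObjectPool(Supplier<T> factory, int initialSize) {
        this.factory = factory;
        // Pre-allocate during initialization, mirroring the pre-allocated views.
        for (int i = 0; i < initialSize; i++) free.push(factory.get());
    }

    // Reuse a pooled instance when available; grow only as a fallback.
    public T acquire() {
        return free.isEmpty() ? factory.get() : free.pop();
    }

    // Return the object for reuse instead of letting it become garbage.
    public void release(T obj) {
        free.push(obj);
    }

    public int available() { return free.size(); }

    public static void main(String[] args) {
        // Example: pooled message buffers; caller resets state before reuse.
        ObjectPool<StringBuilder> pool = new ObjectPool<>(StringBuilder::new, 4);
        StringBuilder msg = pool.acquire();
        msg.setLength(0);
        msg.append("ECG sample");
        pool.release(msg);
    }
}
```

One caveat a production version would have to handle: pooled objects must be fully reset on acquire or release, otherwise stale state leaks between uses.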
OEM partnership. Another key to the success of the software architecture was the development of a strong partnership with the tablet vendor. The tablet vendor provided APIs and customization support to ensure a locked-down environment as well as support for integration of the USB-based radio card. The idea of ensuring a controlled execution environment is a major differentiator between Class III medical (or other regulated) industries and commercial software development, and this degree of control would not have been possible without a working partnership with the OEM.
Automated verification framework
The verification of Class III medical software is a challenging task. Test automation is critical to ensure quick verification of software changes and to make testing repeatable. However, in order to build automated verification tests, a robust automated verification framework is necessary. This section will describe our automated verification framework for Android and some of the strategies employed to verify specific requirement types. A single run of the entire automated requirements-based test suite took 16 machine-days, or two calendar days with eight concurrent machines and two managing engineers.
Below are the categories of verification performed on the system:
- Requirements-based tests verify that the specific requirement is met under success and failure conditions. There are one or more test cases for every unique requirement tag. (Mostly automated)
- Integration tests verify how different features work with each other; for example, scanning and connecting to the implantable device and then disconnecting. This tests both the scan-for-device and connect-to-device features. (Mostly automated)
- Stress tests verify the behavior of the system under worst case scenarios. For example, scanning and connecting to the device and then disconnecting back to back repeatedly for 10 hours. This tests the system stability over long periods without a reboot. (Totally automated)
- Unit tests verify the class code implementation prior to code review. Depending on the risk level of the feature, statement and branch coverage could be required to be 100%. (Totally automated)
- System tests verify whether the system (implantable device and Programmer) works within the intended environment. (Manual)
- Exploratory testing is typically done manually by a domain expert. The intent of the test is to attempt to break the system by going outside of the normal use cases. (Manual)
Design for automated verification
Since this project was a port onto a new platform while maintaining similar requirements, the test case descriptions and manual test procedure descriptions could be ported with minor modifications. The degree of test coverage for all test categories was, at a minimum, maintained at the same level as the legacy platform but improved where required. Figure 10 illustrates a legacy manual test procedure that verifies a set of requirements. This test procedure was then automated by developing a Java test script using the automated test framework.
The automation framework was developed in order to ensure the maximum degree of automation to verify all the requirement categories detailed in the earlier section. For requirements where the cost of automation was deemed too large (e.g. automating the visual inspection of the real-time ECG graph would be very complex), a semi-automated approach was used that included operator input during the segments where manual verification was required.
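The semi-automated idea can be sketched as follows. This is a hypothetical stand-in for the framework's manual verification step, not the actual implementation: automated checks run unattended, and where visual inspection is unavoidable the script pauses for an operator verdict (in practice, via a dialog on the test host).

```java
import java.util.function.Function;

// Sketch of a semi-automated verification step: the automated portion runs
// first, and the operator is consulted only for the segment that genuinely
// requires human judgment (e.g. the live ECG rendering).
public class SemiAutomatedStep {
    // Functional hook for the operator's yes/no answer to a question.
    public interface OperatorPrompt extends Function<String, Boolean> { }

    private final OperatorPrompt prompt;

    public SemiAutomatedStep(OperatorPrompt prompt) { this.prompt = prompt; }

    // Overall verdict: automated checks must pass, then the operator confirms.
    public boolean verifyEcgScreen(boolean automatedChecksPassed) {
        if (!automatedChecksPassed) return false;   // fail fast, no prompt needed
        return prompt.apply("Does the live ECG trace render correctly? (y/n)");
    }

    public static void main(String[] args) {
        // Scripted operator that always answers "yes", for demonstration only.
        SemiAutomatedStep step = new SemiAutomatedStep(q -> true);
        step.verifyEcgScreen(true);
    }
}
```

Injecting the prompt as a function keeps the step itself testable without a human in the loop.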
Figures 8 and 9 illustrate the automated verification framework.
Validation of the automation framework
The correctness of the test execution results depends heavily on the correctness of the automation framework. Hence it is critical that the framework be validated to the same degree as the riskiest component that is being verified – e.g. if a safety-critical component of a high risk level is being verified by the automation framework, then all the components within the automation framework that are utilized for the verification of that component need to be validated per the same verification process as required by the process documents (which we will discuss later).
However, the FDA guidelines do recommend a least burdensome approach to verification. For example, note that the framework illustrated in Figure 9 contains a mix of in-house developed utilities, such as the telemetry message verification (sequence verifier), and off-the-shelf components, such as adb and Robotium Solo. Depending on the risk level, off-the-shelf components can be verified by documenting the degree of industry usage and evaluating the published defect list of that software – mature software with significant industry usage has a low likelihood of major defects. Another strategy to minimize verification burden is to look for opportunities to verify the component downstream. For example, if every screenshot is verified manually through visual inspection, then the tool used to capture the screenshot can be deemed validated downstream as part of the screenshot verification protocol.
Figure 9: Test host and Android Programmer under test automation setup (semi-automation component marked in red)
In either approach, it is valuable to document the set of components, the risk level of each component, the verification strategy, and the reasoning behind the strategy choice.
Modularized test automation. Refactoring and modularizing the test scripts in a way that maximizes the reuse of common elements is critical in ensuring that future changes can be accommodated efficiently. Otherwise, a single modification to a central GUI element used by many scripts could cause a large number of those scripts to fail. The Common DVT Library component in Figure 8 shows the reusable library of commonly used test scripts. For example, a large number of the test procedures require a connection to the device prior to performing the specific verification. Hence, connecting to the device is a library element that can be invoked by scripts in other features.
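A plain-Java sketch of the shared-library idea. The `DeviceDriver` interface and step names below are hypothetical stand-ins for the Robotium-based calls in the real framework; the point is that the connect-to-device step lives in one place, so a GUI change is fixed once rather than in every feature script.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a common test-step library: feature scripts compose shared steps
// instead of duplicating UI interactions.
public class CommonTestLibrary {
    // Abstracts a Robotium Solo-style UI driver (illustrative, not the real API).
    public interface DeviceDriver {
        void tap(String element);
        boolean waitFor(String text);
    }

    private final DeviceDriver ui;
    private final List<String> log = new ArrayList<>();

    public CommonTestLibrary(DeviceDriver ui) { this.ui = ui; }

    // Shared step invoked by many feature scripts: scan for and connect to a device.
    public boolean connectToDevice(String deviceId) {
        ui.tap("Scan");
        if (!ui.waitFor(deviceId)) return false;   // device never appeared in scan list
        ui.tap(deviceId);
        boolean ok = ui.waitFor("Connected");
        log.add("connect:" + deviceId + ":" + ok);
        return ok;
    }

    public List<String> getLog() { return log; }

    // Fake driver for offline demonstration; always finds the requested text.
    public static class FakeDriver implements DeviceDriver {
        public void tap(String element) { }
        public boolean waitFor(String text) { return true; }
    }

    public static void main(String[] args) {
        CommonTestLibrary lib = new CommonTestLibrary(new FakeDriver());
        lib.connectToDevice("PG-001");
    }
}
```

Feature scripts would then call `connectToDevice(...)` as their first step and proceed to the feature-specific verification.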
Test execution and reporting. One of the goals of the framework was to have a seamless environment where the verification engineer can develop test scripts and launch them on the Programmer under test, either as part of a batch of tests or as a standalone test. Figure 11 illustrates the complete environment, where the test scripts are installed on the set of Programmers under test and the test results are collated and summarized in a set of HTML pages. The tests can be scheduled and partitioned across multiple devices. Since the whole environment (including the connection to the real or virtual/simulated device) is wireless, the setup is easy to deploy. We also maintained a history of all test executions for trending purposes.
Figure 11: Automated test environment with multiple Programmers under test concurrently. Note that the virtual device is a software tool installed on the tester PC that simulates the implantable device.
Telemetry message verification. The sequence of telemetry messages between the implantable device and the Programmer is predominantly deterministic. Commands from the Programmer are processed and acknowledged in series. Furthermore, multiple unique command sequences may have the same GUI response. Hence it became critical to verify the sequence of telemetry messages and not just the screen flow. Figure 12 illustrates the concept. The device simulator captures the sequence of telemetry events that are sent/received from the perspective of the device simulator.
The acceptance criteria are derived through manual inspection of the telemetry requirements or through log file extraction from a successful execution of the test procedure in question. For the test procedure to pass, the event sequence derived from the test procedure execution must match the event sequence defined by the acceptance criteria.
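The pass/fail comparison described above amounts to an ordered sequence match. The sketch below is illustrative (the class name and event strings are hypothetical, not from the case study): the captured telemetry events must equal the acceptance sequence, and on failure the first point of divergence is reported for debugging.

```java
import java.util.List;

// Minimal sketch of the sequence-verification step: captured telemetry events
// are compared, in order, against the acceptance-criteria sequence.
public class SequenceVerifier {

    // Pass only if the captured events exactly match the expected sequence.
    public static boolean matches(List<String> expected, List<String> captured) {
        return expected.equals(captured);
    }

    // First index at which the sequences diverge (including a length
    // mismatch), or -1 if the sequences are identical.
    public static int firstMismatch(List<String> expected, List<String> captured) {
        int n = Math.min(expected.size(), captured.size());
        for (int i = 0; i < n; i++) {
            if (!expected.get(i).equals(captured.get(i))) return i;
        }
        return expected.size() == captured.size() ? -1 : n;
    }

    public static void main(String[] args) {
        List<String> expected = List.of("SCAN", "CONNECT_REQ", "CONNECT_ACK", "READ_PARAMS");
        List<String> captured = List.of("SCAN", "CONNECT_REQ", "CONNECT_ACK", "READ_PARAMS");
        matches(expected, captured);
    }
}
```

An exact-match criterion is the strictest choice; where the protocol tolerates benign interleaved events, the comparison would instead check for the expected events as an ordered subsequence.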
This article is the first of a two-part series that describes the development of a Class III medical application on Android. We discussed the case study requirements, software architecture, and automated verification framework. There are many opportunities for improvement, ranging from using a more integrated tool chain that links all types of requirements to development and verification assets, to integrating cellular connectivity into the software architecture. The second part of this series will discuss a development process that complies with the regulatory requirements of a Class III medical device, and the outsourcing strategy that was utilized to complete this project within the budget and timing constraints.
Read Part 2: Building Class III medical software apps on an Android platform: Part 2 – Developing an FDA compliant framework
Sri Kanajan is a software engineering consultant with 11 years of experience with safety-critical embedded systems in both the automotive and medical device fields. He has 18 publications and three best paper awards in these categories. He has an MS in electrical engineering from the University of Michigan and an MS in software engineering from Carnegie Mellon. He can be contacted at
Shrirang Khare is currently working with Persistent Systems Ltd. as a software architect. He did his bachelors in science (electronics) followed by post graduate studies in computer management. He has more than 12 years’ experience working in the wireless embedded and telecom industry developing middleware and application software. He specializes in working across the entire software stack for mobile phones, tablets, and other embedded platforms, and is experienced in working with tier one handset manufacturers and leading ISVs.
Richard Jackson has been a software and systems engineer for over 20 years, specializing in real-time, mobile, and safety-critical systems. He has worked for high-profile companies such as Microsoft, IBM, and Boston Scientific, and has spoken at numerous industry conferences.
1. Image of Implanted S-ICD System
2. Object pool design pattern
3. FDA General Principles of Software Validation
4. Scrum Description
5. Smartphone and Tablet Penetration
6. Robotium, Android test automation framework