Abstract - safety

Traditionally, safety-critical systems isolate the safety-related functions, ideally in a simple node that covers a minimal functionality and nothing else. Such safe computing nodes traditionally run on "simple" single-core processors and use a minimal software set [1]. Contemporary single-core CPUs are no longer simple, and the growing complexity of systems (e.g. network security requirements, complex control algorithms and even cognitive functions for autonomy) raises the demands beyond what small and simple single-core CPUs can handle. This traditional approach to functional safety is changing, as nicely expressed by the NASA procedural requirements for safety-related software:

"This Standard does not discourage the use of software in safety-critical systems. When designed and implemented correctly, software is often the first, and sometimes the best, hazard detection and prevention mechanism in the system." [NASA NPR 8719.13B 1.2]

The changes noted above coincide with two significant developments of the past decade that affect the design of safety-related systems:

  • broad introduction of multi-core CPUs
  • significant change in the development dynamics

Staying with "simple" single-core computers would come at the price of de-coupling from mainstream chip and computer-science development, which in the long run may induce more risks than it mitigates. In this overview we outline how the SIL2LinuxMP project is approaching the qualification/certification of the Linux RTOS.

SIL2LinuxMP Concept

The goal of SIL2LinuxMP is to qualify GNU/Linux RTOS and its support environment (e.g. development tools) for Safety Integrity Level 2 (SIL2) [2] according to IEC 61508 Ed2 (2010).

The key technology for this project is the GNU/Linux RTOS as the OS for a multi-core platform. Safety certification of GNU/Linux systems has been achieved in the context of specific industrial projects, such as rail interlocking systems or fire & safety systems, but not as a generic qualifiable component, which is the goal of SIL2LinuxMP. To enable the GNU/Linux RTOS for the general safety-related systems domain, the key point is to assess current procedures and, where necessary, to develop suitable amendments or additional processes and procedures along with suitable methods - in other words, a classical gap analysis and mitigation.

Organizational Structure

Any safety-related system has three main threads executing concurrently:

  • A safety management process
  • The technical safety process
  • And the Certification Authority (CA) liaison process

For SIL2LinuxMP this overall structure is spread over a number of participants (full participants, reviewing participants, academic participants) that fulfill different roles. The participants' workflow is coordinated by the OSADL Safety Critical Linux Working Group, with the infrastructure located at OSADL in Heidelberg. The safety management process will be primarily the responsibility of OSADL, with contributions from all participants. At the technical level the roles are more differentiated: full partners contribute domain know-how for their specific use-cases, and reviewing partners (as the name indicates) provide a first level of semi-independent review. Specific tasks, be they analytical work, the development/tailoring of tools or the generation of suitable evidence data, will be the main contribution of our academic partners as well as of OSADL staff.

The liaison process with the CA, which was (informally) initiated with the first kickoff meeting at TueV Rheinland, will be formalized and will encompass all participants. The overall technical infrastructure and infrastructure management (e.g. Internet services such as content management, mailing lists, etc.) is provided by OSADL and its staff.

Scope

The SIL2LinuxMP project aims at the qualification/certification of the base components of an embedded GNU/Linux RTOS running on a multi-core industrial COTS computer board. The base components are the boot loader, the root filesystem, the Linux kernel with a well-defined subset of drivers, and the C library bindings (glibc using NPTL) used to access the Linux kernel. With the exception of a minimal set of utilities (to inspect the system, manage files and start test procedures), user space applications are not in scope.
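
Since the scope names glibc with NPTL as the C library binding, a target image could be checked for this property at run time. The following is a minimal Python sketch, not a project tool: it assumes a Linux system whose Python build exposes the glibc-specific confstr name CS_GNU_LIBPTHREAD_VERSION, and the helper names libpthread_version and uses_nptl are made up for this illustration. On other C libraries or platforms the helper simply returns None.

```python
import os


def libpthread_version():
    """Return the thread library string reported by glibc, or None."""
    name = "CS_GNU_LIBPTHREAD_VERSION"  # glibc-specific confstr name
    if name not in getattr(os, "confstr_names", {}):
        return None  # not glibc, or not a POSIX platform (assumption)
    try:
        return os.confstr(name)
    except (OSError, ValueError):
        return None


def uses_nptl():
    """True if the reported thread library string starts with 'NPTL'."""
    version = libpthread_version()
    return version is not None and version.startswith("NPTL")


if __name__ == "__main__":
    print(libpthread_version(), uses_nptl())
```

A check like this only confirms what the running C library reports about itself; it does not replace the configuration evidence a qualification would require.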

SIL2LinuxMP Big Picture

Context

An element of a safety-related system can only be analyzed and ultimately qualified if it is operating in a defined context. From the perspective of IEC 61508 Ed 2 we see three main building blocks of the system context for contemporary systems.

Functionality

Any safety-related system will require a set of functionalities from the OS/hardware in order to provide its specific safety functions for detection, prevention or intervention. While GNU/Linux has a large set of functional capabilities, the subset of functionality actually needed to implement safety-related functions is significantly smaller. There are well-defined standard profiles, like the POSIX PSE 51..54 minimum real-time system profiles, or API subsets of traditional safety-related OS, e.g. OSEK, that could serve as the basis for a reasonable API subset selection.

For the non-safety-related components, ideally no constraints are introduced with respect to their API usage. In practice, though, some constraints will also be imposed on the API permitted for the non-safety-related applications, to ensure key properties like isolation and sound overload behavior.
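
The idea of a restricted API subset for both kinds of components can be illustrated with a simple allow-list check. The Python sketch below is an illustration under assumptions only: the two symbol sets are hypothetical examples, not the actual contents of any POSIX PSE profile or of a SIL2LinuxMP selection, and the function name check_api_usage is invented here.

```python
# Hypothetical allow-lists; NOT the real contents of a POSIX PSE profile
# and not a selection made by the SIL2LinuxMP project.
SAFETY_API = frozenset({
    "clock_gettime", "clock_nanosleep",
    "pthread_create", "pthread_join",
    "pthread_mutex_lock", "pthread_mutex_unlock",
    "sem_post", "sem_wait",
})

# Non-safety applications get a wider, but still bounded, set (assumption).
NON_SAFETY_API = SAFETY_API | frozenset({
    "open", "read", "write", "close", "socket", "connect",
})


def check_api_usage(used_symbols, allowed):
    """Return the sorted list of used symbols that are outside `allowed`."""
    return sorted(set(used_symbols) - set(allowed))


if __name__ == "__main__":
    # Symbols an application is assumed to call (made-up example input).
    app_symbols = ["pthread_create", "clock_nanosleep", "fork", "socket"]
    print(check_api_usage(app_symbols, SAFETY_API))      # ['fork', 'socket']
    print(check_api_usage(app_symbols, NON_SAFETY_API))  # ['fork']
```

In practice the list of used symbols might be obtained by inspecting the undefined dynamic symbols of a binary, for instance with GNU nm; how such a check would fit into the project's actual process is not stated in this overview.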

Communication

Practically all safety-related systems are communication systems, be it for status reporting, monitoring of physically separated computing nodes or critical operations like a software update. Thus, any practically usable GNU/Linux system subset needs to consider the communication demands as well as the constraints on communication. This is addressed by the use-cases as well as by selecting appropriate standards to guide us on communication aspects.

Security

IEC 61508 Ed 2 is focussed on the functional safety of E/E/PE systems, and to achieve this it also needs to take security issues into consideration where necessary:

.... If the hazard analysis identifies that malevolent or unauthorised action, constituting a security threat, as being reasonably foreseeable, then a security threats analysis should be carried out. [IEC 61508-1 Ed 2 7.4.2.3]

Putting these issues together, we arrive at the overall context of SIL2LinuxMP:

SIL2LinuxMP Context

Concluding Remarks

With the growing functional, performance and security demands in industrial safety-related systems, traditional safety-related OS are showing their age - the future of safety-related computing systems will need to build on powerful mainstream multi-core processors, for which GNU/Linux is a good and sometimes the best option.

We believe that SIL2LinuxMP, or more generally GNU/Linux for safety, is part of this future. After a rigorous investigation of 61508 Ed1 and Ed2 as well as a number of derived application sector standards, we are confident that the framework of 61508 Ed2 is in fact sufficiently robust to handle the qualification of an open-source project like GNU/Linux at SIL2.

OSADL Safety Critical Linux Working Group

References:

  • [1]: IEC 61508-3 Ed 2 7.4.2.6 As far as practicable the design shall keep the safety-related part of the software simple.
  • [2]: It would be more correct to refer to SC2 (Systematic Capability), but it seems common to equate this with SIL2.