Long-term maintainability of safety properties - why context awareness is so crucial to manage complexity

SIL2LinuxMP uses a "maximize context" approach: we do not aim for a Safety-Element-out-of-Context or generic qualification but for a use-case-driven qualification that exploits the specific potential of the system and mitigates only the relevant failure modes. One reason for this is to keep the generic elements manageable over the system's lifetime. In this short note we outline this based on the 4.14 LTS kernel: an overall analysis of bug trends, a look at some subsystems (approximated by looking at the respective subdirectory), and the result of analyzing the specific patch set that applies to our reference use-case kernel specification, based on the PIT tool (link to pit tool page).

linux-stable overall and mm show semi-stable development: the prediction based on a negative-binomial regression (chosen to address over-dispersion) is degressive, while the confidence intervals diverge. In contrast, the kernel core code (located in kernel/ of the repository) is unstable, showing a rising prediction and diverging confidence intervals. Finally, the filesystem subsystem as a whole shows very significant instability. Note that this is a long-term-stable kernel, so all the changes here are back-ports and/or security fixes only, with no new features since 4.14.0 (which is the starting point of all graphs). So this might point to a highly unstable kernel development and to the whole idea of using GNU/Linux for safety being plain wrong. The problem is that the Linux kernel is a large set of elements (e.g. 40+ filesystems, 30+ architectures, hundreds of Ethernet drivers, etc.), and looking at the sum total of 20M LoC gives the correct prediction only if one intends to use more or less everything GNU/Linux provides. The conclusion then would be: don't use it!
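To make the over-dispersion argument concrete, here is a minimal sketch of the kind of check that motivates a negative-binomial model over a Poisson model for patch counts. It is not the analysis behind the figures: the counts are made-up illustration values, the function name is hypothetical, and the dispersion parameter is a simple method-of-moments estimate (assuming the common parameterization variance = mean + alpha * mean^2), not a fitted regression.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Summarize how far count data deviates from a Poisson assumption.

    For Poisson data the variance equals the mean, so a
    variance-to-mean ratio well above 1 indicates over-dispersion.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 in the denominator)
    # Method-of-moments estimate of the negative-binomial dispersion,
    # assuming variance = mean + alpha * mean**2; clamped at 0.
    alpha = max(0.0, (v - m) / (m * m))
    return {
        "mean": m,
        "variance": v,
        "dispersion_index": v / m,
        "nb_alpha": alpha,
    }


# Hypothetical patches-per-sublevel-release counts (NOT real kernel data).
counts = [12, 3, 25, 7, 40, 5, 18, 2, 31, 9]
summary = dispersion_summary(counts)
print(summary)
```

If the dispersion index is close to 1, a Poisson model may be adequate; values well above 1, as in this made-up series, are the situation in which a negative-binomial regression is the more defensible choice.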

If one introduces systematic selection and configuration using conservative selection criteria (subsystems that have been in the kernel for a long time, are in wide use and show good overall development meta-data attributes), the picture changes radically. For the same kernel, the 5th plot is the ARM64 SIL2LinuxMP reference configuration (network subset, selected filesystem specific drivers, cgroups, namespaces, etc.), and it shows a completely different development. The regression analysis is stable not only for a prediction interval of twice the data interval but also for four times the data interval, and the absolute values are below 10 patches applied per sublevel release, which means this is technically manageable with respect to impact analysis and possible system updates.
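The effect of selection on patch counts can be illustrated with a small sketch. This is not how the PIT tool works; it only shows the idea of counting, per sublevel release, the patches that touch a selected part of the tree. The function name, the patch list and the path prefixes are all hypothetical, and a directory prefix is only an approximation of a subsystem or a configuration (a real configuration would have to be resolved to the source files actually built).

```python
from collections import Counter


def patches_per_sublevel(patches, prefixes):
    """Count patches per sublevel release that touch the selection.

    patches:  iterable of (sublevel, [touched file paths])
    prefixes: tuple of path prefixes approximating the selection
    """
    counts = Counter()
    for sublevel, paths in patches:
        # A patch is relevant if any touched file is in the selection.
        if any(path.startswith(prefixes) for path in paths):
            counts[sublevel] += 1
    return dict(counts)


# Hypothetical patch metadata (NOT real linux-stable history).
patches = [
    (1, ["fs/btrfs/inode.c"]),
    (1, ["kernel/sched/core.c"]),
    (2, ["fs/ext4/inode.c", "mm/filemap.c"]),
    (2, ["drivers/net/ethernet/intel/e1000/e1000_main.c"]),
]

full_kernel = patches_per_sublevel(patches, ("",))  # empty prefix matches all
selection = patches_per_sublevel(patches, ("kernel/", "mm/", "fs/ext4/"))
print(full_kernel, selection)
```

In this toy input the selection sees half of the patches of the full tree; the per-sublevel counts obtained this way are the kind of series that the regression analysis is then applied to.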

The significant discrepancy between the full kernel and the selection is in part due to all subsystems being under active development; it is inherent in the evolutionary development model that some newer filesystems may actually be unstable, or that security issues only emerge once they go into wide use. It is not possible to deduce the stability of a system or element from regression analysis alone. This stability can only be claimed IFF the underlying development life-cycle carries the necessary attributes with respect to feedback (bug-tracking and testing), monitored deployment (e.g. distribution endorsement) and suitable breadth of work-loads (including users without adequate knowledge as "work-loads"). This is essentially why open-source, IFF analyzed, selected and monitored, carries promising properties for safe and secure systems.

I hope these 5 figures can make the "in-context" or rather "maximum-context-aware" approach more tangible.

Overall linux-stable

linux-stable kernel core

linux-stable memory management subsystem

linux-stable filesystem subsystem (prediction window limited to twice the data window)

ARM64 SIL2LinuxMP minimal configuration (the prediction window was actually 44000h (5 years) but was truncated to have the same x-axis as the other figures; the y-axis was left the same as for the full kernel so that the impact of selecting a specific/constrained setup is immediately clear).