Managing the risks of genAI through a safety-critical lens
Posted on Tuesday 12 May 2026
Traversing the unknowns of genAI
Nobody actually knows what genAI is going to do for us, or to us. It is built on technologies that only a minority understand, and that no-one understands deeply or fully in the way we understood the computer systems of the past. We have even less understanding of how it will affect our work, our day-to-day lives, and society in general. Views on genAI are many and diverse, often stridently expressed by unshakably confident people, and many of them are wholly contradictory. They cannot all be right, yet every one has confident advocates; to talk in absolutes or certainties in the current AI climate is foolish or dishonest.
Despite the speed of change in genAI — the ChatGPT of 2022, built on GPT-3.5, was a pale shadow of the ChatGPT 5.5 that is current as I write this — many people need to make decisions about genAI now. We need to decide whether and how to use it in our own work, in the work of people we supervise, and in the work of people we teach. We have to do this under the conditions of uncertainty described above, and under pressure from stakeholders in all directions.
The problem here is that, as a society, we do not know how to use genAI well, nor how to avoid using it badly. And it is a jagged-edged technology — the jagged frontier between where it works badly and where it works well is very difficult for humans to understand. And, of course, we have had very little time to gain this understanding, and ever less opportunity to do so as the abilities of genAI advance.
The social context is even less clear. I do not know whether to use the em-dash in my writing — a common feature of both professional and academic writing, and one I used for years before genAI was available — because its prevalence in genAI output will make some readers cry ‘AI!’ when they see it.
Taking the appropriate level of responsibility
When we set out to use genAI for some specific purpose, we need to be appropriately responsible. This means taking responsibility for using genAI well, for setting up processes that let us find errors, and for monitoring the quality of our genAI-using processes so that we know the type and prevalence of the errors that occur. Given the knowledge that accumulates from doing all that, we then need to decide how to change our use of genAI, and whether our specific use is safe to continue.
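To make ‘monitoring the quality of our genAI-using processes’ slightly more concrete, here is a minimal sketch of the kind of record-keeping involved. It assumes a human review step, and the error categories in it are hypothetical illustrations rather than any standard taxonomy.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class GenAIUsageLog:
    """Minimal record-keeping for a genAI-using process (illustrative only).

    Each human-reviewed output is logged with the error categories the
    reviewer found, so that over time we learn both the type and the
    prevalence of errors in our specific use.
    """
    outputs_reviewed: int = 0
    error_counts: Counter = field(default_factory=Counter)

    def record_review(self, errors_found: list[str]) -> None:
        """Log one reviewed output; errors_found may be empty."""
        self.outputs_reviewed += 1
        self.error_counts.update(errors_found)

    def prevalence(self, error_type: str) -> float:
        """Fraction of reviewed outputs exhibiting a given error type."""
        if self.outputs_reviewed == 0:
            return 0.0
        return self.error_counts[error_type] / self.outputs_reviewed

# Hypothetical usage; the categories are examples, not a standard taxonomy.
log = GenAIUsageLog()
log.record_review([])                                          # output was fine
log.record_review(["fabricated_reference"])                    # hallucinated citation
log.record_review(["omitted_caveat", "fabricated_reference"])  # two problems at once
print(f"fabricated references in {log.prevalence('fabricated_reference'):.0%} of outputs")
```

Even something this simple forces the questions that matter: who reviews the outputs, what counts as an error, and what prevalence of errors is acceptable for this particular use.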
In simple uses of genAI, where it is easy to tell whether its output is good and the risks are in any case low, we don’t need to do very much. Generating images to use in place of traditional clip art or stock photos is low risk and requires little oversight. We can just use “off the shelf” tools like Gemini or ChatGPT without a formal process around our use.
In more complex uses of genAI, and where the risks are higher, we need to take more care and effort. If we are making decisions about people’s access to credit or housing, training people on how to use our company’s product, or summarising the results of medical investigations, we need to foresee likely failures, assess their impact, and take steps to prevent those failures and to detect or mitigate them when they occur. Then, we need to create explicit processes for monitoring what happens and for confirming that we are indeed avoiding or mitigating failures. In other words, we need to build a wider “system” around the genAI tool that compensates for its potential to go wrong. The effort involved here may be much more like “building a custom software system” than “using a program that we bought from a vendor”. Given the jagged capabilities of genAI, a use that seemed to work well in our prototype may turn out to be unviable in practice because of reliability, safety, or security concerns.
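As a rough sketch of what building a wider “system” around a genAI tool might look like, the code below wraps a (stubbed) model call in automated checks and escalates any failing output to a human review queue rather than passing it on. Every name here (call_model, the checks, the queue) is a placeholder assumption; real checks would have to come from an analysis of the failures that actually matter in the domain.

```python
# A minimal "wider system" around a genAI call: automated checks plus
# escalation to a human reviewer. All names here are illustrative stubs.

review_queue: list[tuple[str, str]] = []   # (output, reason) awaiting a human

def call_model(prompt: str) -> str:
    """Stand-in for the underlying genAI service call."""
    return "A stubbed summary, longer than some sources."  # placeholder output

def summary_checks(source: str, output: str) -> list[str]:
    """Run simple automated checks; return the names of any that fail.
    Real checks would be derived from an analysis of foreseeable failures."""
    failures = []
    if len(output) >= len(source):
        failures.append("summary_not_shorter_than_source")
    if not output.strip():
        failures.append("empty_output")
    return failures

def guarded_summarise(source: str) -> str | None:
    """Generate a summary, but never pass on an output that fails a check:
    queue it for human review instead and return None."""
    output = call_model(f"Summarise the following:\n{source}")
    failures = summary_checks(source, output)
    if failures:
        review_queue.append((output, ", ".join(failures)))
        return None  # caller must handle the "needs a human" case
    return output

result = guarded_summarise("A short source text.")
print(result if result is not None else f"escalated: {review_queue[-1][1]}")
```

The point is not the specific checks, which are trivial here, but the shape: the genAI call becomes one component inside a process that detects failures and routes them to a human, and the review queue itself becomes raw material for the monitoring described above.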
In the safety-critical space with which the CfAA is concerned, the risks of genAI are particularly acute. There are many ways we could use genAI in a safety engineering process — in system modelling, in hazard analysis, in requirements elicitation, in safety case creation, or in accident analysis. These are all activities, however, that are valuable because they generate new information — they discover things that we did not know before. Although they do generate artefacts — hazard lists, fault trees, safety case documents — those artefacts are not the point. In many cases, e.g. the top-level probability on a fault tree, the validity of the artefact is deeply questionable. The value, instead, is the insight we gain into the safety-critical system that we are working with. That insight is a function of the process more than of the result, and there is no way to look at the result alone and judge how well the process has served us. It may be possible to benefit from genAI in this kind of use, but we should approach it with great caution.
GenAI could potentially be used for direct control of safety-critical behaviour. For example, it could be used to control treatment actions in healthcare, or to turn verbal instructions into movements by an autonomous car. Both of those uses could be extremely dangerous, and would make safe behaviour depend on a technology radically different to any used in safety-critical systems before, and for which we have almost none of the guarantees that safety-critical software has come to depend on. Before we implement such uses, we should do extensive preparatory work, involving lab research, cross-industry standards-setting, and development of safety analyses using techniques specialised for this purpose. The road to this will be, and should be, long.
More than anything else, as users of genAI we need to take responsibility for the effects that our use has. If our genAI use is likely to have harmful effects, we should foresee this, take steps to mitigate the harms, and monitor our live use. Most organisations will not be able to do this on their own — as a society we will need to develop standards, processes, and practices to do this well. This is where we can lean on established and trusted practices in the safety-critical space, and it is an area in which the CfAA is developing its safety research.