Tracking cybersecurity threats with Machine Learning to spot patterns: part two

In part 1 we looked at some of the challenges of identifying cyber attacks in a vehicle. In part 2, we will discuss the approach that we are taking in the Secure-CAV project.

At the University of Southampton we are working with engineers from Siemens to include our techniques inside their chip designs. The complete design fits into the demonstrator developed at Coventry University. The team of cybersecurity specialists at Copper Horse then evaluate the robustness of the security of the system. Although we are developing a proof of concept, it’s essential that our code is sufficiently robust to pass all of these engineering tests and to at least meet industry safety and security standards.

The approach we are taking is to look for anomalous behaviour on the CAN bus inside a vehicle. As we noted in part 1, we first have to define what we mean by normal behaviour, but this will vary according to the situation (for example, town or motorway) and between drivers.

Machine Learning (ML) is a name given to a wide range of methods. At root, however, ML is really an adaptive statistical approach. We identify a feature of interest in the communications on the system bus and then model how it changes over time. For example, we might look at how frequently the driver indicates a turn. This behaviour is modelled as the coefficients in an equation. In other words, the model is trained by calculating appropriate values of these coefficients. This model is then used to predict when we might expect the feature to next become active.
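To make the idea concrete, here is a minimal sketch in Python. It is not the Secure-CAV detector. It assumes the feature is the time gap between successive activations of one signal (for example, the indicator), and it uses a simple exponentially weighted average as the model, so there is only a single coefficient to train. The class name `IntervalModel`, the learning rate `alpha` and the timestamps are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntervalModel:
    """Adaptive model of the gap between activations of one feature (illustrative only)."""

    alpha: float = 0.2                 # learning rate; an assumed value, not a tuned one
    mean_gap: float = 0.0              # the trained coefficient: expected gap in seconds
    last_seen: Optional[float] = None  # timestamp of the most recent activation
    gaps_seen: int = 0                 # number of gaps used for training so far

    def train(self, timestamp: float) -> None:
        """Update the coefficient from one observed activation."""
        if self.last_seen is not None:
            gap = timestamp - self.last_seen
            if self.gaps_seen == 0:
                # The first gap initialises the coefficient.
                self.mean_gap = gap
            else:
                # Move the coefficient a fraction alpha towards the new observation.
                self.mean_gap += self.alpha * (gap - self.mean_gap)
            self.gaps_seen += 1
        self.last_seen = timestamp

    def predict_next(self) -> float:
        """Predict when the feature is next expected to become active."""
        if self.last_seen is None:
            raise ValueError("the model has not seen any activations yet")
        return self.last_seen + self.mean_gap

    def deviation(self, timestamp: float) -> float:
        """Distance in seconds between an observed activation and the prediction."""
        return abs(timestamp - self.predict_next())


model = IntervalModel()
for t in [0.0, 10.0, 20.0, 30.0]:      # made-up activation times in seconds
    model.train(t)

print(model.predict_next())            # predicted time of the next activation
print(model.deviation(45.0))           # how far a late activation is from the prediction
```

A large deviation is what a detector of this kind would treat as a possible anomaly. How large is "large" is a separate choice.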

Such ML techniques are relatively simple to implement, but the quality of the results depends on having a large amount of data for ‘training’, covering the range of situations and drivers that count as normal.

Our work on using ML to model anomalous behaviour grew out of a project that was looking at reliability within a microprocessor, where we were considering safety rather than cybersecurity. Moving from safety to security raises some very interesting questions.

No ML system can ever be perfect. The system will make incorrect predictions. Sometimes it will indicate an anomaly when none exists. Other times it will miss anomalies. The first case is known as a False Positive (FP) and the second is a False Negative (FN). We can bias the system to favour FPs over FNs, or the opposite.
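The trade-off can be illustrated with a short Python sketch. It assumes the detector produces a numerical anomaly score and raises an alarm when the score exceeds a threshold. The function name `count_errors`, the scores and the labels are all made up for illustration and are not project data.

```python
def count_errors(scores, is_attack, threshold):
    """Return (false_positives, false_negatives) for a given alarm threshold."""
    false_positives = 0
    false_negatives = 0
    for score, attack in zip(scores, is_attack):
        alarm = score > threshold
        if alarm and not attack:
            false_positives += 1   # alarm raised, but nothing was wrong
        elif attack and not alarm:
            false_negatives += 1   # real anomaly, but no alarm
    return false_positives, false_negatives


# Made-up anomaly scores and ground-truth labels, for illustration only.
scores = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9, 0.2, 0.55]
is_attack = [False, False, True, True, False, True, False, True]

for threshold in (0.3, 0.5, 0.7):
    fp, fn = count_errors(scores, is_attack, threshold)
    print(f"threshold={threshold}: FP={fp}, FN={fn}")
```

With this toy data, the lowest threshold gives two FPs and no FNs, while the highest gives no FPs and two FNs. Moving the threshold is one simple way of biasing a detector in either direction.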

From the point of view of safety, FPs are annoying, but FNs can be deadly. This may be less true for security, but if a user constantly gets false alarms they will eventually learn to ignore or disable any warnings. Therefore, we need to be careful how we handle any warnings.

Then there is the question of how to respond to a warning. The safest response to an anomaly may be to get the vehicle to the side of the road and stop. But this could be exactly what an attacker is trying to achieve. Ultimately, these decisions need to be considered in a wider context that includes matters such as insurance.

We cannot hope to answer all of these questions within the Secure-CAV project. We regard our on-chip ML detectors as a first-level filter. We foresee that, over time, the data collected by such detectors will be aggregated over many vehicles. Indeed, it seems likely that many cyber attacks will be aimed at numerous vehicles rather than at one specific driver. If so, the decisions about how to respond to an attack will be made in the cloud using this aggregated data.
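As a purely speculative sketch of what such aggregation might look like, the Python below counts how many distinct vehicles report an anomaly on the same feature within a time window. Nothing here is part of the Secure-CAV project: the function `fleet_wide_anomalies`, the report format, the vehicle names and the `min_vehicles` threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def fleet_wide_anomalies(reports, window_start, window_end, min_vehicles):
    """Return the features flagged by at least min_vehicles distinct vehicles in the window.

    Each report is a (vehicle_id, feature, timestamp) tuple; this format is hypothetical.
    """
    vehicles_by_feature = defaultdict(set)
    for vehicle_id, feature, timestamp in reports:
        if window_start <= timestamp < window_end:
            # A set counts each vehicle once, however many times it reports.
            vehicles_by_feature[feature].add(vehicle_id)
    return sorted(
        feature
        for feature, vehicles in vehicles_by_feature.items()
        if len(vehicles) >= min_vehicles
    )


# Made-up reports from on-chip detectors, for illustration only.
reports = [
    ("car-a", "indicator", 5),
    ("car-b", "indicator", 7),
    ("car-a", "indicator", 8),
    ("car-c", "brake", 9),
    ("car-d", "indicator", 50),
]

print(fleet_wide_anomalies(reports, window_start=0, window_end=10, min_vehicles=2))
```

In this toy example only the indicator anomaly is reported by more than one vehicle inside the window, so it is the only one that would be passed on for a fleet-level decision.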
