Anomaly detection is a family of analytic techniques that learn the typical properties of a system and report significant deviations from them as outliers. Anomaly detection is frequently used in state-of-the-art Intrusion Detection Systems (IDSs) because it can protect against new zero-day attacks whenever these attacks cause the system to deviate from its typical behaviour. Another advantage of anomaly detection techniques is that they don’t require a balanced training set in which malicious and benign events are equally represented. This makes them a better fit for real industrial systems, where malicious events are far rarer than benign events.
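As an illustration of the general idea only, the following minimal sketch learns a baseline from unlabelled, mostly benign observations and flags values that deviate strongly from it. The z-score rule, the function names `fit_baseline` and `is_anomaly`, and the threshold of 3 are assumptions chosen for the example; they are not taken from the FINSEC tool.

```python
import statistics

def fit_baseline(values):
    """Learn typical properties (mean and standard deviation) from observed values.

    No labels are needed: the observations are assumed to be mostly benign.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return mean, std

def is_anomaly(value, baseline, z_threshold=3.0):
    """Report a value as an outlier if it lies more than z_threshold
    standard deviations away from the learned mean (hypothetical rule)."""
    mean, std = baseline
    if std == 0:
        # Degenerate baseline: any value different from the mean is a deviation.
        return value != mean
    return abs(value - mean) / std > z_threshold

# Hypothetical example: requests per minute seen during normal operation.
baseline = fit_baseline([100, 102, 98, 101, 99])
print(is_anomaly(101, baseline))  # close to typical behaviour
print(is_anomaly(500, baseline))  # far from typical behaviour
```

Because the baseline is learned from whatever the system normally does, a previously unseen attack can still be reported as long as it pushes the measured quantity away from that baseline.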
In FINSEC we model the behaviour of both cyber and physical entities. This is done using physical probes (e.g. cameras) and cyber probes (e.g. Skydive, IDS) that analyse events and stream them to the analytics module. Together they capture a complete cyber-physical behaviour model of the financial-sector infrastructure. Due to the unique characteristics of these data sources, it is important to construct behavioural models for each source separately and for all of them as a whole. Treating each source separately allows the module to identify anomalies that are purely physical or purely cyber, with no manifestation in the other domain. Modelling the underlying connection between them allows the module to correlate and understand events across both domains.
Scalability of the solution is another challenge that had to be tackled at different levels. At the architectural level, the tool is built on Apache Spark, a state-of-the-art map-reduce platform. The tool itself is implemented in Python to leverage a rich toolset and a large community supporting state-of-the-art machine learning tools. The data sources for the tool, a Kafka stream and an Elasticsearch database, are also designed to support high-volume data scenarios.
Another important challenge is to reduce the false positive rate, which is often the major drawback of anomaly detection techniques. We apply a number of methods to address this challenge:
- Careful selection of analytics that produce clear, meaningful alerts such as data leakage or reconnaissance attacks.
- On-line learning that adaptively learns changes in the system's behaviour.
- Alert budgeting that adaptively selects a proper threshold to control the number of alerts without missing the most critical ones.
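The last two methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the FINSEC implementation: the class `OnlineBaseline`, the function `budget_alerts`, the exponentially weighted update and the fixed per-window budget are all hypothetical choices made for the example.

```python
import heapq
import math

class OnlineBaseline:
    """On-line learning sketch: an exponentially weighted mean and variance
    that keep adapting as the system's behaviour changes (assumed technique)."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha  # higher alpha means faster adaptation
        self.mean = None
        self.var = 0.0

    def score(self, x):
        """Anomaly score: distance from the current mean in standard deviations."""
        if self.mean is None:
            return 0.0
        std = math.sqrt(self.var)
        if std == 0:
            return 0.0 if x == self.mean else float("inf")
        return abs(x - self.mean) / std

    def update(self, x):
        """Fold a new observation into the baseline."""
        if self.mean is None:
            self.mean = float(x)
            return
        delta = x - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)

def budget_alerts(scored_events, budget):
    """Alert budgeting sketch: keep only the `budget` highest-scoring events.

    scored_events is a list of (score, event) pairs. Returns the selected
    pairs and the threshold the budget implies for this window.
    """
    if budget <= 0 or not scored_events:
        return [], float("inf")
    top = heapq.nlargest(budget, scored_events, key=lambda pair: pair[0])
    return top, top[-1][0]

# Hypothetical window of scored events with a budget of two alerts.
window = [(0.5, "login"), (9.0, "bulk download"), (2.0, "scan"), (7.5, "new host")]
alerts, threshold = budget_alerts(window, budget=2)
print(alerts, threshold)
```

In this sketch the threshold is not fixed in advance: it is whatever score the lowest-ranked alert within the budget happens to have, so the number of alerts stays bounded while the highest-scoring events are always kept.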
By Omri Soceanu, IBM