Adaptive and Intelligent Data Collection

Updated: Dec 5, 2019

The security of critical financial infrastructure and services must be tracked and maintained by collecting and analysing security-related data in an intelligent, resilient, efficient, secure and timely manner. Making security data collection and analysis intelligent, so that it can quickly spot, learn from, and address zero-day threats, is essential for economizing resources and for accessing the right information at the right time. This is done by configuring the data collection probes and adapting the collection strategies. The nature and quality of the collected data affect the efficiency and accuracy of attack detection and defence methods. Detection capability can thus be greatly improved by correlating wide-ranging data sources and by applying predictive analytics. Appropriate levels and types of intelligence and adaptability in security monitoring are achieved through different means of adaptive data collection and predictive analytics. This matters for both physical and cyber security, as it provides a means of tuning the data collection rate at the various monitoring probes.

For this purpose, FINSEC has refined the concept of adaptive multi-layer data collection by adapting different approaches. Adaptability refers to how a collection mechanism can adjust to different environmental contexts and situations. The project integrates smart security probes and a set of adaptive strategies for the multi-layer data collection functionality, which includes rendering adaptiveness and intelligence, optimizing bandwidth and storage of security information, and boosting the intelligence of the probes. Security data analytics methods are integrated into the process as level-specific analytics at the appropriate levels. While predictive/regression algorithms such as linear regression, Support Vector Regression, logistic regression, KNN regression, and Random Forest will be considered for the lightweight analysis of adaptive strategies, deep learning mechanisms will be considered for the identification of complex risk and attack patterns. A set of rules (both static and adaptive) will be defined for data processing and analysis, configuration, collection, and adaptation. The figure depicts the architecture of adaptive multi-layer data collection and analysis, which extends the classical data collection and analytics process of data collection, data parsing, data analysis and data processing. The approach makes this process adaptive by introducing a feedback control phase and letting the data collection depend on the result of the last data processed.
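As a rough illustration of the lightweight predictive step mentioned above, the following sketch fits a linear trend to recent observations and extrapolates one step ahead. It is not the FINSEC implementation: the function name `predict_next` and the idea of feeding it per-interval event counts are assumptions made for this example, and ordinary least squares stands in for whichever regression method the project finally selects.

```python
from typing import Sequence


def predict_next(values: Sequence[float]) -> float:
    """Fit y = a + b*t by ordinary least squares over t = 0..n-1
    and return the extrapolated value at t = n.

    `values` is assumed (for this sketch) to hold recent per-interval
    measurements, such as event counts reported by a probe.
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one observation")
    if n == 1:
        # A single point carries no trend; repeat it.
        return float(values[0])

    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    # Closed-form least-squares slope and intercept.
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_t
    return intercept + slope * n
```

A feedback controller could compare the predicted value with a threshold before the next collection round, so that the collection settings depend on the most recently processed data.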

The process modules are the data collector (Monitor), the data parser and analyzer (Analyzer), and the data processor (Adapter). The arrows between modules indicate the direction of data flow and control. The Monitor module instructs the multi-layer probe APIs, e.g. skydive, to collect data. The multi-layer probe APIs/Event APIs collect data from cyber and physical assets at different levels (individual asset, combined assets, integrated process, and supply chain), store the data in the DB, e.g. Elasticsearch, and notify the Monitor module. The Monitor module then passes the data to the Analyzer module. The Monitor module maps to the Data Collection module in the FINSEC reference architecture. The Analyzer module transforms the raw data into standard data in accordance with the data model defined in the project, calls predictive analytics such as anomaly detection to analyse the data, and converts the standard data into service data (threats, anomalies, attacks, etc.). It then passes the service data to the Adapter module. The Analyzer module maps to the Predictive, Anomaly detection and Risk Assessment services in the FINSEC reference architecture. The Adapter module handles the service data according to its value: it adapts the collection strategies and controls the Monitor module, and it sends notifications to external modules such as alarms, data visualization tools, or a database. The Adapter module maps to the Actuation and Actuation Enabler in the FINSEC reference architecture. The issue of false positives is addressed to ensure reliability and accuracy, and the privacy and security of the collected and analyzed data are protected using encryption.
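The Monitor, Analyzer and Adapter loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: the class layout, the `ServiceData` shape, the z-score anomaly test, and the halving and 1.5x back-off rules are all hypothetical choices made for the example, and the probe APIs and the data store are replaced by a reading passed in directly.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import List


@dataclass
class ServiceData:
    """Analysis result handed to the Adapter (hypothetical shape)."""
    value: float
    anomaly: bool


@dataclass
class Monitor:
    """Data collector; holds the current collection interval."""
    interval_s: float = 60.0

    def collect(self, probe_reading: float) -> float:
        # A real Monitor would instruct probe APIs and read from the DB;
        # here the reading is simply passed through.
        return probe_reading


@dataclass
class Analyzer:
    """Flags anomalies with a simple z-score test (an assumed stand-in)."""
    threshold: float = 3.0
    history: List[float] = field(default_factory=list)

    def analyze(self, raw: float) -> ServiceData:
        anomaly = False
        if len(self.history) >= 5:
            mu, sigma = mean(self.history), pstdev(self.history)
            anomaly = sigma > 0 and abs(raw - mu) / sigma > self.threshold
        self.history.append(raw)
        return ServiceData(value=raw, anomaly=anomaly)


@dataclass
class Adapter:
    """Feeds the analysis result back into the Monitor's interval."""
    min_interval_s: float = 5.0
    max_interval_s: float = 60.0
    notifications: List[str] = field(default_factory=list)

    def adapt(self, data: ServiceData, monitor: Monitor) -> None:
        if data.anomaly:
            # Collect more often and notify external modules.
            monitor.interval_s = max(self.min_interval_s, monitor.interval_s / 2)
            self.notifications.append(f"anomaly: value={data.value}")
        else:
            # Relax gradually back toward the maximum interval.
            monitor.interval_s = min(self.max_interval_s, monitor.interval_s * 1.5)


def run_cycle(reading: float, monitor: Monitor, analyzer: Analyzer,
              adapter: Adapter) -> ServiceData:
    """One pass: collect, analyze, then adapt the collection."""
    raw = monitor.collect(reading)
    service_data = analyzer.analyze(raw)
    adapter.adapt(service_data, monitor)
    return service_data
```

Calling `run_cycle` repeatedly with steady readings leaves the interval at its maximum, while a sharp outlier shortens it and records a notification, which mirrors the control arrow from the Adapter back to the Monitor.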

In this way, increased automation and optimization of bandwidth and storage of security information are achieved using adaptive collection strategies based on security threats, content variation, collection/sampling rate, bandwidth variation/communication dynamics, application needs, context changes, and storage needs. In this first phase, a prototype implementation of the adaptive and intelligent security monitoring infrastructure is provided. It covers predictive analytics and describes the most relevant approaches for analyzing the collected data and detecting attack patterns. In addition, the security threat and collection rate strategies are implemented. Various alternative adaptive strategies are also defined: (i) application layer adaptive collection strategies (Request start duration, Request duration, Request management duration, Response duration, and Next request start duration), (ii) adaptive techniques for data acquisition for anomaly detection (More historical data, Physical measurement, Change of acquisition, and Rate of acquisition), and (iii) adaptive data collection for enhanced security analysis (Data Collection manager for reconfiguring the infrastructure of XL-SIEM agents, Threat intelligence update service, and Adaptive security module, which analyses the events and alarms generated). The combination of these three architectural elements implements a feedback loop of collection, detection and prevention that allows for early detection of security compromises and consistently makes security analysis more effective.
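To make the interplay between a security threat strategy and a collection rate strategy more concrete, here is a small sketch. Everything in it is an assumption for illustration: the three threat levels, the interval values in `INTERVAL_BY_THREAT`, and the bandwidth cap are hypothetical and do not come from the FINSEC prototype.

```python
from enum import IntEnum


class ThreatLevel(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2


# Hypothetical mapping from threat level to sampling interval (seconds).
INTERVAL_BY_THREAT = {
    ThreatLevel.LOW: 60.0,
    ThreatLevel.MEDIUM: 15.0,
    ThreatLevel.HIGH: 1.0,
}


def select_interval(threat: ThreatLevel, sample_bytes: int,
                    budget_bytes_per_s: float) -> float:
    """Return the interval requested by the threat level, lengthened if
    needed so that sample_bytes / interval stays within the bandwidth budget.
    """
    if sample_bytes < 0 or budget_bytes_per_s <= 0:
        raise ValueError("sample size must be >= 0 and budget must be > 0")
    wanted = INTERVAL_BY_THREAT[threat]
    # Shortest interval the bandwidth budget can sustain.
    min_allowed = sample_bytes / budget_bytes_per_s
    return max(wanted, min_allowed)
```

With a 1000-byte sample and a 500 bytes/s budget, a HIGH threat level would ask for a 1-second interval, but the cap lengthens it to 2 seconds. The design choice here is that the threat level sets the desired rate and the resource constraint only ever slows it down.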

By Habtamu Abie, NRS


This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 786727.
The content reflects only the authors’ views, and the European Commission is not responsible for any use that may be made of the information it contains.