The standard approach to Big Data in cyber security is like the police investigating a crime scene: analyzing the past. In the same spirit, companies are pouring petabytes' worth of historical information into data warehouses and lakes, where it waits to be analyzed by an assortment of sophisticated tools. But do you really want your company to be a crime scene? Of course not; you want to make sure the “do not cross” tape and chalk outlines never appear in the first place.
With intelligent security, the goal of analytics is not only to understand the anatomy of past attacks, but to identify and act on threatening patterns as they occur. Yet even for companies that can already analyze historical data alongside streaming data, stemming an attack in progress requires tools that go beyond conventional analytics bundles.
The complexity of detecting deviations in real time
To understand what is abnormal, you need to know what is normal. Detecting and stemming an anomaly that may signify an attack requires establishing a historical baseline that can be quickly compared with what is happening right now. For the massive amounts of legacy data companies are collecting to be useful in security analytics, that data must be contextualized immediately by correlating it with streaming data.
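As a rough illustration of that baseline idea (a minimal sketch, not Logtrust's implementation), consider flagging a metric in the live stream whenever it deviates sharply from a rolling historical baseline. The metric, window size, and threshold below are assumptions made purely for the example:

```python
from collections import deque
from statistics import mean, stdev

# Hypothetical example: compare a live per-minute count of failed logins
# against a rolling historical baseline and flag sharp deviations.

class BaselineDetector:
    def __init__(self, window_size=1440, threshold=3.0):
        self.history = deque(maxlen=window_size)  # e.g. the last 24h of per-minute counts
        self.threshold = threshold                # z-score above which we raise an alert

    def observe(self, count):
        """Score one new observation against the baseline, then fold it in."""
        anomalous = False
        if len(self.history) >= 30:               # need enough history for a stable baseline
            mu = mean(self.history)
            sigma = stdev(self.history) or 1.0    # guard against a zero-variance baseline
            anomalous = abs(count - mu) / sigma > self.threshold
        self.history.append(count)                # the stream keeps extending the baseline
        return anomalous

detector = BaselineDetector()
for count in [10, 12, 9, 11, 10, 13, 12, 10, 11, 9] * 6:   # warm up on historical data-at-rest
    detector.observe(count)
for count in [11, 10, 250]:                                  # then score the live stream
    if detector.observe(count):
        print(f"anomaly: {count} failed logins this minute")
```

The point is the shape of the problem: the comparison only works if the historical baseline and the live observations can be brought together quickly enough to act on.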
Unlocking real-time, actionable insight to stem cyber attacks therefore requires the ability to pull together data sets from across the organization, an ability that is often hampered as mountains of data amass in departmental silos and in different formats, making it tough to make sense of the past. But if that were the only problem, companies would be a lot safer than they are. The reality is that correlating real-time and historical data requires a combination of technologies that address fundamentally different analytics structures, and those technologies may be tough to reconcile in a single platform.
Opposing analytics structures in one solution: can it be done?
Batch (historical) data processing works on high-volume transaction data collected over a period of time, with the main goal of obtaining complete data sets; organizations have traditionally employed data warehouses for this. As data volumes have grown, they have more recently looked to Hadoop, which uses distributed storage and processing to handle that data faster and more efficiently. Real-time data processing, in contrast, involves continual input, processing, and output of data, with speed as the primary goal. When your objective involves simultaneous real-time and batch processing, Hadoop's usual MapReduce processing layer is insufficient, and additional technologies such as Storm and Spark are brought in. This architecture, referred to as Lambda, is possible, yet impractical for most companies: building such a system can take years and astronomical resources.
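To make the Lambda idea concrete, here is a minimal sketch of its query-time merge: a precomputed batch view over historical events is combined with a small, fast speed-layer view over the most recent ones. The event shape, field names, and counting logic are assumptions for illustration, not any particular product's API:

```python
from collections import Counter

def batch_view(historical_events):
    """Stand-in for a slow, periodic batch job (e.g. Hadoop/Spark) over data-at-rest."""
    return Counter(e["src_ip"] for e in historical_events)

def speed_view(recent_events):
    """Stand-in for a streaming job (e.g. Storm/Spark Streaming) over the last few minutes."""
    return Counter(e["src_ip"] for e in recent_events)

def query(ip, batch, speed):
    """Serving layer: answer 'how many events from this IP?' across both views."""
    return batch.get(ip, 0) + speed.get(ip, 0)

historical = [{"src_ip": "10.0.0.5"}] * 4_000    # data-at-rest
recent     = [{"src_ip": "10.0.0.5"}] * 37       # data-in-motion, not yet in the batch view

b, s = batch_view(historical), speed_view(recent)
print(query("10.0.0.5", b, s))                   # 4037: one answer spanning both layers
```

Keeping the batch jobs, the streaming jobs, and the serving layer consistent with one another is exactly the operational burden that makes Lambda so expensive to build and run in-house.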
When we designed the Logtrust architecture, our objective was to fill this significant market gap. But we had our work cut out for us: designing a comprehensive, enterprise-ready solution that can ingest, store, query, and analyze big data in real time, across silos and formats.
Reconstructing the anatomy of a past attack…to stem one happening right now
Achieving intelligent security through real-time analytics essentially requires the ability to quickly reconstruct network events, and then use that information not only to predict and prevent future attacks, but also to stop attacks that are occurring in the moment. This requires several capabilities, including the ability to:
- Store, replay and selectively slice and dice historical network sessions for ultra-fast analysis (see the sketch after this list)
- Perform real-time network topology event analysis, such as identifying live communications occurring between adversaries and hunting for dynamic event data changes, capabilities that require a real-time, authoritative view of topological event data
- Achieve the same query response time for data-in-motion that arrived in the last 10 seconds as for data-at-rest stored 10 years ago.
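As a rough sketch of the first capability, replaying a stored window of network sessions and slicing it by attribute might look something like the following. The session fields and record format are assumptions made for the example, not Logtrust's storage model:

```python
from datetime import datetime, timedelta

# Illustrative stored session records: timestamp, endpoints, bytes transferred.
sessions = [
    {"ts": datetime(2017, 3, 1, 9, 15), "src": "10.0.0.5", "dst": "203.0.113.7", "bytes": 12_000},
    {"ts": datetime(2017, 3, 1, 9, 16), "src": "10.0.0.9", "dst": "203.0.113.7", "bytes": 640},
    {"ts": datetime(2017, 3, 1, 9, 47), "src": "10.0.0.5", "dst": "198.51.100.2", "bytes": 88_000},
]

def replay(records, start, end, **filters):
    """Yield stored sessions in [start, end) that match every filter, in time order."""
    for r in sorted(records, key=lambda r: r["ts"]):
        if start <= r["ts"] < end and all(r.get(k) == v for k, v in filters.items()):
            yield r

# Slice one hour of traffic to a single destination for inspection.
window_start = datetime(2017, 3, 1, 9, 0)
for s in replay(sessions, window_start, window_start + timedelta(hours=1), dst="203.0.113.7"):
    print(s["ts"], s["src"], "->", s["dst"], s["bytes"], "bytes")
```

In practice the hard part is doing this interactively across years of sessions, which is where query response time over data-at-rest becomes the limiting factor.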
How Logtrust addresses this problem
Logtrust has solved the real-time security challenge with the ability to ingest, store and analyze massive, varied data sets at blistering speed. It does so with a patented Flat-Ultra-Low-Latency (FULL™) architecture that scales efficiently and linearly, and “always hot” storage that ensures fast query response times regardless of when or where the data was originally stored.
This affords some unique capabilities. For example, GoNet FPI (Fraud Prevention & Intelligence) leverages Logtrust to monitor 300 million online payments and investigate more than 3,000 cybercrimes every month, correlating and summarizing data across highly varied sources to mitigate its clients' risk. Logtrust allows them to build sophisticated real-time queries that detect anomalies as security threats occur, ingesting and analyzing millions of events per day from sources ranging from IoT sensors to Tweets. Logtrust's drag-and-drop UI and intuitive visual analytics also make it possible to detect deviations and spot hidden relationships, drilling down and across to quickly identify suspicious patterns. GoNet leverages this fast visualization and scalability to provide immediate insight and support, giving its customers an exceptional online experience with minimal manual work.
In sum, Logtrust enables you to identify and act on threats in real time, analyzing historical data alongside streaming data to hunt and counter hackers before they leave you with chalk outlines and “do not cross” tape. Miguel Ángel Rojo, CEO of GoNet FPI, explained the value of these capabilities: “Initially we were just looking for a more effective delivery solution for our fraud intelligence services. However, with Logtrust we have also accessed an extraordinary capacity for data integration and presentation, far surpassing our expectations.”