Complex Event Processing

In many areas of business, detecting threats and opportunities, and responding to them quickly, is of vital importance. In reality, such a signal is always derived from lower-level events collected from diverse sources, often across information domains and spread over time.


These events are collected, filtered, aggregated, and processed into complex events; the resulting business-level alerts are often decipherable only by interpreting several complex events in combination. Complex Event Processing (CEP) is a specialty dedicated to addressing the technical challenges of finding these meaningful business indicators in a sea of chaotic signals.
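
To make this pipeline concrete, here is a minimal Python sketch, with hypothetical event names, thresholds, and window sizes, of low-level events being filtered and aggregated into a single complex event:

```python
# A minimal sketch (hypothetical event names, thresholds, and window
# sizes) of the filter -> aggregate -> complex-event pipeline.
from dataclasses import dataclass

@dataclass
class Event:
    source: str
    kind: str
    value: float
    ts: float  # seconds since epoch

def to_complex_events(events, kind, threshold, window=60.0, min_count=3):
    """Emit a complex event when min_count filtered events fall in one window."""
    hits = sorted((e for e in events if e.kind == kind and e.value > threshold),
                  key=lambda e: e.ts)
    for first in hits:
        burst = [e for e in hits if 0 <= e.ts - first.ts <= window]
        if len(burst) >= min_count:
            yield {"complex_event": f"{kind}_burst",
                   "start": first.ts, "count": len(burst)}
            break  # one alert per burst is enough for this sketch

events = [Event("sensor-1", "vibration", 0.9, t) for t in (1.0, 12.0, 40.0, 300.0)]
print(list(to_complex_events(events, "vibration", threshold=0.5)))
# [{'complex_event': 'vibration_burst', 'start': 1.0, 'count': 3}]
```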

There are a number of off-the-shelf products that address the real-time processing requirements and computational functions specific to CEP domains. Some build in their own event query languages; others leave the user to program Machine Learning and data mining methods. To facilitate rapid response to the detection of critical events, these CEP products offer adaptors that integrate with business process management software, time-series databases, and financial trading systems.

CEP in the Era of Big Data

These early CEP products already incorporated a number of performance enhancement features such as parallelization and data caching. In the era of Big Data, however, data volume is measured in petabytes instead of terabytes. A Boeing 787 aircraft, for example, generates 40TB per hour of flight. Massive challenges arise in managing this data and making it “useful”.

With the sheer growth in data volume comes inherent complexity in the data itself. Data is ingested with or without schema; in textual, audio, video, imagery, and binary forms; sometimes multilingual and often encrypted; but almost always at real-time velocity.

The initial technology challenge in harnessing CEP is an infrastructural upgrade to address the data storage, integration, and analytic requirements; the end goal is to generate, from this ocean of data, meaningful business insights that translate into strategic business advantages.

A new generation of architecture and engineering disciplines must be introduced into the practice, among them parallel streaming, linearly scalable databases designed for time-series data, and in-memory computing. Fortunately, great open source frameworks such as Spark and Hadoop have emerged to address these challenges. Similarly, advances in Deep Learning Neural Networks and CEP technologies help drive ever-more sophisticated analyses.
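
As a small illustration of these streaming building blocks, the following is a minimal PySpark Structured Streaming sketch; the schema, source path, and window sizes are assumptions made for the example:

```python
# Minimal PySpark Structured Streaming sketch: sliding-window event counts.
# The schema, source path, and window sizes are assumptions for illustration.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window

spark = SparkSession.builder.appName("cep-ingest").getOrCreate()

events = (spark.readStream
          .format("json")
          .schema("device STRING, event_type STRING, ts TIMESTAMP")
          .load("/data/events/"))  # hypothetical landing directory

# Count events per device and type over 1-minute windows sliding every 30s;
# the watermark bounds state kept for late-arriving data.
counts = (events
          .withWatermark("ts", "10 minutes")
          .groupBy(window(col("ts"), "1 minute", "30 seconds"),
                   col("device"), col("event_type"))
          .count())

query = (counts.writeStream
         .outputMode("update")
         .format("console")
         .start())
query.awaitTermination()
```

Sliding-window aggregates of this kind are exactly the intermediate signals that downstream CEP rules consume.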

Rule Generation

The branch of Machine Learning most central to CEP is automated rule generation. These rules represent causal relations between observed events (e.g., noise, vibration) and the phenomena to be detected (e.g., a worn washer). However, these rules are often far too complex for human experts to author and maintain by hand.

Consider the sensor signals that could indicate pending trouble. The challenges include:

  • Unknown combinations of low-level events
  • Patterns masked by irregular temporal structure
  • Impossibly wide ranges of time scale – milliseconds to decades
  • Further confounding anomalies such as sporadic readings and outliers

Only Machine Learning techniques can overcome both the challenge of collecting, preparing, and fusing the massive amounts of data into useful feature sets, and the challenge of extracting event patterns that can be induced as readable rules for predicting a future recurrence of a suspect phenomenon.
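
As a simplified illustration of rule induction (not a depiction of any particular product's method), a shallow decision tree over engineered event features yields exactly this kind of human-readable rule; the feature names, labels, and data below are invented:

```python
# Simplified rule-induction sketch: a shallow decision tree over engineered
# event features produces human-readable rules. Feature names, labels, and
# data are invented for this example.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Hypothetical per-window features: vibration RMS, noise level, spike count.
X = rng.random((500, 3))
# Synthetic ground truth: "worn washer" when vibration and spikes are high.
y = ((X[:, 0] > 0.7) & (X[:, 2] > 0.5)).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# export_text renders the tree as nested if/else rules an expert can read.
print(export_text(tree, feature_names=["vib_rms", "noise_db", "spike_count"]))
```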

Deep Learning for CEP

Neural Networks have many key characteristics that make them an attractive, and typically the default, option for very complex modeling such as that found in CEP applications. Sensor data is voluminous and exhibits complex patterns (especially temporal patterns); both properties play to the strengths of Neural Networks. The variety of data representations makes feature engineering difficult for CEP, but Neural Networks automate much of it. Neural Networks also excel at cross-modality learning, matching the multiple modalities found in CEP.
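
As a minimal sketch of this temporal-modeling strength, the following PyTorch snippet classifies fixed-length windows of multivariate sensor data with an LSTM; the architecture and dimensions are illustrative assumptions, not a production design:

```python
# Minimal PyTorch sketch: an LSTM classifying fixed-length windows of
# multivariate sensor data. Architecture and dimensions are illustrative.
import torch
import torch.nn as nn

class SensorLSTM(nn.Module):
    def __init__(self, n_features=8, hidden=64, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):            # x: (batch, time, features)
        _, (h, _) = self.lstm(x)     # h: final hidden state (1, batch, hidden)
        return self.head(h[-1])      # one set of class logits per window

model = SensorLSTM()
windows = torch.randn(32, 200, 8)    # 32 windows, 200 timesteps, 8 sensors
print(model(windows).shape)          # torch.Size([32, 2])
```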

There are numerous additional strengths in the Deep Learning approach:

  • Full expressiveness using non-linear transformations
  • Robustness to unintended feature correlations
  • Allows extraction of learned features
  • Training can be stopped at any time and still yield a usable model
  • Results improve with more data
  • World-class pattern recognition capabilities

However, the complexity and risks associated with a Deep Learning implementation should be weighed carefully against its modeling power. Consider some well-known challenges:

  • Slow to train – many training iterations and hyperparameters translate to significant computing time
  • Black Box Paradigm – subject matter experts cannot make sense of the connections to improve results
  • Practitioners generally resort to specialized (often custom) hardware to achieve desired performance

In certain cases, the performance and cost associated with Deep Learning Neural Networks motivate alternative approaches. One simpler technique BigR.io often champions is a specialty CEP engine optimized for flowing sensor data.

Specialty CEP Engine

In this approach, we treat the CEP rule-extraction challenge not as a generalized Machine Learning problem, but rather as one characterized by some known aspects (a toy sketch of the matching logic follows the list):

  • Voluminous and flowing data
  • The input is one or more event traces
  • Temporal patterns play a prominent role alongside event types and attributes
  • The problem decomposes into time-window, sequence, and conjunctive relationships
  • Event sequences and their time relationships form large-grained composite events
  • Conjunctions of these composite events formulate describable rules for predicting suspect phenomena
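
The following toy Python sketch (with hypothetical event labels and window sizes) illustrates the time-window, sequence, and conjunction matching described above:

```python
# Toy sketch of the decomposition above: a time-windowed sequence matcher
# plus a conjunction check. Event labels and window sizes are hypothetical.
from collections import deque

def match_sequence(trace, first, second, window=5.0):
    """Yield (t1, t2) where `first` is followed by `second` within `window`."""
    pending = deque()
    for t, etype in trace:                 # trace: time-ordered (ts, type)
        while pending and t - pending[0] > window:
            pending.popleft()              # expire firsts outside the window
        if etype == second and pending:
            yield (pending[0], t)          # composite event: first -> second
        if etype == first:
            pending.append(t)

trace = [(0.1, "vibration"), (2.0, "noise"), (9.0, "temp_high"), (9.5, "noise")]
composites = list(match_sequence(trace, "vibration", "noise"))

# Conjunction: the rule fires only if a temp_high event also appears.
alert = bool(composites) and any(etype == "temp_high" for _, etype in trace)
print(composites, alert)  # [(0.1, 2.0)] True
```

A production engine would layer attribute predicates, indexing, and parallel trace processing on top of this skeleton, but the window-sequence-conjunction decomposition is the core idea.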

This Specialty CEP engine represents a practical tradeoff between expressiveness and performance. Where a comparable CEP study may require days to process, this specialized engine may complete its task in under an hour. Parallelization based on in-memory technologies such as Apache Spark may soon enable real-time or near-real-time CEP analysis. Unlike with a Neural Network, a subject matter expert can make sense of this engine's results and may be able to optimize the rules manually through iteration.

BigR.io’s team of highly trained specialists is well equipped to take on these implementation challenges. We select from a host of available platforms, including Apache Spark, Nvidia CUDA, and HP Distributed Mesh Computing. Often, the intuitions that come from experience can expedite the completion of training by an order of magnitude.
