CEP and SQL: The Top Five Myths

As everybody reading this blog knows by now, the Coral8 Engine is programmed via a SQL-based language called CCL. I know of at least two other companies that use a SQL-based language for CEP (or ESP, or CESP -- choose the acronym you like best; I consider them all to be largely one and the same, but that's not the topic for this post). Yet there is an amazing degree of misunderstanding over what it means to use a SQL-based language for CEP.

Some of this misunderstanding stems from marketing done by the vendors who use non-SQL-based languages for CEP, but some of it reflects genuine confusion. A SQL-based language is not the same thing as SQL! SQL would indeed be unsuitable for CEP, but CCL is a language designed specifically for CEP. So let's examine the top five myths about using a SQL-based language for CEP.

Myth 1: the request-response nature of SQL makes it unsuitable for low-latency event processing.

It's a very good point, but CCL is not a request-response language. CCL (Continuous Computation Language) uses a completely different processing model in which queries subscribe to input streams, run continuously, and publish results into output streams. This is what gives the Coral8 Engine its amazingly low latency (about one millisecond).
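To make the difference from request-response concrete, here is a minimal Python sketch of the subscribe / run continuously / publish model. It is not CCL and not how the Coral8 Engine is implemented; the names Stream and continuous_filter are made up for illustration.

```python
from typing import Callable, Dict, List

Event = Dict[str, float]


class Stream:
    """A hypothetical stream that pushes each published event to its subscribers."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, event: Event) -> None:
        for callback in self._subscribers:
            callback(event)


def continuous_filter(source: Stream, sink: Stream,
                      predicate: Callable[[Event], bool]) -> None:
    """Register a standing query: each matching input event is forwarded as it arrives."""
    def on_event(event: Event) -> None:
        if predicate(event):
            sink.publish(event)

    source.subscribe(on_event)


trades = Stream()
big_trades = Stream()

# The query is registered once and then runs for every event; nobody "asks" for results.
continuous_filter(trades, big_trades, lambda e: e["price"] > 100.0)

results: List[Event] = []
big_trades.subscribe(results.append)

for price in (99.0, 101.5, 250.0):
    trades.publish({"price": price})

print(results)  # [{'price': 101.5}, {'price': 250.0}]
```

The point of the sketch is that the query is set up before any data exists, and results appear on the output stream as a side effect of events arriving, with no request in between.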

Myth 2: SQL is good for handling sets, but events do not come in sets.

It's true that events do not come in sets. Events come in streams. A stream may be thought of as an ordered set of events of infinite size. But storing and operating on infinite sets is inconvenient, so CCL includes a feature called "windows". A window creates a bounded set. For example, the window "KEEP 10 SECONDS" always keeps the last ten seconds' worth of data. The window "KEEP 100 ROWS" always keeps the last 100 events, and the window "KEEP 10 ROWS KEEP 10 SECONDS" keeps up to 10 events from the past ten seconds. More interesting windows are possible: for example, the window "KEEP 10 LARGEST ROWS BY PRICE" keeps the ten largest events by price. Once windows are defined, they may be used in SQL-like queries just like tables!
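For readers who think better in code, here is a rough Python sketch of what a row-based, time-based, or combined window does. It only illustrates the idea; the Window class, its parameters, and the choice to evict an event that is exactly max_age old are my assumptions, not CCL semantics. It also assumes events arrive in timestamp order.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class Window:
    """Bounded view over a stream: keep at most max_rows events and/or max_age seconds."""

    def __init__(self, max_rows: Optional[int] = None,
                 max_age: Optional[float] = None) -> None:
        self.max_rows = max_rows
        self.max_age = max_age
        self._events: Deque[Tuple[float, float]] = deque()  # (timestamp, price)

    def insert(self, timestamp: float, price: float) -> None:
        self._events.append((timestamp, price))
        # Row-based policy: drop the oldest events beyond the row limit.
        while self.max_rows is not None and len(self._events) > self.max_rows:
            self._events.popleft()
        # Time-based policy: drop events that are max_age or more older than the newest.
        while (self.max_age is not None and self._events
               and self._events[0][0] <= timestamp - self.max_age):
            self._events.popleft()

    def prices(self) -> List[float]:
        return [price for _, price in self._events]


# Roughly in the spirit of "KEEP 3 ROWS KEEP 10 SECONDS".
window = Window(max_rows=3, max_age=10.0)
for ts, price in [(0.0, 10.0), (4.0, 11.0), (8.0, 12.0), (9.0, 13.0), (19.0, 14.0)]:
    window.insert(ts, price)

print(window.prices())  # [14.0]
```

After the last insert the row limit leaves the events at 8, 9, and 19 seconds, and the time limit then removes the first two, so the window is always a small, finite set that an ordinary aggregate (sum, average, max) can run over.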

Myth 3: SQL-based systems do not include the notion of time.

Indeed, SQL does not have a notion of time, but CCL does. Every CCL message has a timestamp, and CCL has a number of features for dealing with time, such as built-in time-based windows (KEEP 10 SECONDS), output control (OUTPUT EVERY 1 MINUTE), and so on. Time is central to CCL!

Myth 4: SQL-based systems do not handle delayed messages, out-of-order messages, revisions, etc.

Databases do not do this, but the Coral8 CEP engine certainly does. It has built-in configuration parameters for handling delayed and out-of-order messages. Coral8 windows are powerful enough to handle revisions. Data stored in a window can be easily removed or replaced when a revision arrives.
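One common way to cope with out-of-order messages is to hold events briefly and release them in timestamp order once a maximum delay has passed. The Python sketch below shows that general technique only. I am not claiming this is how the Coral8 engine does it, and ReorderBuffer and max_delay are hypothetical names rather than actual Coral8 configuration parameters.

```python
import heapq
from typing import List, Tuple


class ReorderBuffer:
    """Hold events for up to max_delay seconds so late arrivals can be re-sorted."""

    def __init__(self, max_delay: float) -> None:
        self.max_delay = max_delay
        self._heap: List[Tuple[float, str]] = []  # min-heap ordered by timestamp
        self._latest = float("-inf")              # highest timestamp seen so far

    def push(self, timestamp: float, payload: str) -> List[Tuple[float, str]]:
        """Add one event; return the events that are now safe to release, in order."""
        if timestamp < self._latest - self.max_delay:
            return []  # arrived too late; this sketch simply drops it
        heapq.heappush(self._heap, (timestamp, payload))
        self._latest = max(self._latest, timestamp)
        released = []
        # Anything at least max_delay older than the newest event can no longer be overtaken.
        while self._heap and self._heap[0][0] <= self._latest - self.max_delay:
            released.append(heapq.heappop(self._heap))
        return released


buffer = ReorderBuffer(max_delay=2.0)
ordered: List[Tuple[float, str]] = []
arrivals = [(1.0, "a"), (3.0, "c"), (2.0, "b"), (5.0, "d"), (1.5, "late"), (8.0, "e")]
for ts, name in arrivals:
    ordered.extend(buffer.push(ts, name))

print(ordered)  # [(1.0, 'a'), (2.0, 'b'), (3.0, 'c'), (5.0, 'd')]
```

Event "b" arrives after "c" but is released before it, while "late" shows up more than two seconds behind the newest event and is discarded. The trade-off is the usual one: a larger max_delay tolerates more disorder at the cost of added latency.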

Myth 5: it's hard to code complex event patterns with SQL-based systems.

Yes, SQL makes it hard to write a statement that checks whether a certain sequence of events, say A, then B, then C or D, then E and F, happens within a certain period of time, such as ten minutes. But CCL has a powerful built-in event pattern matching engine, making it easy to write clauses such as MATCHING [10 MINUTES: A, B, C || D, E && F]. The syntax even supports negative events (events that do NOT occur), using the !A notation. Patterns may be arbitrarily nested, too!
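To show why this is painful to write by hand, here is a small Python sketch that matches only the simplest part of such a pattern: an ordered sequence with alternatives, such as A, then B, then C or D, inside a time limit. It is a toy under my own assumptions, not the CCL matching engine; match_sequence is a hypothetical helper, and it does not handle "&&", negative events, or nesting.

```python
from typing import List, Optional, Sequence, Set, Tuple


def match_sequence(
    events: Sequence[Tuple[float, str]],
    steps: Sequence[Set[str]],
    within: float,
) -> Optional[List[Tuple[float, str]]]:
    """Return the first run of events satisfying every step in order within `within` seconds.

    events must be sorted by timestamp and steps must be non-empty. Each step is a
    set of acceptable event types; unrelated events between steps are skipped.
    """
    for start, (t0, kind0) in enumerate(events):
        if kind0 not in steps[0]:
            continue
        matched = [(t0, kind0)]
        for t, kind in events[start + 1:]:
            # Stop when the pattern is complete or the time limit has run out.
            if len(matched) == len(steps) or t - t0 > within:
                break
            if kind in steps[len(matched)]:
                matched.append((t, kind))
        if len(matched) == len(steps):
            return matched
    return None


pattern = [{"A"}, {"B"}, {"C", "D"}]  # A, then B, then C or D
events = [(0.0, "A"), (60.0, "X"), (120.0, "B"), (300.0, "D"),
          (900.0, "A"), (1000.0, "B"), (1700.0, "C")]

print(match_sequence(events, pattern, within=600.0))
# [(0.0, 'A'), (120.0, 'B'), (300.0, 'D')]
print(match_sequence(events[4:], pattern, within=600.0))
# None
```

The second call returns None because the C arrives 800 seconds after its A, outside the ten-minute limit. Even this stripped-down version takes a screenful of code, which is the argument for having the pattern as a single clause in the language.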

As we've seen, CCL is significantly more powerful than SQL. For the mathematically inclined, CCL is Turing-complete, and SQL, at least in its classic form without recursive queries or procedural extensions, is not (don't worry if you don't know what this means). At the same time, the fact that CCL is based on SQL makes it very easy to learn. So you get the best of both worlds: a powerful language designed for CEP that uses the syntax and concepts of the language you already know! Sounds like a good deal to me.