On Premature Optimization

Here is a true story from today. A friend and I went to a local coffee shop for pastries and coffee. "We are going to have a lemon bar...", I start, only to be interrupted by a friendly barista: "Aren't you going to have drinks, too? Then order them first. It'll be faster!" OK, we order coffee, he writes our order on the coffee cups, then we get the pastries, and then the same guy makes our coffee. Of course, since he did everything serially, there is no way this was any "faster"!

Now, this is a great coffee shop, and the workers usually know what they are doing. Normally, there are two people filling orders, and in that case starting the drinks first would indeed be faster, because one person can make the coffee while the other gets the pastries. But with only one worker present, the guy's intuition was wrong. The algorithm that was great for two "processors" was not so great for one.
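To put rough numbers on this, here is a toy model in Python. It is only a sketch: the function name and the step durations (in minutes) are made up for illustration, and it assumes the coffee cannot be started until the drink order has been taken.

```python
def completion_time(workers, drinks_first,
                    take_order=1.0, fetch_pastries=2.0, make_coffee=3.0):
    """Minutes until the customer holds both pastries and coffee.

    All durations are hypothetical; only the comparison matters.
    """
    if workers < 1:
        raise ValueError("need at least one worker")

    if workers == 1:
        # One worker does every step back to back, so the order of the
        # steps cannot change the total.
        return take_order + fetch_pastries + make_coffee

    if drinks_first:
        # Once the drink order is taken, one worker makes the coffee
        # while another fetches the pastries.
        return take_order + max(fetch_pastries, make_coffee)

    # Pastries first: the coffee cannot start until the drink order is
    # taken, so the extra worker sits idle and nothing overlaps.
    return fetch_pastries + take_order + make_coffee


if __name__ == "__main__":
    for workers in (1, 2):
        for drinks_first in (True, False):
            minutes = completion_time(workers, drinks_first)
            print(f"workers={workers} drinks_first={drinks_first}: {minutes} min")
```

With these made-up numbers, ordering drinks first saves time only when there are two workers. With one worker, both orderings take exactly the same time.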

What's the moral of the story? We all like optimizations, but when faced with new, unfamiliar circumstances, our intuition often fails us. And what does it have to do with CEP? CEP is still fairly new to most people who build CEP applications. I wrote in my previous post that one of the first questions we hear from people is "How fast is your engine?" Now, you will not be surprised that one of the more common follow-up questions is "Will I make it faster if I do X?" Often, this is before the requirements are even firmed up, let alone before the application is coded and profiled!

Now, CEP engines are complex beasts, and they contain numerous internal optimizations. The Coral8 Engine, for example, contains a large number of them. Some, such as pushing filters ahead of joins, are familiar to SQL developers. Others, however, are completely new: automatic data indexing, optimized memory management to conserve space, sophisticated data caching, a unique threading model to limit context switches and take advantage of multiple CPU cores, a very lightweight messaging layer, and many others. We take great care to optimize throughput, latency, and resource consumption.

The flip side of this, however, is that it may not be immediately obvious to someone how fast their first Coral8 application will run on our engine. Therefore, our advice to all Coral8 developers is very simple: don't worry about performance at first. Write your application, or a representative portion of it, and then run it. The Coral8 Engine will tell you how much CPU and memory it takes, what the throughput and latency are, and will help you pinpoint the problems. Then, if necessary, you can start your optimizations. But chances are, you won't have to: the vast majority of our customers are surprised at how fast their application is once they write it.

Anyway, I know I am not saying anything new here. Don Knuth said over three decades ago that premature optimization is the root of all evil. This is especially true when you are working in a new and unfamiliar environment. The SQL-based language used by Coral8 and other CEP engines looks familiar, and indeed the similarity with SQL makes programming simpler. But it helps to remember that the implementation of any CEP engine is very different from that of a relational database, and is already highly optimized for CEP applications to begin with. So it's best to build the application first, measure the performance, and only then start worrying about optimizing it, if necessary.

Mark Tsimelzon
President & CTO, Coral8