Thursday, August 2, 2007

BAM real-time aggregation vs. scheduled aggregation

BAM supports two types of aggregation: real-time aggregation (RTA) and scheduled aggregation. Some customers have asked how the two differ and when to choose which.

The biggest difference is the underlying storage. An RTA is stored in a SQL table, and the aggregation is updated and maintained by a SQL trigger. The event import and the aggregation update complete in the same transaction, so the data latency is negligible and the aggregation is almost real-time. A scheduled aggregation is stored in OLAP cubes, which need to be processed periodically by the cubing DTS package (btw, the OLAP cubes and the cubing DTS package are all created dynamically by the BAM Manager command-line utility, bm.exe).
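To make the trigger idea concrete, here is a minimal sketch of a trigger-maintained aggregation table. It uses Python's built-in sqlite3 module rather than SQL Server, and the table, column, and trigger names (orders, orders_rta, orders_rta_update) are made up for illustration; this is not the schema or trigger that BAM actually generates.

```python
import sqlite3


def build_rta_demo():
    """Create an in-memory database with a trigger-maintained aggregate.

    All names here are hypothetical; BAM's real tables and triggers differ.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Raw event (instance) data.
        CREATE TABLE orders (
            id     INTEGER PRIMARY KEY,
            region TEXT NOT NULL,
            amount REAL NOT NULL
        );

        -- The aggregation lives in an ordinary table.
        CREATE TABLE orders_rta (
            region       TEXT PRIMARY KEY,
            order_count  INTEGER NOT NULL,
            total_amount REAL NOT NULL
        );

        -- The trigger runs inside the inserting transaction, so the
        -- aggregate is never behind the committed event data.
        CREATE TRIGGER orders_rta_update AFTER INSERT ON orders
        BEGIN
            INSERT OR IGNORE INTO orders_rta (region, order_count, total_amount)
            VALUES (NEW.region, 0, 0);
            UPDATE orders_rta
               SET order_count  = order_count + 1,
                   total_amount = total_amount + NEW.amount
             WHERE region = NEW.region;
        END;
    """)
    return conn


def read_rta(conn, region):
    """Return (order_count, total_amount) for a region, or None if absent."""
    return conn.execute(
        "SELECT order_count, total_amount FROM orders_rta WHERE region = ?",
        (region,),
    ).fetchone()


if __name__ == "__main__":
    conn = build_rta_demo()
    conn.execute("INSERT INTO orders (region, amount) VALUES ('EU', 100)")
    conn.commit()
    print(read_rta(conn, "EU"))
```

Because the trigger work is part of the same transaction as the insert, rolling back an insert also rolls back its effect on the aggregate. The same coupling is why every extra dimension or measure adds cost to each event write.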

The biggest advantages of RTA are its near-zero latency and low maintenance. Once your new instance data is in the BAM database, the aggregation instantly reflects it. And there's no need to run any DTS package; the SQL trigger takes care of everything for you automatically. Compared to scheduled aggregation, though, RTA doesn't support as many dimensions and measures, and it can only keep a relatively short period of data if query performance is to stay satisfactory. Scheduled aggregation, on the other hand, has little problem handling years and years of enterprise data, and it can support many more dimensions and measures. Its advantages don't come without a price: it needs an Analysis Services license, a star-schema database, and scheduling of the cubing DTS package. And new instance data won't make it into the cubes until the next cubing DTS run.
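The latency trade-off of the scheduled approach can be shown with a toy model in plain Python. The class below is purely hypothetical and has nothing to do with how Analysis Services or the cubing DTS package work internally; it only illustrates that queries read a snapshot, and the snapshot changes only when the periodic processing step runs.

```python
from collections import defaultdict


class ScheduledAggregation:
    """Toy model: events accumulate, queries read the last processed snapshot."""

    def __init__(self):
        self.events = []    # raw instance data, appended as it arrives
        self.snapshot = {}  # stand-in for the processed cube

    def record(self, region, amount):
        # New data lands in storage immediately but is not yet aggregated.
        self.events.append((region, amount))

    def process(self):
        # Stand-in for the periodic cubing run: rebuild totals from all events.
        totals = defaultdict(float)
        for region, amount in self.events:
            totals[region] += amount
        self.snapshot = dict(totals)

    def total(self, region):
        # Queries only ever see what the last process() call produced.
        return self.snapshot.get(region, 0.0)


if __name__ == "__main__":
    agg = ScheduledAggregation()
    agg.record("EU", 100)
    print(agg.total("EU"))  # still the old snapshot: nothing processed yet
    agg.process()
    print(agg.total("EU"))  # now includes the recorded event
```

The upside of this shape is that the expensive aggregation work happens off the write path, which is what lets the scheduled approach scale to large volumes and richer cubes.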

Once you understand the pros and cons of both aggregations, their usage scenarios are easy to see: use RTA for time-critical tracking of a small set of Key Performance Indicators (KPIs), and use scheduled aggregation for historical/trend analysis, which normally involves a large volume of data and requires complex cubes.

Source: BAM, BizTalk and Beyond
