Integration Services 2008 has a new Lookup transformation that includes a caching option that allows lookups to execute much faster than before. There are three modes of caching and a new Cache Connection Manager. Let’s take a look…
When adding a Lookup transformation to a data flow task you are requesting just what it says, a lookup. Based on a matching column, a lookup table is used to retrieve other data columns that the data flow can use. For instance, based on an Employee Id we can look up an email address based on that id and add that to the data flow. There are different flows for Match and NoMatch.
If we are dealing with a large amount of data, the lookup may be forced to do repetitive database queries creating a potential bottleneck. Now in SSIS 2008 there is a Caching Mode that defaults to Full. This means that the lookup query is executed before the Lookup transformation itself and the results put into a cache in memory. Then SSIS will use the cache to satisfy the actual lookups giving a significant performance gain. Of course, there must be enough memory available to store the results of the lookup query and it must be recognized that, as with all caching options, the data in the cache may become stale as updates are made to the underlying data.
To optimize memory usage, there is another Caching Mode, namely Partial. This mode will fill the cache based on the matches only so will save memory usage but will issue more database calls than Full mode.
The third mode is None which implies no caching. Well, almost no caching. It will at least cache the last lookup in case there are repetitive lookups in order.
A further new option is an extension of the Full Cache mode which allows you to persist the cache by using a Cache Connection Manager. This option uses a .caw file that is written to disk and then read into memory. This gives improved performance with the added benefit of being able to share the cache between multiple tasks and even other packages.
Just how much performance gain you will get out of the new Lookup transformation is all down to detailed testing using tools such as the SQL Profiler to trace and verify, but we now have some viable options to limit the Lookup bottleneck.
SSIS 2008 Lookup Transformation:
SSIS – Lookup Cache Modes – Full, Partial, None: