Why is the SSIS Lookup Component Case Sensitive (or isn’t it)?

Why is the SSIS Lookup Component Case Sensitive (or isn’t it)?

Many websites and blog posts will tell you that the SSIS Lookup Component is Case Sensitive (CS). And while they are correct that there isn’t a direct property to change the case sensitivity of the component, it can however be used in a non-case sensitive way.

How to make it Case Insensitive (CI)

For those of you looking for a quick answer: by setting the lookup component’s cache mode to Partial Cache or No Cache you can make the lookup component use the case sensitivity setting of the database.

160830_1
Check the property window of your database if you want to know the case sensitivity setting.

Why does the cache mode affect the case sensitivity (CS)?

Still reading? Great! Consider yourself part of the small group of readers who isn’t content with the how, but also want to know the why. The reason SSIS defaults to CS is found within the internal engine of SSIS. More specifically, in the way SSIS creates Cache files. When the lookup component is set to Full Cache mode, SSIS retrieves the entire lookup table, stores it in-memory and does the comparisons there. When doing the comparisons in-memory, SSIS uses the .NET string comparison function which is case (and space) sensitive by default. With the Lookup set to Partial or No Cache SSIS hands the comparison to the database engine, which uses its local DB settings to do the comparison. So if your lookup DB’s collation is set to case insensitive(CI) and you use Partial or No Cache, the lookup is case insensitive.

160830_2

In SSIS the default setting for a Lookup Component’s Cache mode is Full Cache. And in most cases, this is the best setting. But it is just a setting, and can – and sometimes should – be changed

Any drawbacks?

Now that you know how and why, there is only one question remaining. Should I…

Disabling the cache can have a serious impact on the performance of your package. Especially if you have a lot of records entering your lookup. So even though you may want to do a case insensitive lookup, disabling the cache might not be the best solution. Should performance reasons demand a Full Cache mode (they often will), you can always use the old but tried workaround of using the UPPER or LOWER function on both the input and the reference query to fake a “Case insensitive” lookup.

Conclusion

While it is possible to make the lookup component case insensitive, it often won’t be a practical solution. But at least now you know why it is case sensitive by default. For more information on the different cache settings, have a look at my previous post: Overlooked Properties in SSIS: Lookup Component Cache Mode.