How do you set up SQL queries in your packages? Let's say you have an Execute SQL task and you need to use a large table join to query the source table. There are several ways you can store the queries, and you can even combine some of them. I'll list them here, but I may forget one (or one combination); don't take this as an exclusive list.
[ Cut straight to the key news for technology development and IT management with our once-a-day summary of the top tech news. Subscribe to the InfoWorld Daily newsletter. ]
You can store the query as:
- Stored procedure
- Package variable
- .SQL file on the file system
- Directly in the Execute SQL task as a query
- Directly in the Execute SQL task as a stored procedure
- Directly in the Execute SQL task as a view
- View inside a stored procedure
- Stored procedure in a .sql file on the file system
- Stored procedure in a variable
- View in a variable
- View in a .sql file on the file system
- The variable can also be stored in a config file
I realize that some of these are splitting hairs, but there are a lot of choices to make. Let's assume this query will have 12 joins and return 70 columns, most of which will have a CASE statement or function or other manipulation applied to them. This query could easily be five pages long depending on how you format it. You definitely have to pick the storage method that will be the easiest to maintain.
The first thing you should do is decide how the code will live in relation to the database itself. Should it live inside the DB or the package? What are the factors involved in where you put the code? Well, you have to consider network traffic. If the query's huge, then do you want to push it across several calls on the network? How often will the package run for this to even be an issue? Will this package be rolled out to several servers, and if so, does it need to run with different parameters in the query? How often will the query need to be changed, and what are your change procedures? What's the security like in your database? Will you likely mess with the SP?
Those are a few factors involved in deciding where the query should live. Quite often, the choice is made to use an SP. This is perfectly valid, and some might even say it's a best practice itself. In fact, I often prefer this myself. When using an SP, it's easy for the database admins to support the query without disturbing the package. Since SSIS packages are compiled, we can't just change them on the fly. They have to be loaded into Visual Studio, then recompiled and redeployed. It can be a pain; at the very least, it's a process.