Initially, you have to decide how to slice it. Do you want to do it by table activity (putting the most updated tables together), by relationship (PK/FK tables go together), or by size (big tables get their own publication)? You must make these calls when you do replication, as well as when you write an ETL process.
This post is the short Web version of the SSIS architecture discussion. If you want to see the full analysis on these points, you can download the PDF.
Here are the general guidelines we landed on, with some more detailed discussion on each.
Put your tables in a single package when:
- You have a lot of interdependencies between tables.
- Network, data, and servers are really stable (won't need to restart often).
- You may need to change package-level info frequently.
- You have non-SQL dependencies.
That's mostly it for single package design, but if I left out any major details, send it to me and I'll add it to the list.
Now on to multiple package design -- here are the factors that could lead you down this road:
- Network, data, servers aren't reliable and process is likely to need to be restarted often.
- Tables are more autonomous (not deciding factor on its own).
- You aren't doing any logging, or logging is set in stone and won't change.
Despite all of this discussion, you'll be glad to know that you're not limited to two options. It's more than just a decision to put your tables all together or by themselves -- that's the good news. The bad news is that your choice on how to architect your package got a lot harder. Here are, more or less, your full list of alternatives:
- You can use a single package if it's small and stable enough.
- You can put all tables in their own package.
- You can use a master package.
- You can group tables in packages.
There's a lot more discussion on this topic that would have made this post too long. If you're interested in the finer points, I urge you to download the PDF to understand the full implication of what I'm saying. Just remember to always have a plan and know what you want to get out of your process. That way, you can evaluate your process against your blueprint and see if you've accomplished your goals.
Feel free to email me with questions or comments.
Also, here's the video of the presentation if you'd like to hear the dialogue in real time.
This article, "SSIS architecture: Tips for package design," was originally published at InfoWorld.com. Read more of Sean McCown's Database Underground blog and follow the latest developments in databases and data management at InfoWorld.com.