Tuesday, November 22, 2011

SSIS Project

My interests are more in more in the Business Intelligence arena, but I am surrounded by conditions where I am working more with interfacing data between systems.  Here's a recent SSIS package that I have been working on that has been fun:

We've been provided a "datamart" by our software vendor with columns representing various configurable measures with one row of data representing information for a site business date combination.

For the purposes of integration, each column of data needs to become the source of data for one row of an entry to our accounting system. This can be accomplished easily using UNPIVOT in T-SQL which will change my columns into the desired rows.

This "datamart" table is WIDE, like 80 columns wide and not all of them populated.  Even nicer, that depending upon the apparent whim of the individual who added these columns, the data types are not consistent.

I could go through the exercise of defining the mapping, create a source to target map and extract the data by making changes to my query, a view, stored procedure to provide the data. But, I am LAZY.  (Not really, I just don't like revisiting things and we are a VERY resource constrained environment.)

Instead, I created a configuration table to define the source and eventual destination for the provided datamart information. Since we are dealing with accounting, the same datamart amount could be used more than once; credit to one account/debit to another.  My configuration table contains both the source register name and a destination description.

Here's where it get's FUN, let's put it all to use in SSIS

We are going to use "SQL command from variable" in our data flow task.  We have to create a variable to hold a starter/dummy UNPIVOT query statement.  This query has to have enough in it to help us define columns.  I created a variable called "Sales_Query" as a string type and package level scope and I used the following starter statement.

Now, we create a little dynamic SQL to use the configuration information to construct my real UNPIVOT query for me.  This will produce a query that selects the columns needed.  Since the data types must all be the same for an UNPIVOT to be successful, converts the data to be consistently numeric instead of the mix of money, float, and numeric.

I can now use my dynamic SQL statement as part of an execute SQL task in my control flow pasting my statement created in SQL Management studio into the SQL Statement.

I have assigned my result to replace the value in Sales_Query variable


I can now create a Data Flow Task to extract the needed information from the sales data mart.

When my package is run, the value of my Sales_Query variable is generated dynamically and I extract the needed information in a format that I can use later in my SSIS package.  I can take my rows of information, join back into my configuration table to assign my needed general ledger accounts for the debits and credits to create the needed rows for a journal entry.

One of the side benefits of being involved with the integration portion with a passion for the BI side, I can get a good handle on both sides of the business data, and maintain tables for integration data to hopefully aid in the eventual measure of how sales information ties to general ledger contribution.

No comments:

Blog Archive