Sunday, February 26, 2012

Dynamic transformation to upload flat files into database

Hi,

My scenario:

I have 4 different flat files types each having different no. of column, order of columns etc. I want to upload all the 4 types into the same destination table in the SQL database. Before uploading I need to apply transformation to each column in the flat files. The transformations could be like

1) Multipying the source column by 100

2) Put an if condition for 2 source columns and then select one column to be copied into the destination.

I have the flat files schema with me and also all the transformations that are required.

Question:

Can SSIS provide me with a component that can read the flat file schema and the transformations from the database and apply them to the source data and then upload it to the constant destination table? Can derived column transformation be provided with the input columns list and the transformation to be done on each dynamically?

Why I want this way?

In future there can be an addition of extra flat file formats and we want to keep the changes to the SSIS packages to he mininum. Just entereing the addiional schema and transformation details in the database should run the package on the new flat file successfully.

Thanks for your time.

Regards,

$wapnil

spattewar wrote:

I have 4 different flat files types each having different no. of column, order of columns etc.

There is your problem. I doubt you can have a Flat file connection manager to mutate to accommodate a different file structure each time. What I would do is to have 4 single dataflows to load the 4 files into a common destination (staging table/rawfile,etc) and then an extra dataflow to perform the transformations.

|||Rafael is right. You cannot have a Flat File Connection Manager which can accommodate file of different formats. You can use different flat file connection managers to process the files, and apply transformations and then save to a common destination.|||

Okay. understood that a single flat file connention can have only one format attached to it.

Will this be a possiblity?

Create a template package having a single data flow that connects to a file, reads the data into memory, applies transformation and then load that data into a single destination table. Then programmitically load the package and change the objects as required in the code?

Thanks for your time.

Regards,

$wapnil

|||

spattewar wrote:

Okay. understood that a single flat file connention can have only one format attached to it.

Will this be a possiblity?

Create a template package having a single data flow that connects to a file, reads the data into memory, applies transformation and then load that data into a single destination table. Then programmitically load the package and change the objects as required in the code?

Thanks for your time.

Regards,

$wapnil

If you're going to do that, then you might as well write the whole thing in code. The trouble in using a template is the metadata coming in and out of each component in the data flow. They all have to match up and are generally "fixed" at development time. So even if you progmatically change the source, a downstream component may break because the upstream metadata changed.|||

Ok. lets say that mapping a file connection manager to four different files is not possible but then is this possible?

Create a package that transforms a flat file and upload the data into the database. But the transformation can be changed on the fly or the transformation can be picked up from the database. for e.g.

At design time we inserted the transformation lets say using a derived column transformation task. The transformation multiplies all the column in the input by 1000 and gives an output which is further inserted into the destination table. Now if I want to change the transformation from "multiplying by 1000" to "multiplying by 10" on the fly, then is it possible.

Can we make the derived column task to pick up the transformation from the database and create the output.

Thanks for your time.

$wapnil

|||

spattewar wrote:

Ok. lets say that mapping a file connection manager to four different files is not possible but then is this possible?

Create a package that transforms a flat file and upload the data into the database. But the transformation can be changed on the fly or the transformation can be picked up from the database. for e.g.

At design time we inserted the transformation lets say using a derived column transformation task. The transformation multiplies all the column in the input by 1000 and gives an output which is further inserted into the destination table. Now if I want to change the transformation from "multiplying by 1000" to "multiplying by 10" on the fly, then is it possible.

Can we make the derived column task to pick up the transformation from the database and create the output.

Thanks for your time.

$wapnil

Yes... You can use package configurations to achieve this. For that matter, you can store the values in a table and use an Execute SQL task to load a variable with the "multiplication" value, or whatever.

Then, in your derived column, you simply say "[column1] * @.[User::MyVar]". At run time, the variable "MyVar" will pick up the appropriate value. You can have as many as these as you want.

What you cannot do is to alter the number of columns coming into the flat file source.|||

this is great.

sorry to bug you with this, but if I have a transformation which is like

If [file1.column1] = "Good" Then [file2.column1 = file1.column1]

Else [file2.column1 = file1.column2]

End If

How to accomodate this?

Thanks for your time.

$wapnil

|||

spattewar wrote:

this is great.

sorry to bug you with this, but if I have a transformation which is like

If [file1.column1] = "Good" Then [file2.column1 = file1.column1]

Else [file2.column1 = file1.column2]

End If

How to accomodate this?

Thanks for your time.

$wapnil

This doesn't make sense in SSIS terms. You can't update sources. If, however, you have two sources (file1 and file2) that get merged or unioned together, then in a derived column, you can do (and please search the forums. It will help you, I guarantee it):

Derived column: [column3] expresion: [column1] == "Good" ? [column1] : [column2]

[column3] equals [file2.column1]|||

sorry my mistake...it should be

If [file1.column1] = "Good" Then [destination = file1.column1]

Else [destination = file1.column2]

End If

$wapnil

|||

spattewar wrote:

sorry my mistake...it should be

If [file1.column1] = "Good" Then [destination = file1.column1]

Else [destination = file1.column2]

End If

$wapnil

So my example still applies... Please test this stuff so that you can learn it!
In a derived column, create a NEW column and call it what you want. OR, if the data types are the same, you can replace [column1]: (Below is the expression you'd need)

[column1] == "Good" ? [column1] : [column2]|||

going one step ahead in the above aproach.

the below is the table content that is present in the transformation table in our database

SourceColName Transformation DestinationColName

coupon coupon/noofpayments coupon (noofpayments is also a sourcecolname)

quantity quantity*10 quantity

The derived column transformation gets an input which has the following columns coupon,quantity and noofpayments.

Is it possible that the derived column transformation pickes up the transformations from the database and apply it to the input columns.

I am pretty new to this and hence may not be understanding it correctly. apology for the same. But I am just trying to minimize the impact here if any of the transformation changes.

Thank for your time.

Regards,

$wapnil

|||No, this isn't going to be possible.

Perhaps with some advanced scripting, maybe.|||

Bugging you again Phil...

This is to help me finalize my approach of doing the data transformations.

Scenario:

1) 4 different flat file formats

2) 4 transformations set applying on each flat file

3) 1 destination table

Aproaches:

1) Build one file connection, flat file source and one data transformation for each flat file. Create one package for each OR create a single package and have multiple data flow paths in that.

2) Build custom flat file connection, custom flat file source and custom data transformation. Hush.....

Future changes:

There can be more flat file formats added to the list.

Apology for this but at my end here people want to build custom components and use it across projects.......so just trying to find out the fdeasibility.

Thanks again for your time.

Regards,

$wapnil

No comments:

Post a Comment