Minnu's Blog on Informatica & Data warehouse concepts

Archives

Tuesday, June 10, 2008

Denormalize a single column using Expression

This mapping template uses PowerCenter variable logic to collapse values from source records with duplicate keys into one master record.

Image Hosted by ImageShack.us


Challenge Addressed

Often it is necessary to 'build' a target record from several source records. One way to do this is to use variable logic to determine whether data in the incoming record should be added to data from previous records to form a "master" record.

Overview There are a few different methods of building one record out of several. Often this can be accomplished by simply using an aggregator. Alternatively, variable logic can be utilized. This mapping template illustrates a method of using PowerMart / PowerCenter expression variables to build one record out of several. In this example, the source table may provide several records with the same CUSTOMER_ID. This occurs when the length of a comment for a particular CUSTOMER_ID is longer than the source table's COMMENT field, and the source system must create several records for that CUSTOMER_ID to store the entire comment (a piece of the comment is stored in each record). This template illustrates functionality for concatenating all the appropriate COMMENT fields into one target record with a larger COMMENT field.

For example, source records:
CUSTOMER_ID, SEQUENCE_NUM, COMMENT
1,1,aaa
1,2,bbb
2,1,ccc
2,2,ddd
2,3,eee
3,1,fff

Would turn into target records: CUSTOMER_ID, COUNT, COMMENT
1,2,aaabbb
2,3,cccdddeee
3,1,fff

Implementation Guidelines This mapping achieves its objectives by doing 3 things:
Sorting the input data (with a sorter transformation) by CUSTOMER_ID and then SEQUENCE_NUM, both in ascending order.
Using variable ports to determine whether the incoming CUSTOMER_ID is the same as the previous one. If so, then the COMMENT field from the current and previous records should be concatenated. Otherwise the COMMENT variable should be reset to the current COMMENT field.
Using an aggregator transformation to count the number of records contributing to each outgoing record, and to return the final, "master" record (after all appropriate concatenations have taken place and the COMMENT field is complete).

Pros
Using port variables allows for great flexibility on the condition on which the final record can be built. This is a very simple example, but the same idea can be used for a much more complex situation.

Cons
The developer needs to be very careful of the order of evaluation of ports. The variable logic used here hinges on the fact that ports get evaluated in the following order:
All Input ports (in the order they appear in the transformation);
All Variable ports (in the order they appear in the transformation);
All Output ports (in the order they appear in the transformation).

0 comments: