December 18, 2008

ETL - Tips on Transformations - Joiner Transformation

The following tips come from bookshelf, it is not my idea yet, once I complete the case i will change that with the example.
Use the Joiner transformation to join source data from two related heterogeneous sources residing in different locations or file systems. You can also join data from the same source. The Joiner transformation joins sources with at least one matching column. The Joiner transformation uses a condition that matches one or more pairs of columns between the two sources.The two input pipelines include a master pipeline and a detail pipeline or a master and a detail branch. The master pipeline ends at the Joiner transformation, while the detail pipeline continues to the target.
To join more than two sources in a mapping, join the output from the Joiner transformation with another source pipeline. Add Joiner transformations to the mapping until you have joined all the source pipelines.The Joiner transformation accepts input from most transformations. However, consider the following limitations on the pipelines you connect to the Joiner transformation:♦You cannot use a Joiner transformation when either input pipeline contains an Update Strategy transformation.♦You cannot use a Joiner transformation if you connect a Sequence Generator transformation directly before the Joiner transformation.
when you use joiner transformation what you should do as below,
1) Defining a Join Condition;
such as EMP_ID1 = EMP_ID2;
2)Defining the Join Type;
a) Normal Join
With a normal join, the Integration Service discards all rows of data from the master and detail source that do not match, based on the condition.For example, you might have two sources of data for auto parts called PARTS_SIZE and PARTS_COLOR with the following data:










To join the two tables by matching the PART_IDs in both sources, you set the condition as follows:
PART_ID1 = PART_ID2
When you join these tables with a normal join, the result set includes the following data:
The following example shows the equivalent SQL statement:SELECT * FROM PARTS_SIZE, PARTS_COLOR WHERE PARTS_SIZE.PART_ID1 = PARTS_COLOR.PART_ID2;
b) Master Outer Join
A master outer join keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.When you join the sample tables with a master outer join and the same condition, the result set includes the following data:
Because no size is specified for the Fuzzy Dice, the Integration Service populates the field with a NULL.The following example shows the equivalent SQL statement:SELECT * FROM PARTS_SIZE RIGHT OUTER JOIN PARTS_COLOR ON (PARTS_SIZE.PART_ID1 = PARTS_COLOR.PART_ID2);
c)Detail Outer Join
A detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.When you join the sample tables with a detail outer join and the same condition, the result set includes the following data:
Because no color is specified for the Ash Tray, the Integration Service populates the field with a NULL.
The following example shows the equivalent SQL statement:SELECT * FROM PARTS_SIZE LEFT OUTER JOIN PARTS_COLOR ON (PARTS_COLOR.PART_ID2 = PARTS_SIZE.PART_ID1);
d) Full Outer Join
A full outer join keeps all rows of data from both the master and detail sources.When you join the sample tables with a full outer join and the same condition, the result set includes:
Because no color is specified for the Ash Tray and no size is specified for the Fuzzy Dice, the Integration Service populates the fields with NULL.The following example shows the equivalent SQL statement:SELECT * FROM PARTS_SIZE FULL OUTER JOIN PARTS_COLOR ON (PARTS_SIZE.PART_ID1 = PARTS_COLOR.PART_ID2)