BI Quest [A blog on Informatica]

Share what you learn

Archive for the ‘Session Labyrinth’ Category

Session myraid posts

Using File List as source to read from multiple flat files

Posted by Ankur Nigam on August 25, 2011

In production scenario sometimes you can land up in a situation where you are asked to read from multiple files having different data but same architecture. This is quite in contrast to situations where you need to output in different files. Informatica allows you to read from multiple files in a single go, provided the columns are same across all files. When we set indirect as a source option in session, the informatica architecture reads the file names from the file list you provided and one-by-one starts loading them in the target.

In my scenario which we are going to walk-through, I will create a mapping that reads data of items from different category level files and push them in the items target table in oracle.

The very first step is to import your source file and define its attributes.

For now don’t worry about properties other than delimiters and column lengths. Next create a mapping to something like this given below:-

This is a fairly simple mapping wherein I am fetching the category ID from the category name received in file and pushing all the data down to my target.

Now lets open our session configuration and set the property Source filetype over there.

Set it as Indirect which means that whatever filename I provide (in variable SourceInpFile) contains the names of the files having actual data. Below is the example of my StockItems file having filenames of different data files. I have assigned this filename value to the sourcefilename variable.

When I run the session all the files will be read one by one and the data will be loaded in the target

Data will be loaded in the target:-

Posted in Session Labyrinth | Tagged: , , , | 3 Comments »

Understanding Treat Source Rows property and Target “Insert, Update…” properties

Posted by Ankur Nigam on August 17, 2011

Informatica has plethora of options to perform IUD (Insert, Update, Delete) operations on tables. One of the most common method is using the Update strategy, while the underdog is using Treat Source Rows set to {Insert; Update; Delete} and not data driven. I will be focusing on latter in this topic.

In simple terms when you set the Treat Source Rows property it indicates Informatica that the row has to be tagged as Insert or Update or Delete. This property coupled with target level property of allowing Insert, Update, Delete works out wonders even in absence of Update Strategy. This also leads to a clear-cut mapping design. I am not opposing the use of Update Strategy but in some situations this leads to a slight openness in the mapping wherein I don’t have to peek into the reason of action the Strategy is performing e.g. IIF(ISNULL(PK)=1,DD_INSERT,DD_UPDATE).

Lets buckle up our belts and go on a ride to understand the use of these properties.

Assume a scenario where I have following Table Structure in Stage

Keeping things simple the target table would be something like this

As you can see the target has UserID as a surrogate key which I will populate through a sequence. Also note that Username is unique.

Now I have a scenario where I have to update the existing records and insert the new ones as supplied in the staging table.

Before beginning with writing code, lets first understand TSA and target properties is more detail. Treat Source Rows accepts 4 types of settings:

  1. Insert :- When I set this option Informatica will mark all rows read from source as Insert. Means that the rows will only be inserted.
  2. Update :- When I set this option Informatica will mark all rows read from source as Update. It means that rows when arrive target they have to be updated in it.
  3. Delete :- The rows will be marked as “to be deleted” from target once having been read from Source.
  4. Data Driven :- This indicates Informatica that we are using an update strategy to indicate what has to be done with rows. So no marking will be done when rows are read from source. Infact what has to be done with rows arriving to target will be decided immediately before any IUD operation on target

However setting TSA alone will not let you modify rows in the target. Each target in itself should be able to accept or I should say allow IUD operations. So when you have set TSA property you have to also set the target level property also that whether the rows can be inserted, updated or deleted from the target. This can be done in following ways:-

Insert and delete are self-explanatory however update has been categorized into 3 sections. Please note that setting any of them will allow update on your tables:-

  1. Update as Update :- This is simple property which says that if the row arrives target, it has to be updated in target. So if you check the logs Informatica will generate an Update template something like “UPDATE INFA_TARGET_RECORDS SET EMAIL = ? WHERE USERNAME = ?”
  2. Update as Insert :- This means that when row arrives target and it is a row which has to be updated, then the update behaviour should be to insert this row in target. In this case Informatica will not generate any update template for the target instead the incoming row will be inserted using the template “INSERT INTO INFA_TARGET_RECORDS(USERID,USERNAME,EMAIL)  VALUES ( ?, ?, ?)”
  3. Update else Insert :- Means that the incoming row flagged as update should be either updated or inserted. In a nutshell it means that if any key column is present in the incoming row which also exists in target then Informatica will intelligently update that row in target. In case if the incoming key column is not present in the target the row will be inserted.

PS :- The last two properties require you to set the Insert property of target also because if this is not checked then Update as Insert & Update else Insert will not work and session will fail stating that the target does not allows Insert. Why? Well its simple because these update clauses have insert hidden in them.

Ok enough of theories? Fine … let’s get our hand dirty. Coming back to our scenario, we have the rows read from source and want them to be either inserted or updated in target depending upon the status of rows i.e. whether they are present in the target or not. My mapping looks something like this:

Here I have used a lookup table to fetch user ID for a username incoming from stage. In the router following has been set:-

The output from router is sent to respective instances of the target (INFA_TARGET_RECORDS) in case if user exists or not. INFA_TARGET_RECORDS_NEW in case of new records and INFA_TARGET_RECORDS_UPD in case of existing records.

Once this is in place I have to set the Treat Source Rows property as Update for this session. Also to enable Informatica to insert in the table I will have to :-

  1. Set the Insert & Update as Insert properties of the instance INFA_TARGET_RECORDS_NEW.
  2. Set Update as Update property for INFA_TARGET_RECORDS_UPD instance in the session.

What actually happened is that I have treated all rows from source to be flagged as update. Secondly I have modified the behaviour of the Update and set it as Update as Insert. Due to this property update has allowed me actually to insert the rows in target. When the session runs it will update the rows in target and insert the new rows in target (actually update as insert).

Try it out and let me know if it works for you. I am not attaching any run demo because its better if you do it and understand even more clearly what is happening behind the scenes.

Posted in Session Labyrinth | Tagged: , | 17 Comments »