Skip to main content

SSIS_Incremental Loading


Types of Data Loading:

1.     Full Data Loading:

2.     Incremental or Delta or Partial Loading

Solutions for Incremental Loading:

1.     Use Lookup

2.     Use Slowly Changing Dimension Transformation(SCD):

o   This is used to capture history of the row.

o   This is used in Data Warehouse (OLAP)

o   There are 3 types of SCD

§  Type 0 SCD: Changes treated as errors (Nobody can change the data, like Name, Gender, and DOB. If it is changed by anybody it will give error)

§  Type 1 SCD: It will over write the row (If any changes will happen in the source, the changes will be overwrite in destination.)

§  Type 2 SCD: It will capture history of a row (If any changes happen in source, the same row get copied as new row in destination.)

o   Note:

o   OLTP system cannot store history of a row.

o   SCD will support Insert and update but it will not support Delete.

o   Surrogate Key: A sequence number in the DWH system.

o   SQL Server 2012 has Change Data Capture (CDC) to store history of OLTP data.

                 


3.     Use Merge Statements in SQL(Recommended):

·         To perform incremental loading

·         It is one of the DML statement placed in SQL Server 2008.

·         Merge statement is a combination statement that can perform INSERT, UPDATE , DELETE statements based on whether rows that match the selection criteria exists in the target table or not.

Syntax:

MERGE Target_Table as target

USING Source_Table as source

ON condition

WHEN MATCHED THEN

UPDATE

SET columns=source.column,....

WHEN TARGET NOT MATCHED THEN

INSERT VALUES (source.column1, source.column2,)

4.     Use Customized SQL code/SSIS Code

5.     Use Set Operators (Except, Intersect, Union etc.)

Q. What is a pre-requisite to perform an incremental loading?

=> Source system should have a primary key. 

Popular posts from this blog

How to remove special characters in the Data(using Script component Task and Regex)

Start with the Below screen shot ( Before Script Task) and you will come to know about the senario. Note : The Script Component also works for the special characters within the Data as well . for ex "A$hish" Hover over the below Regex. it will show > Show potential fixes > in that Select System.Text.RegularExpressions. The Below will disappear.  Use the Below Script.  public override void Input0_ProcessInputRow(Input0Buffer Row) { Row.CustomerID = RemoveSpecialCharacters(Row.CustomerID); Row.NameStyle = RemoveSpecialCharacters(Row.NameStyle); Row.Title = RemoveSpecialCharacters(Row.Title); Row.FirstName = RemoveSpecialCharacters(Row.FirstName); Row.LastName = RemoveSpecialCharacters(Row.LastName); Row.CompanyName = RemoveSpecialCharacters(Row.CompanyName); Row.SalesPerson = RemoveSpecialCharacters(Row.SalesPerson); Row.EmailAddress = RemoveSpecialCharacters(Row.EmailAddress); Row.Phone = RemoveSpecialCharacters(Row.Phone); } public static string ...

DBMS

Database : database can be defined as a collection of information stored at a particular place written in a predetermined manner, belongs to a particular topic.  DBMS: (DATA BASE MANAGEMENT SYSTEM):  it is defined as software which is present inside the database which is used o maintain or manage the data within the database .                                 In the above diagram, if the user needs any info from the database. He has to contact with  dbms , then the dbms goes to the database, takes the information from the  database and gives it back to the user. Database models :  these models are broadly divided into the following types: 1.         FMS/FDMS 2.         HMS/HDMS 3.         NDBMS 4.         RDBMS 5.      ...

Different Schemas in DWH

  SCHEMAS:   Star Schema: -       Star Schema means Dimension Tables are directly linked to the Fact Table. Snow Flake Schema: -       Dimension Tables are indirectly linked to the Fact Table OR -       Dimension table is linking to another dimension table .       Star Schema   Snow Flake Schema 1 Has redundant data and hence tough to maintain/change 1 No redundancy and hence more easy to maintain/change 2 Less complex queries and hence easy to understand 2 More complex queries and hence less easy to understand 3 Less foreign keys so faster execution time of cube. 3 More foreign keys so take more execution of cube 4 Has De-Normalized Tables 4 Has Normalized Tabl...