Skip to main content

SSIS_Incremental Loading


Types of Data Loading:

1.     Full Data Loading:

2.     Incremental or Delta or Partial Loading

Solutions for Incremental Loading:

1.     Use Lookup

2.     Use Slowly Changing Dimension Transformation(SCD):

o   This is used to capture history of the row.

o   This is used in Data Warehouse (OLAP)

o   There are 3 types of SCD

§  Type 0 SCD: Changes treated as errors (Nobody can change the data, like Name, Gender, and DOB. If it is changed by anybody it will give error)

§  Type 1 SCD: It will over write the row (If any changes will happen in the source, the changes will be overwrite in destination.)

§  Type 2 SCD: It will capture history of a row (If any changes happen in source, the same row get copied as new row in destination.)

o   Note:

o   OLTP system cannot store history of a row.

o   SCD will support Insert and update but it will not support Delete.

o   Surrogate Key: A sequence number in the DWH system.

o   SQL Server 2012 has Change Data Capture (CDC) to store history of OLTP data.

                 


3.     Use Merge Statements in SQL(Recommended):

·         To perform incremental loading

·         It is one of the DML statement placed in SQL Server 2008.

·         Merge statement is a combination statement that can perform INSERT, UPDATE , DELETE statements based on whether rows that match the selection criteria exists in the target table or not.

Syntax:

MERGE Target_Table as target

USING Source_Table as source

ON condition

WHEN MATCHED THEN

UPDATE

SET columns=source.column,....

WHEN TARGET NOT MATCHED THEN

INSERT VALUES (source.column1, source.column2,)

4.     Use Customized SQL code/SSIS Code

5.     Use Set Operators (Except, Intersect, Union etc.)

Q. What is a pre-requisite to perform an incremental loading?

=> Source system should have a primary key. 

Popular posts from this blog

DBMS

Database : database can be defined as a collection of information stored at a particular place written in a predetermined manner, belongs to a particular topic.  DBMS: (DATA BASE MANAGEMENT SYSTEM):  it is defined as software which is present inside the database which is used o maintain or manage the data within the database .                                 In the above diagram, if the user needs any info from the database. He has to contact with  dbms , then the dbms goes to the database, takes the information from the  database and gives it back to the user. Database models :  these models are broadly divided into the following types: 1.         FMS/FDMS 2.         HMS/HDMS 3.         NDBMS 4.         RDBMS 5.      ...

Introduction to Datawarehouse

In shot what is DWH (Data warehouse)? A DWH is a Large Database which consists of Dimension and Fact tables. Or Collection of DataMart is called as a DWH. Father of Data Warehouse: Dr. William H. Inmon Dr. Ralph Kimball -Introduced Dimension Modelling to design DWH in 1990. What is Data Warehouse? As per Dr. William Inmon, Data Warehouse is a -       Subject Oriented (Loans, Mutual Funds etc.) -       Integrated (Get data from various sources) -       Time Variant (Different time period data)   ·          *Datamart is Subject Oriented ·          *DWH is a collection of Datamart ·          *DWH is a superset of Datamart ·          *Datamart is also another database. Top-down Approach: (DWH->Datamarts)   ...

Different Schemas in DWH

  SCHEMAS:   Star Schema: -       Star Schema means Dimension Tables are directly linked to the Fact Table. Snow Flake Schema: -       Dimension Tables are indirectly linked to the Fact Table OR -       Dimension table is linking to another dimension table .       Star Schema   Snow Flake Schema 1 Has redundant data and hence tough to maintain/change 1 No redundancy and hence more easy to maintain/change 2 Less complex queries and hence easy to understand 2 More complex queries and hence less easy to understand 3 Less foreign keys so faster execution time of cube. 3 More foreign keys so take more execution of cube 4 Has De-Normalized Tables 4 Has Normalized Tabl...