Friday 14 October 2011

Data Warehouse Concepts


  • A DWH is a historical database because the database contains many years of historical business data for Decision making purpose.
  • A data warehouse is a repository (collection of resources that can be accessed to retrieve information) of an organization's electronically stored data, designed to facilitate reporting and analysis. In simple form data warehouse is a collection of large amount of data.
  • A DWH system is designed to read the business data for business analysis processing but not for Transactional processing. Hence it is called as a Read only database.
  • A DWH is designed to take the decision. Hence it is also known as DSS (Decision Supportive System).

1.   The fathers of DWH are W.H. Inmon & Ralph Kimball. W.H.Inmon defined the DWH as
Time Variant, Non Volatile, Integrated and Subject Oriented.
                                I.Time Variant:
                        In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. A data warehouse's focus on change over time is what is meant by the term time variant.
                          A business user can analyze the business data in the warehouse to the different time periods like Year, Quarter, Month, and Weeks etc.
                              II. Non Volatile:
                        Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred. The data that is present in the DWH is Static.
                            III. Integrated:
                        Data warehouses must put data from disparate sources into a consistent format. They must resolve such problems as naming conflicts and inconsistencies among units of measure. When they achieve this, they are said to be integrated.
                            IV. Subject Oriented:
                        Data warehouses are designed to help you analyze data. For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter, sales in this case makes the data warehouse subject oriented.
2. 
1. Staging Area:
   - A storage area and set of process that clean, transform, combine, removing duplicate, archive and prepare source data for use in data warehouse.
   -  It accepts data from different sources.
   - The structure is closer to the Operational Systems rather than the DW.
   - Data arriving at different point of time is merged and then loaded into the DW.
   - Usually does not maintain history; only a temporary area.

2.  Types of DWH approach:-
                         I. Top – Down approach:
                                                    According to W.H.Inmon, we need to develop the enterprise DWH system and then from the EDW develop subject oriented databases called Datamarts according to the business needs.
                              II. Bottom – Up approach:
                                                   According to Ralph Kimball, 1st develop the datamarts according to business needs and then integrate all datamarts into EDW.
3

No comments:

Post a Comment

SSIS: Creating Package Configurations

This post discusses the creation of Configuration Files and how they can be useful while migrating a package from one environment to an...