Data Warehousing

Last Updated : 24 Apr, 2026

Data warehousing is the process of collecting, integrating, storing and managing data from multiple sources in a central repository. It enables organizations to organize large volumes of current and historical data for efficient querying, analysis and reporting.

**Note:** The main goal of data warehousing is to support decision-making by providing clean, consistent and timely access to data. It is designed to keep data retrieval fast even when working with massive datasets.
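
The collect, integrate and store flow described above can be sketched in a few lines. This is only a minimal illustration, assuming two hypothetical source systems (`online_orders` and `store_sales`) with different field names, and using an in-memory SQLite database as a stand-in for the central repository. A real warehouse would use dedicated storage and tooling.

```python
import sqlite3

# Two hypothetical source systems, represented here as in-memory lists.
# In practice these would be separate databases, files or APIs.
online_orders = [
    {"order_id": 1, "customer": "alice", "amount": "120.50", "date": "2024-01-15"},
    {"order_id": 2, "customer": "BOB", "amount": "80.00", "date": "2024-02-03"},
]
store_sales = [
    {"receipt": "S-9", "buyer": "Alice ", "total": 45.0, "sold_on": "2023-11-20"},
]


def extract_and_transform():
    """Map both sources onto one common schema with consistent types."""
    rows = []
    for o in online_orders:
        rows.append(("online", str(o["order_id"]), o["customer"].strip().lower(),
                     float(o["amount"]), o["date"]))
    for s in store_sales:
        rows.append(("store", s["receipt"], s["buyer"].strip().lower(),
                     float(s["total"]), s["sold_on"]))
    return rows


def load(conn, rows):
    """Store the cleaned rows in the central (stand-in) repository."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sales (
               source TEXT, source_id TEXT, customer TEXT,
               amount REAL, sale_date TEXT)"""
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def total_by_customer(conn):
    """Query across all sources at once, which is the point of integration."""
    cur = conn.execute(
        "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load(conn, extract_and_transform())
print(total_by_customer(conn))  # [('alice', 165.5), ('bob', 80.0)]
```

Cleaning the customer names during the transform step is what lets the "alice" rows from both sources be reported together.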

Data Warehouse Architecture

Need for Data Warehousing

Components of Data Warehouse

Types of Data Warehouses

The different types of Data Warehouses are:

  1. **Enterprise Data Warehouse (EDW):** A centralized warehouse that stores data from across the organization for analysis and reporting.
  2. **Operational Data Store (ODS):** Stores real-time operational data used for day-to-day operations, not for deep analytics.
  3. **Data Mart:** A subset of a data warehouse, focusing on a specific business area or department.
  4. **Cloud Data Warehouse:** A data warehouse hosted in the cloud, offering scalability and flexibility.
  5. **Big Data Warehouse:** Designed to store vast amounts of structured and unstructured data for big data analysis.
  6. **Virtual Data Warehouse:** Provides access to data from multiple sources without physically storing it.
  7. **Hybrid Data Warehouse:** Combines on-premises and cloud-based storage to offer flexibility.
  8. **Real-time Data Warehouse:** Designed to handle real-time data streaming and analysis for immediate insights.
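
As one concrete illustration of the list above, a data mart can be thought of as a department-specific slice of the wider warehouse. The sketch below assumes a hypothetical `warehouse_facts` table and builds a `marketing_mart` table from it with SQLite. The table names, columns and values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical organization-wide warehouse table covering several departments.
conn.execute(
    "CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL, year INTEGER)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?, ?)",
    [
        ("marketing", "ad_spend", 5000.0, 2023),
        ("marketing", "ad_spend", 6500.0, 2024),
        ("finance", "revenue", 90000.0, 2024),
        ("hr", "headcount", 42.0, 2024),
    ],
)

# Data mart: a physical copy of only the rows one department cares about.
conn.execute(
    """CREATE TABLE marketing_mart AS
       SELECT metric, value, year
       FROM warehouse_facts
       WHERE department = 'marketing'"""
)
conn.commit()

mart_rows = conn.execute(
    "SELECT metric, value, year FROM marketing_mart ORDER BY year"
).fetchall()
print(mart_rows)  # [('ad_spend', 5000.0, 2023), ('ad_spend', 6500.0, 2024)]
```

Replacing `CREATE TABLE ... AS` with `CREATE VIEW ... AS` would expose the same slice without copying the data, which is loosely closer to the virtual approach in item 6.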

Data Warehouse vs DBMS

| Database | Data Warehouse |
| --- | --- |
| A typical database is built for operational (transactional) processing. Each operation is an indivisible transaction. | A data warehouse is built for analytical processing. |
| Generally stores current, up-to-date data used for daily operations. | Maintains historical data, often kept over years, which can be used for trend analysis, prediction and decision support. |
| Generally application-specific. | Generally integrated at the organization level by combining data from different databases. |
| Example: a database stores related data, such as student details in a school. | Example: a data warehouse integrates data from multiple sources to answer questions such as which school in a city performs best. |
| Constructing a database is comparatively inexpensive. | Constructing a data warehouse can be expensive. |
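
The contrast between transactional and analytical processing can be sketched with two small workloads. This is a simplified illustration, assuming hypothetical `accounts` and `yearly_sales` tables in a single in-memory SQLite database. In practice the operational database and the warehouse would be separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Operational side: the table holds only the current state of each account.
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one indivisible transaction."""
    with conn:  # commits if both updates succeed, rolls back on an exception
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )


transfer(conn, 1, 2, 30.0)
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
print(balances)  # [(1, 70.0), (2, 80.0)]

# Analytical side: rows accumulate over the years and are read in aggregate.
conn.execute("CREATE TABLE yearly_sales (year INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO yearly_sales VALUES (?, ?, ?)",
    [
        (2022, "north", 100.0), (2022, "south", 150.0),
        (2023, "north", 120.0), (2023, "south", 180.0),
        (2024, "north", 160.0), (2024, "south", 200.0),
    ],
)
conn.commit()

trend = conn.execute(
    "SELECT year, SUM(amount) FROM yearly_sales GROUP BY year ORDER BY year"
).fetchall()
print(trend)  # [(2022, 250.0), (2023, 300.0), (2024, 360.0)]
```

The transfer touches two rows and must succeed or fail as a unit, while the trend query scans every historical row and changes nothing.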

Issues That Occur While Building the Warehouse

1. When and How to Gather Data?

2. What Schema to Use?

3. Data Transformation and Cleansing

4. How to Propagate Updates?

5. What Data to Summarize?

Real-World Examples of Data Warehousing

Data warehousing can be applied wherever there is a large amount of data and a need for statistical results that support decision-making.

1. E-commerce: Flipkart

2. Banking: HDFC Bank

Advantages and Disadvantages of Data Warehousing

| Advantages | Disadvantages |
| --- | --- |
| Better Decisions: Centralized data supports faster, smarter decisions. | High Cost: Setup requires major investment. |
| Business Intelligence: Enables strong operational insights. | Complexity: Needs skilled professionals to manage. |
| High Data Quality: Ensures consistency and reliability. | Time-Consuming: Long setup and integration time. |
| Scalable: Handles large and growing datasets. | Integration Issues: Combining data from sources can be challenging. |