Data Warehousing

In a nutshell, a data warehouse is a “repository for historical data, used to make decision”. Of course there’s much more to it than that.

Data warehousing is a concept. It is a set of hardware and software components that can be used to better analyze the massive amounts of data that companies are accumulating to make better business decisions. Data Warehousing is not just data in the data warehouse, but also the architecture and tools to collect, query, analyze and present information.

Operational data is the data you use to run your business. This data is what is typically stored, retrieved, and updated by your Online Transactional Processing (OLTP) system

Informational data is created from the wealth of operational data that exists in your business and some external data useful to analyze your business. Informational data is what makes up a data warehouse. Informational data is typically:

  • Summarized operational data
  • De-normalized and replicated data
  • Infrequently updated from the operational systems
  • Optimized for decision support applications
  • Possibly “read only” (no updates allowed)
  • Stored on separate systems to lessen impact on operational systems

Creating the informational data, that is, the data warehouse, from the operational systems is a key part of the overall data warehousing solution. Building the informational database is done with the use of transformation or propagation tools.

These tools not only move the data from multiple operational systems, but often manipulate the data into a more appropriate format for the warehouse. This could mean:

  • The creation of new fields that are derived from existing operational data
  • Summarizing data to the most appropriate level needed for analysis
  • Denormalizing the data for performance purposes
  • Cleansing of the data to ensure that integrity is preserved.

The heart of the data warehouse is the dimensional (a.k.a. star or snowflake) schema: a central fact table surrounded by dimension and transform tables.

Think of the fact table as the business process you are measuring, such as a sale or shipment.

 

Leave a Reply