Loading content…
Transcript

Snowflake Schema, Fact

Constellation Schema

Fact Constellation Schema

Introduction

Dimensions Modelling

Dimensions Modelling

Dimensions Modeling

Problem Statement

The multidimensional nature of business questions is reflected in the fact that,

- for example, marketing managers are no longer satisfied by asking simple one-dimensional questions such as

"How much revenue did the new product generate?"

Instead, they ask questions such as

"How much revenue did the new product generate by month, in the northeastern division, broken down by user demographic, by sales office, relative to the previous version of the product, compared the with plan?"

-A six-dimensional question.

Dimentional Modeling

  • Dimensional Modeling (DM) is a data structure technique optimized for data storage in a Data warehouse.
  • The purpose of dimensional modeling is to optimize the database for faster retrieval of data.
  • A dimensional model in data warehouse is designed to read, summarize, analyze numeric information like values, balances, counts, weights, etc. in a data warehouse.
  • In contrast, relation models are optimized for addition, updating and deletion of data in a real-time Online Transaction System.

Dimentional Modeling

Purpose:

The purpose of dimensional model is to improve performance by matching data structures to queries.

Characteristics:

The principal characteristic of a dimensional model is a set of detailed business facts surrounded by multiple dimensions that describes those facts, when realized in a database, the schema for a dimensional model contains a central fact table and multiple dimension tables.

Facts are the numbers that user analyse and summarise to gain a better understanding of the business

Fact Table

Fact Tables

Fact tables are used to record actual facts or measures in the business. Facts are the numeric data items that are of interest to the business.

Below are examples of facts for different industries:

  • Retail. Number of units sold, sales amount.
  • Telecommunications. Length of call in minutes, average number of calls.
  • Banking. Average daily balance, transaction amount.

Dimentional Table

Dimension Tables

Dimension tables, on the other hand, establish the context of the facts. Dimensional tables store fields that describe the facts.

Below are examples of dimensions for the same industries :

• Retail. Store name, store zip, product name, product category, day of week.

• Telecommunications. Call origin, call destination.

• Banking. Customer name, account number, data, branch, account officer.

Formation of the automaker sales fact table

Example

Snowflakes

What is a Snowflake Schema?

Snowflake Schema in data warehouse is a logical arrangement of tables in a multidimensional database such that the ER diagram resembles a snowflake shape.

A Snowflake Schema is an extension of a Star Schema, and it adds additional dimensions. The dimension tables are normalized which splits data into additional tables.

In the following Snowflake Schema example, Country is further normalized into an individual table.

SnowFlake Schema

Star

Schema

Star Schema

  • Star Schema in data warehouse, in which the center of the star can have one fact table and a number of associated dimension tables.

  • It is known as star schema as its structure resembles a star.

  • The Star Schema data model is the simplest type of Data Warehouse schema.

  • It is also known as Star Join Schema and is optimized for querying large data sets.

Star Vs SnowFlake Schema

The major difference between the snowflake and star schema models is that

  • The dimension tables of the snowflake model may be kept in normalized form to reduce redundancies.

  • Furthermore, the snowflake structure can reduce the effectiveness of browsing, since more joins will be needed to execute a query. Consequently, the system performance may be adversely impacted.

Hence, although the snowflake schema reduces redundancy, it is not as popular as the star schema in data warehouse design.

Star Vs Snowflakes

SnowFlake Schema

Star Schema

Characteristics of Snowflake Schema:

Characterstics

=>The main benefit of the snowflake schema

it uses smaller disk space.

=> Easier to implement a dimension is added

to the Schema

=> Due to multiple tables query

performance is reduced

=> The primary challenge that you will face

while using the snowflake Schema is that

you need to perform more maintenance

efforts because of the more lookup

tables.

Fact Constellation

A Fact Constellation Schema contains two fact table that share dimension tables between them.

It is also called Galaxy Schema.

Characteristics

  • Fact Constellation is a schema for representing multidimensional model.
  • It is a collection of multiple fact tables having some common dimension tables.
  • It can be viewed as a collection of several star schema and hence, also known as Galaxy schema.
  • It is one of the widely used schema for Data warehouse designing and it is much more complex than star and snowflake schema.
  • For complex systems, we require fact constellations.

Shared dimension tables between fact tables

  • There are two fact tables namely, Placement and Workshop which are part of two different star schemas having dimension tables – Company, Student and TPO in Star schema with fact table Placement and dimension tables – Training Institute, Student and TPO in Star schema with fact table Workshop.

  • Both the star schema have two dimension tables common and hence, forming a fact constellation or galaxy schema

SnowFlake vs Fact Schema

Group 10

Conclusion

1. Aditya Jha 07

2. Anuj Garg 15

3. Gaurav Kumar Gupta 24

Submtted to : Dr. Shweta Taneja