Database Normalization

by Mirza Faraz Beg, 26 October 2016


Transcript of Database Normalization

What is it?
Organizing data so that it "behaves normally", i.e. so that it is free of "data anomalies".
How normal is normal?
Denormalization
To avoid the problems arising from too much decomposition, some systems introduce a degree of denormalization, i.e. "less than the ideal level of normalization". In practice this usually means settling for 3NF, which is good enough for most systems.
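As an invented illustration (a minimal Python/sqlite3 sketch, with all table and column names hypothetical), a read-heavy reporting system might copy a looked-up value back into a fact table to save a join on every read:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Normalized (3NF): the product name lives in exactly one place.
con.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, product_id INTEGER, qty INTEGER)")
con.execute("INSERT INTO product VALUES (1, 'Widget')")
con.execute("INSERT INTO sale VALUES (1, 1, 3)")

# Denormalized reporting table: the product name is copied into every
# sale row, reintroducing redundancy to save a join on each read.
con.execute("""
    CREATE TABLE sale_report AS
    SELECT s.id, s.product_id, p.name AS product_name, s.qty
    FROM sale s JOIN product p ON p.id = s.product_id
""")
print(list(con.execute("SELECT * FROM sale_report")))
```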
Data Anomalies
Redundancy
Having the same data repeated in several rows. Redundancy causes two anomalies.
Update anomaly
Repeated data is updated in one row but not in the others, leaving the table inconsistent.
Deletion anomaly
Deleting a row deletes all of its information, including facts that are still needed.
Insertion anomaly
Some systems work around the deletion anomaly by inserting a placeholder row for the required information, leaving all the other attributes as NULL values.
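A minimal Python/sqlite3 sketch (all student, advisor, and course names invented for illustration) shows all three anomalies on one denormalized table:

```python
import sqlite3

# Hypothetical denormalized table: a student's advisor is repeated
# on every course row, which is the redundancy described above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE enrollment (student TEXT, advisor TEXT, course TEXT)")
con.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [("Ali", "Dr. Khan", "DB101"),
     ("Ali", "Dr. Khan", "OS201"),   # redundancy: advisor repeated
     ("Sara", "Dr. Shah", "DB101")],
)

# Update anomaly: changing the advisor in one row but not the other
# leaves the table contradicting itself about Ali's advisor.
con.execute("UPDATE enrollment SET advisor = 'Dr. Aziz' "
            "WHERE student = 'Ali' AND course = 'DB101'")

# Deletion anomaly: Sara drops her only course, and the fact that
# Dr. Shah advises her disappears with the row.
con.execute("DELETE FROM enrollment WHERE student = 'Sara'")

# Insertion anomaly: to record Sara's advisor again we must insert a
# placeholder row whose course is NULL.
con.execute("INSERT INTO enrollment VALUES ('Sara', 'Dr. Shah', NULL)")

for row in con.execute("SELECT * FROM enrollment"):
    print(row)
```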

Decomposition
To break a "mega" relation into several smaller, more manageable relations.

Suppose there's a relation R. We can decompose R into two relations S and T in a way that:
S = PROJECTION of R on attribute set A
T = PROJECTION of R on attribute set B
R = S NATURAL JOIN T (A and B share at least one attribute and together cover all attributes of R)
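A small Python sketch of these equations, modelling relations as sets of tuples over an invented relation R(student, advisor, course):

```python
# Hypothetical relation R(student, advisor, course) as a set of tuples.
R = {("Ali", "Dr. Khan", "DB101"), ("Ali", "Dr. Khan", "OS201")}

# S = projection of R on attribute set A = {student, advisor}
S = {(student, advisor) for (student, advisor, course) in R}

# T = projection of R on attribute set B = {student, course}
T = {(student, course) for (student, advisor, course) in R}

# R is recovered by a natural join of S and T on the shared attribute.
rejoined = {(s, a, c) for (s, a) in S for (s2, c) in T if s == s2}
print(rejoined == R)  # True: this particular decomposition is lossless
```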
Things to consider
Anomalies are truly eliminated
Anomalies in the original relation are removed.
No new anomalies are introduced in the new design
Decomposition is lossless
Information in the original relation is fully recoverable by a query operation such as a join.
Dependencies are preserved
When the new relations are joined to re-form the original relation, the original FDs still hold.
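Losslessness is not automatic. A sketch with the same invented attributes, split on a shared attribute that is not a key of either projection, shows spurious tuples appearing when joined back:

```python
# Same hypothetical relation, but two students now share a course.
R = {("Ali", "Dr. Khan", "DB101"), ("Sara", "Dr. Shah", "DB101")}

# Bad split: project on {advisor, course} and {student, course}; the
# shared attribute "course" is not a key of either projection.
S = {(a, c) for (s, a, c) in R}
T = {(s, c) for (s, a, c) in R}

rejoined = {(s, a, c) for (a, c) in S for (s, c2) in T if c == c2}
print(rejoined == R)  # False: the join does not reproduce R
print(rejoined - R)   # spurious tuples, e.g. ('Ali', 'Dr. Shah', 'DB101')
```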
How to fix?
Normal Forms
1st: All attributes must be atomic
No repeating groups (e.g. a list)
Every resulting table must have a key
2nd: No partial dependency
Each non-key attribute must depend on the whole key, not just part of it.
3rd: Eliminate columns not dependent upon the key
Each non-key attribute must be a fact about the key itself, not about another non-key attribute.
Boyce-Codd: For every non-trivial FD, the LHS must be a superkey.
4th: No independent multi-valued dependencies
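A hedged sketch of the 2NF and 3NF steps as SQLite DDL (schema and names invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Violates 2NF: the key is (student, course), but advisor and
# advisor_office depend on student alone (a partial dependency).
con.execute("""
    CREATE TABLE enrollment_unnormalized (
        student TEXT, course TEXT, grade TEXT,
        advisor TEXT, advisor_office TEXT,
        PRIMARY KEY (student, course)
    )
""")

# 2NF: the partially dependent attributes move to a table keyed by
# their real determinant. Still violates 3NF, because advisor_office
# depends on advisor (a non-key attribute), not on the key.
con.execute("""
    CREATE TABLE student_advisor (
        student TEXT PRIMARY KEY, advisor TEXT, advisor_office TEXT
    )
""")

# 3NF: every remaining attribute is a fact about the key, the whole
# key, and nothing but the key.
con.execute("CREATE TABLE enrollment (student TEXT, course TEXT, grade TEXT, PRIMARY KEY (student, course))")
con.execute("CREATE TABLE advises (student TEXT PRIMARY KEY, advisor TEXT)")
con.execute("CREATE TABLE advisor_office (advisor TEXT PRIMARY KEY, office TEXT)")
```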
In an attempt to normalize a table, we may decompose it "too much": the design may lose some dependencies, or it may cause a serious "impedance mismatch".
Simply speaking, impedance mismatch is the set of problems that arise from the difference in storage and access between the relational data model and the object-oriented programming model. In performance-intensive systems the problem can become all the more severe, because time spent retrieving data from the database can translate into business loss.
Worked example: the same data is refined step by step, through Un-Normalized Data, 1NF, 2NF, 3NF, 3NF (Modified FD), BCNF (Modified FD), and 4NF.
Each attribute is a fact about the key...
Each attribute is a fact about the key, the whole key...
Each attribute is a fact about the key, the whole key and nothing but the key!
The normal forms form a spectrum: 1NF → 2NF → 3NF → BCNF → 4NF.
1NF is the weakest at preventing redundancy, but fewer joins are required, so read performance is higher.
4NF is the strongest, but many joins are required, so read performance is lower.
First there was an Excel sheet