Introducing 

Prezi AI.

Your new presentation assistant.

Refine, enhance, and tailor your content, source relevant images, and edit visuals quicker than ever before.

Loading…
Transcript

2

Physical Storage Media

1

4

BHAGWAN PARSHURAM INSTITUTE OF TECHNOLOGY

DATABASE MANAGEMENT SYSTEM

DATA BASE MANAGEMENT SYSTEM (DBMS)

Presentation on :

  • Physical Storage
  • File Organisation
  • Indexing &
  • Hashing

Submitted to:

Mrs. Mugdha 30/04/2020

DATABASE MANAGEMENT SYSTEM

Presentation

BY Group2 (CSE - A - 18) :

Aditya Jha 07

Chayan Sharma 21

Gaurav Kumar Gupta 24

Palaque Sharma 54

Physical Storage Media

1

Types

Classification of Physical Storage Media

Based on

  • Speed with which data can be accessed
  • Cost per unit of data
  • Reliability
  • Data loss on power failure or system crash
  • Physical failure of the storage device
  • On Content preservation
  • volatile
  • non volatile

Main Types

Optical storage

Magnetic-disk

Flash Drives

Non-volatile

Data is stored on spinning magnetic disk

Data survives power failure

Reads are roughly as fast as main memory

Data is read optically from a spinning disk, using a laser

Data must be moved from disk to main memory for access, and written back for storage

writes are slow (few microseconds)

Read and write speed is slow

Much slower access than main memory.

Accessing Speed

&

Cost per unit data

1.Storage Device Hierarchy(according to speed & cost)

Cache

Main Memory

Decreasing Access time

Flash Memory

Cost and Fastness increase's

Magnetic Disk

Optical Disk

Magnetic Tape

Databases are stored in file formats, which contain records. At physical level, the actual data is

stored in electromagnetic format on some device. These storage devices can be broadly

categorized into three type:-

Databases are stored in file formats, which contain records. At physical level, the actual data is stored in electromagnetic format on some device

Cache and Main Memory

Flash Memory Memory Disk

Optical Disk

Magnetic Tape

Tertiary Mem: used mainly for backups

hols large volume of data

Performance Measures

  • Access time
  • Data-transfer rate
  • Mean time to failure

RAID

RAID

RAID or Redundant Array of Independent Disks, is a technology to connect multiple secondary storage devices and use them as a single storage media.

Raid 0

  • The data is broken down into blocks and the blocks are distributed among disks.

  • It enhances the speed and performance of the storage device.

Raid 1

Advantages over Raid 0

  • Mirrored disks with block striping
  • Offers best write performance.

Raid 2

Advantages over Raid 1

  • RAID 2 records Error Correction Code using Hamming
  • Due to its complex structure and high cost, RAID 2 is not commercially available.

Raid 3

Advantages over Raid 2

  • A single parity bit is enough for error correction

  • This technique makes it to overcome single disk failures

To recover data in a damaged disk, compute XOR of bits from other disks (including parity bit disk)

Raid 4

Advantages over Raid 3

  • Level 3 uses byte-level striping, whereas level 4 uses block-level striping
  • Provides higher I/O rates for independent block reads than Level 3

Block stripping: an entire block of data is written onto data disks and then the parity is generated and stored on a different disk

Raid 5

Advantages over Raid 4

The parity bits generated for data block stripe are distributed, rather than on different disks.

RAID 6

Advantages over Raid 5

  • P+Q Redundancy scheme
  • Stores extra redundant information to guard against multiple disk failures.

File Organization

2

File Organization

Introduction

Introduction:

File:

A file is a collection of data stored in one unit, identified by a filename.

File Organization:

It refers to the logical relationships among various records

or

Means of identification and access to any required data

or

Storing the files in certain order is called file Organization.

1.Sequential File Organization

1.Pipe File Method:

1. Sequential File

Organization

New records are stored without breaking Sequence

Easiest Method:

Files are sorted one after another in sequential manner

New record Simply place at EOF

2.Sorted File Method:

New records initially inserted at end then sort the record

Two Different types:

1.Pipe File Method

2.Sorted File Method

Initially inserted at end

Records are sorted

Advantages

2.

Simple, Easy and Cheap Storage

Fast and efficient

(for large amount of data)

1.

Disadvantages

2.

Takes More time & space for storage

Time Wastage:As we have to follow sequence

1.

2.Heap File Organization

3.Heap File Organization

Advantages:

1. Faster fetching and retrieving of records

2. Best Suited when huge number of data need to be

loaded.

#1

#3

#2

Disadvantages:

1. Unused memory block problem

2. Inefficient for large data

Works with Data blocks:

1.Records are Inserted at the end of the

file into data blocks

2.No sorting is required

EXAMPLE:

New Records goes to vacant Blocks

3.B+ Tree File Organization

4.B+ Tree File Organization:

Advantages:

1.Tree traversal is easier and faster.

2. Easy to search records.

3. No restriction on B+ tree size

#1

#3

#2

It uses tree like structure to store records:

1.Key-Indexing: Primary key is use to sort

record

2.Similar to Binary Search Tree but can have more than two children.

Disadvantages:

1. Insufficient for static tables

EXAMPLE:

*As its leaf nodes is balanced, searching is easy

*From the leaf node at end we can traverse to all data.

4.Cluster File Organization

5.Cluster File Organization:

In this two or more related tables/records are stored within same file know as clusters

=>Lowers the cost of searching and retrieving various records in diff. files

2.Provides efficient result

1.Used when frequent joining of table required

Advantages of Cluster File Organization

Advantages

Low Performance for large database

Disadvantages of Cluster File Organization

Disadvantages

5.Hash File Organization

6.Hash File Organization:

In a hash file organization we obtain the bucket of a record directly from its search-key value using a hash function

Indexing

3

3

Indexing

Types

What is Index ?

What is Indexing ?

Index is a data structure used to quickly locate and access data.

Why Indexing ?

Indexing optimizes the performance of database.

How ?

It reduces the number of Disk Accesses required to execute a query.

Structure of Index

Pointer stores address of disk block.

Search Key - attribute to set of attributes used to look up records in a file

Two basic kinds of indices:

Types of Indices?

Ordered indices: keys are stored in sorted order

Hash indices: search keys are distributed uniformly across “buckets” using a “hash function”.

Ordered Indices

Primary Indices

Secondary Indices

Clustering Indices

Dense Index Files

Dense index — Index record appears for every search-key value in the file

Sparse Index Files

Sparse Index: contains index records for only some search-key values

Secondary Indices

Clustering Indices

Multilevel Index

If primary index does not fit in memory, access becomes expensive.

Solution: treat primary index kept on disk as a sequential file and construct a sparse index on it

4 Hashing

Hashing

What is Hashing ?

Why Hashing is required

For a huge database structure, it can be almost next to impossible to search all the index values

through all its level and then reach the destination data block to retrieve the desired data.

Hashing

is an effective technique to calculate the direct location of a data record on the disk without using

index structure.

A bucket is a unit of storage containing one or more records (a bucket is typically a disk block).

Hash Organization

Bucket

Hash Function

Example of Hash File Organization

There are 10 buckets,

The binary representation of the ith character is assumed to be the integer i.

The hash function returns the sum of the binary representations of the characters modulo 10

E.g.

h(Music) = 1 h(History) = 2

h(Physics) = 3 h(Elec. Eng.) = 3

Hash Functions

An ideal hash function is uniform, i.e., each bucket is assigned the same number of search-key values from the set of all possible values

Ideal hash function is random, so each bucket will have the same number of records assigned to it irrespective of the actual distribution of search-key values in the file

Bucket Overflow

Bucket overflow can occur because of

Insufficient buckets

Skew in distribution of records.

This can occur due to two reasons:

multiple records have same search-key value

chosen hash function produces non-uniform distribution of key values

Handling of Bucket Overflows (Cont.)

Overflow chaining – the overflow buckets of a given bucket are chained together in a linked list.

Types of Hashing

Static Hashing

Dynamic Hashing

Static Hashing

In static hashing, when a search-key value is provided, the hash function always computes the same address

Deficiencies of Static Hashing

One solution: periodic re-organization of the file with a new hash function

Expensive, disrupts normal operations

Better solution: allow the number of buckets to be modified dynamically

If initial number of buckets is too small, and file grows, performance will degrade due to too much overflows.

If space is allocated for anticipated growth, a significant amount of space will be wasted initially (and buckets will be underfull).

If database shrinks, again space will be wasted.

Dynamic Hashing /

Extended Hashing

Good for database that grows and shrinks in size.

Allows the hash function to be modified dynamically

Hash Indices

Hash Indices

A hash index organizes the search keys, with their associated record pointers, into a hash file structure

Hashing can be used not only for file organization, but also for index-structure creation.

Strictly speaking, hash indices are always secondary indices

PREVIOUS YEARS QUESTIONS

Previous Year Questions

Questions

THANK YOU

Thank you

Learn more about creating dynamic, engaging presentations with Prezi