High Availability

by Reham Hussam on 12 October 2014


Transcript of High Availability

#principles_of_high_availability_engineering
#Servers_Redundancy
#Network_Redundancy
1. Elimination of single points of failure.
This means adding redundancy to the system so that failure of a component does not mean failure of the entire system.

2. Reliable crossover.
In redundant systems, the crossover point itself tends to become a single point of failure. High availability engineering must provide for reliable crossover.

3. Detection of failures as they occur.
If the two principles above are observed, a user may never see a failure, but the maintenance activity must detect it.
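
To make the first principle concrete, here is a minimal sketch of how redundancy raises availability. It assumes that the redundant copies fail independently and that crossover is perfect, which real systems only approximate; the function name and the 99% figure are illustrative, not from the presentation.

```python
def combined_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent
    failures: the system is down only when every copy is down at once."""
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("at least one component is required")
    return 1.0 - (1.0 - component_availability) ** copies


# Hypothetical component that is available 99% of the time.
for copies in (1, 2, 3):
    print(copies, f"{combined_availability(0.99, copies):.6f}")
```

Under these assumptions, each extra copy multiplies the chance of total failure by the chance that one more component is down.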
#Hardware_Redundancy
#Multiple_Powersupplies
#Supervisor_Redundancy

#What_is_HA_and_measuring_it ?

Availability is the degree to which an application, service, or functionality is available upon user demand. Availability is measured by the perception of an application's end user.
End users experience frustration when their data is unavailable, and they do not understand or care to differentiate between the complex components of an overall solution. Performance failures due to higher than expected usage create the same havoc as the failure of critical components in the solution.

The definition of availability is:

Availability = uptime / (uptime + downtime)
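
As a worked example of the definition above, the following sketch computes availability from measured uptime and downtime. The hour figures are made up for illustration, and the 8,760-hour year is an assumption (a non-leap year).

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Availability = uptime / (uptime + downtime)."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


def yearly_downtime_hours(avail: float, hours_per_year: float = 8760.0) -> float:
    """Downtime per year implied by a given availability."""
    return (1.0 - avail) * hours_per_year


# Hypothetical measurement: 8751.24 hours up and 8.76 hours down in one year.
a = availability(8751.24, 8.76)
print(f"availability = {a:.3%}")
print(f"downtime per year = {yearly_downtime_hours(a):.2f} hours")
```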




#Redundancy_Modes
#Cloud_For_High_Availability
-Route Processor Redundancy (RPR)
#partial_booting (initialization) #save_IOS_only
-Route Processor Redundancy Plus (RPR+)
#booting_init_route_engine #no_L2_L3_functions
#no_down_modules
-Stateful Switchover (SSO)
#full_operating #full_sync

#Mass_Storage
-Many systems today need to store many terabytes of data

-Don’t want to use single, large disk
* too expensive
* failures could be catastrophic

-Would prefer to use many smaller disks

#RAID
-Redundant Array of Independent Disks

-Basic idea is to connect multiple disks together to provide

* large storage capacity
* faster access to reading data
* redundant data

-Many different levels of RAID systems
* differing levels of redundancy, error checking, capacity, and cost

#Striping
-Take file data and map it to different disks

-Allows for reading data in parallel
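
The mapping described above can be sketched as follows. This is a simplified round-robin layout with in-memory lists standing in for disks; the function names and chunk size are illustrative assumptions, not a real RAID implementation.

```python
from typing import List, Tuple


def locate_block(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """Return (disk index, block offset on that disk) for round-robin striping."""
    if num_disks < 1:
        raise ValueError("at least one disk is required")
    return logical_block % num_disks, logical_block // num_disks


def stripe(data: bytes, num_disks: int, chunk_size: int) -> List[List[bytes]]:
    """Split data into chunks and deal them out to the disks in turn."""
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        disk_index, _ = locate_block(start // chunk_size, num_disks)
        disks[disk_index].append(data[start:start + chunk_size])
    return disks


# Consecutive chunks land on different disks, so they can be read in parallel.
print(stripe(b"ABCDEFGH", num_disks=3, chunk_size=2))
```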

#Mirroring
-Keep two copies of data on two separate disks
-Gives good error recovery
* if some data is lost, get it from the other source
-Expensive
* requires twice as many disks
-Write performance can be slow
* have to write data to two different spots
-Read performance is enhanced
* can read data from file in parallel
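
A minimal sketch of the mirroring trade-off, using two dictionaries as stand-in disks. The class and method names are hypothetical; the point is only that every write goes to both copies and a read can fall back to the surviving copy.

```python
class MirroredVolume:
    """Toy mirror: two in-memory 'disks' mapping block number -> data."""

    def __init__(self) -> None:
        self.disks = [{}, {}]

    def write(self, block: int, data: bytes) -> None:
        # The write cost is doubled: the data goes to both disks.
        for disk in self.disks:
            disk[block] = data

    def read(self, block: int) -> bytes:
        # Use whichever copy is still present.
        for disk in self.disks:
            if block in disk:
                return disk[block]
        raise KeyError(f"block {block} is lost on both disks")

    def lose_block(self, disk_index: int, block: int) -> None:
        """Simulate losing one copy of a block."""
        self.disks[disk_index].pop(block, None)


volume = MirroredVolume()
volume.write(0, b"payroll")
volume.lose_block(0, 0)   # the first disk loses the block
print(volume.read(0))     # the copy on the second disk is returned
```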



#RAID_0

Disk Striping with No Redundancy
* High Performance; Low Availability
* Data Striped on Multiple Disks
* Multi-threaded Access
#RAID_Level-1
- A complete file is stored on a single disk
- A second disk contains an exact copy of the file
- Provides complete redundancy of data
-Read performance can be improved
* file data can be read in parallel
- Write performance suffers
* must write the data out twice
- Most expensive RAID implementation
* requires twice as much storage space

#RAID_1/0
Striped Mirrors

- Highest Performance; Most Expensive Availability
- Multi-threaded Access

#RAID_5
Disk striping with rotating (distributed) parity

- High Read Performance, expensive Write performance; Cheap Availability
- Tuneable Stripe granularity
- Optimized for multi-thread access

#RAID_5 Striping

- Chunk size is tuned such that typical IO aligns on single disk.
- Parity rotates amongst disks to avoid write bottleneck
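
The parity idea can be sketched with XOR: the parity chunk is the XOR of the data chunks in a stripe, so any single lost chunk can be rebuilt from the rest. The rotation order in `parity_disk` is one possible convention chosen for illustration; real controllers differ.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe_number: int, num_disks: int) -> int:
    """Disk holding parity for a stripe; it moves by one disk per stripe
    so that no single disk absorbs every parity write (assumed order)."""
    return (num_disks - 1 - stripe_number) % num_disks


# One stripe with three data chunks (made-up bytes) and one parity chunk.
d0, d1, d2 = b"\x01\x02", b"\x0f\x10", b"\xaa\x55"
parity = xor_blocks([d0, d1, d2])

# If the disk holding d1 fails, XOR the survivors with the parity chunk.
rebuilt = xor_blocks([d0, d2, parity])
print(rebuilt == d1)
print([parity_disk(s, 4) for s in range(5)])
```

The write penalty mentioned above comes from this scheme: updating one data chunk also means recomputing and rewriting the parity chunk.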

#Server_Clustering
Server clustering allows each server to act on its own instead of requiring one server to wait on another, while all servers use one common redundant mass storage device.
In a basic server cluster, two servers share one RAID array. If one server goes down, the second takes over while still handling its own workload. In this scenario the failover time is greatly reduced because each server already has access to the information.

-Nodes monitor the health of other nodes

-If a node fails, health monitoring causes 'failover' of the resource

-Another node starts the application and reads the last saved information from the shared storage
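
The failover behaviour of a server cluster can be sketched as a heartbeat check. Everything here is a simplified assumption: the class, the timeout value, and the use of a dictionary as "shared storage" are illustrative, and time is passed in explicitly so the example is deterministic.

```python
from typing import Dict, List, Tuple


class Cluster:
    """Toy cluster with heartbeat-based failover over shared storage."""

    def __init__(self, nodes: List[str], timeout: float) -> None:
        self.timeout = timeout
        self.last_heartbeat: Dict[str, float] = {n: 0.0 for n in nodes}
        self.owner: Dict[str, str] = {}           # resource -> node running it
        self.shared_storage: Dict[str, str] = {}  # resource -> last saved state

    def heartbeat(self, node: str, now: float) -> None:
        self.last_heartbeat[node] = now

    def is_alive(self, node: str, now: float) -> bool:
        return now - self.last_heartbeat[node] <= self.timeout

    def check(self, now: float) -> Dict[str, Tuple[str, str]]:
        """Move resources off dead nodes; return what moved and the state
        the new owner reads back from shared storage."""
        moved: Dict[str, Tuple[str, str]] = {}
        survivors = [n for n in self.last_heartbeat if self.is_alive(n, now)]
        for resource, node in list(self.owner.items()):
            if not self.is_alive(node, now) and survivors:
                new_node = survivors[0]
                self.owner[resource] = new_node
                moved[resource] = (new_node, self.shared_storage.get(resource, ""))
        return moved


cluster = Cluster(["node_a", "node_b"], timeout=5.0)
cluster.owner["database"] = "node_a"
cluster.shared_storage["database"] = "checkpoint-42"
cluster.heartbeat("node_a", now=1.0)   # node_a then goes silent
cluster.heartbeat("node_b", now=9.0)
print(cluster.check(now=10.0))
```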

What does #Network_Redundancy mean?
The goal of redundancy is to prevent or recover from the failure of a specific component or system.

Three protocols are used for network redundancy: HSRP, VRRP, and GLBP.

#HSRP
- Hot Standby Router Protocol provides network redundancy and increases network uptime.

- It uses the multicast address 224.0.0.2.

- The protocol uses a virtual MAC address and a virtual IP address that are shared between two or more routers belonging to the same HSRP group.

- HSRP group numbers range from 0 to 255.

- HSRP can use authentication between the routers in a group (same password and same authentication type).

- It is Cisco proprietary (only used on Cisco devices).

- HSRP provides redundancy for the gateway router: when the active router goes down, another router can replace it.



#HSRP_Terms
Active router: The router that is currently forwarding packets for the virtual router.

Standby router: The primary backup router.

Listen router: A router that is neither active nor standby.

How does #HSRP work?
- One router is in Active mode, another router is in Standby mode, and the others are in Listen mode.

- These routers form one group, called the Standby group.

- HSRP makes this group appear as a single virtual router that has its own IP address and MAC address.

- The user uses the IP address of this virtual router as its default gateway.

- The IP address is configured manually as usual, but the MAC address is built by a fixed rule:

for ex.: in HSRP version 1 the MAC is 0000.0c07.acxx
Where xx is the Standby group number in hexadecimal
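
The rule for building the virtual MAC address can be sketched in a few lines. This assumes HSRP version 1, which uses the well-known prefix 0000.0c07.ac followed by the group number as two hexadecimal digits; the function name is illustrative.

```python
def hsrp_v1_virtual_mac(group: int) -> str:
    """Virtual MAC for an HSRP version 1 standby group (0 to 255)."""
    if not 0 <= group <= 255:
        raise ValueError("HSRP version 1 group numbers range from 0 to 255")
    return f"0000.0c07.ac{group:02x}"


for group in (1, 10, 255):
    print(group, hsrp_v1_virtual_mac(group))
```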


How do #routers detect a failure?
- HSRP uses the multicast address 224.0.0.2.

- The Active router sends a Hello message every 3 sec as multicast.

- If the other routers do not receive a Hello message within the hold time (10 sec by default, roughly 3 * Hello time), they treat the Active router as failed.
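
The detection rule amounts to a timer check, sketched below. The constants are the HSRP defaults (3 sec Hello, 10 sec hold time); the function name is hypothetical, and timestamps are passed in as plain numbers to keep the example deterministic.

```python
HELLO_INTERVAL = 3.0   # seconds between Hello messages (HSRP default)
HOLD_TIME = 10.0       # seconds without a Hello before declaring failure


def active_router_failed(last_hello_at: float, now: float,
                         hold_time: float = HOLD_TIME) -> bool:
    """True when no Hello has arrived within the hold time."""
    return now - last_hello_at > hold_time


# Last Hello seen at t=0: two missed Hellos are tolerated, then the timer expires.
for now in (3.0, 6.0, 9.0, 10.5):
    print(now, active_router_failed(last_hello_at=0.0, now=now))
```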



What happens when the #active_router goes down:
- In case of failure, the Standby router takes over and becomes the Active router.

- Now what about the new Standby router?
* The router in Listen mode with the highest priority becomes the Standby router. Priority values range from 0 to 255, and the default is 100.

* If two routers have the same priority, the router with the higher IP address becomes the Standby router.
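
The selection of the new Standby router can be sketched as a comparison on (priority, IP address). The router names and addresses below are made up for illustration.

```python
import ipaddress
from typing import List, Tuple

Router = Tuple[str, int, str]  # (name, priority, interface IP address)


def elect_standby(listen_routers: List[Router]) -> str:
    """Pick the router with the highest priority; break ties on the
    numerically higher IP address."""
    if not listen_routers:
        raise ValueError("no routers in Listen mode")
    best = max(listen_routers,
               key=lambda r: (r[1], ipaddress.IPv4Address(r[2])))
    return best[0]


# R2 and R3 share the default priority of 100, so the higher IP address wins.
candidates = [("R2", 100, "10.0.0.2"), ("R3", 100, "10.0.0.3"), ("R4", 90, "10.0.0.4")]
print(elect_standby(candidates))
```

Comparing parsed addresses rather than strings matters: as text, "10.0.0.9" sorts above "10.0.0.10".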




#VRRP
- Virtual Router Redundancy Protocol is a protocol that achieves network redundancy.

- It uses the multicast address 224.0.0.18.

- It works much like HSRP. The main difference is that HSRP is Cisco proprietary, while VRRP is an open standard.

#GLBP
- Gateway Load Balancing Protocol

- GLBP provides load balancing over multiple routers using a single virtual IP address and multiple virtual MAC addresses.

- Each host is configured with the same virtual IP address, and all routers in the virtual router group participate in forwarding packets.

- All clients have the same virtual IP address as their gateway.

- One router is elected as the AVG (active virtual gateway).

- The AVG answers all ARP requests; the MAC address it returns depends on the load-balancing algorithm.

- The routers that forward traffic for a virtual MAC address are called AVFs (active virtual forwarders).

- Other routers in the group act as backup routers for the AVFs.

- If the AVG fails, the router with the next-highest priority in the group takes over as AVG.

- If an AVF fails, the AVG assigns the AVF role to a backup router.

GLBP can achieve load balancing using one of three methods:

- round-robin

- weighted

- host-dependent








How does it #work?
After the GLBP group is established, PC_Client 1 and PC_Client 2 each send an ARP request to the AVG router.













#High_Availability #Low_cost
The AVG, in this case Ciscozine_1 because it has the higher priority (150), answers the ARP requests from the PC clients using the round-robin method:

PC_Client 1 receives the virtual MAC 0007.b400.0101
PC_Client 2 receives the virtual MAC 0007.b400.0102

PC_Client 1 and PC_Client 2 have each resolved a different MAC address for the default gateway, so they send their routed traffic to separate routers, although they both have the same default gateway address configured. Each GLBP router is an AVF for the virtual MAC address to which it has been assigned.
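
The ARP exchange above can be sketched as a small simulation of the AVG. The two virtual MAC addresses and client names come from the example; the class, the host-dependent hash (CRC32 of the client MAC), and the client MAC value are assumptions for illustration, and the weighted method is not shown.

```python
import itertools
import zlib
from typing import List


class ActiveVirtualGateway:
    """Toy AVG that answers ARP requests with one of the virtual MACs."""

    def __init__(self, virtual_macs: List[str]) -> None:
        self.virtual_macs = list(virtual_macs)
        self._cycle = itertools.cycle(self.virtual_macs)

    def reply_round_robin(self) -> str:
        # Each new ARP request gets the next virtual MAC in turn.
        return next(self._cycle)

    def reply_host_dependent(self, client_mac: str) -> str:
        # The same client always maps to the same virtual MAC (assumed hash).
        index = zlib.crc32(client_mac.encode()) % len(self.virtual_macs)
        return self.virtual_macs[index]


avg = ActiveVirtualGateway(["0007.b400.0101", "0007.b400.0102"])
for client in ("PC_Client 1", "PC_Client 2"):
    print(client, "->", avg.reply_round_robin())
```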
#High_availability_over_network
By Cloudians:
1- Amr Dewidar
2- Ahmed Abdellatif
3- Ahmed Alaa
4- Adel Mohammed
5- Reham Hussam
6- Mohamed Adel Zidan
7- Mohamed Salah ElDin
8- Mahmoud Abdrahman
9- Wael Murad
10- Galal Abdel Fatah
11- Mohamed Abdallah
12- Kirollos Akrm