
Facebook Distributed System Case Study

For the Distributed System Inside Facebook Datacenters

System design enhancement:

Hadoop and Hive Integration

Hadoop Distributed File System (HDFS)

Designed for low-cost hardware with high fault tolerance.

Stores large data sets reliably and streams them at high bandwidth.

Dynamically scales across thousands of servers, ensuring economical growth.

Map-Reduce (M-R) Process:

System design enhancement in distributed systems aims to optimize the system's performance, reliability, and usability while accommodating changing needs and environments.

What are the features?

Hive: Data warehouse infrastructure built on Hadoop.

Enables easy data summarization, reporting, ad hoc querying, and analysis.

Provides HiveQL, a SQL-based query language for querying data stored in HDFS.
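A HiveQL summarization is just SQL over HDFS data, which Hive compiles into Map-Reduce jobs. As an illustrative sketch (the table and column names below are invented, not from the case study), the result of a hypothetical GROUP BY query is equivalent to this grouped count in Python:

```python
from collections import Counter

# A hypothetical HiveQL summarization query (table and columns invented):
#   SELECT action, COUNT(*) FROM page_events GROUP BY action;
rows = [{"action": "like"}, {"action": "share"}, {"action": "like"}]

# Hive compiles such a query into a Map-Reduce job over data in HDFS;
# the final result is equivalent to this grouped count:
summary = Counter(row["action"] for row in rows)
print(dict(summary))  # {'like': 2, 'share': 1}
```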

Significantly reduces time for job completion, making complex tasks manageable in minutes.

Introduction

Availability

Openness

Reliability

Communication

Handles terabytes and petabytes of data daily.

Two Major Phases: Map and Reduce.

Steps:

  • Mapper processes data blocks and generates key-value pairs.
  • Job tracker schedules Map-Reduce tasks to task trackers.
  • Reducer merges and sorts key-value pairs to generate final results.
  • Results stored in HDFS and replicated for easy access by clients.
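The Map and Reduce phases above can be sketched in miniature with the classic word-count example. This is an illustrative sketch of the model, not Facebook's code; the shuffle step stands in for what the job tracker and task trackers coordinate across the cluster:

```python
from collections import defaultdict

def mapper(block):
    # Mapper processes a data block and emits (word, 1) key-value pairs.
    for word in block.split():
        yield (word, 1)

def shuffle(pairs):
    # Merge and sort key-value pairs by key, between Map and Reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reducer(key, values):
    # Reducer combines all values for one key into the final result.
    return key, sum(values)

blocks = ["big data big", "data big"]            # data blocks from HDFS
pairs = [p for b in blocks for p in mapper(b)]   # Map phase
result = dict(reducer(k, v) for k, v in shuffle(pairs).items())  # Reduce phase
print(result)  # {'big': 3, 'data': 2}
```

In the real system the result would then be written back to HDFS and replicated for clients, as the last step above describes.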

Emphasizes high throughput over low latency, suitable for batch processing.

Utilizes a write-once-read-many access model for files.

Namespace exposed, user data stored in blocks across Data Nodes.

Single Name Node simplifies system architecture, managing all metadata.
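The split between one metadata-holding Name Node and many block-holding Data Nodes can be sketched as follows. This is a simplified illustration under assumed names (real HDFS also tracks racks, leases, and heartbeats):

```python
class NameNode:
    """Holds only metadata: which blocks make up a file and where each
    block's replicas live. User data never passes through it."""
    def __init__(self, replication=3):
        self.replication = replication
        self.file_blocks = {}   # filename -> [block ids]
        self.block_locs = {}    # block id  -> [data node ids]

    def add_file(self, name, blocks, datanodes):
        self.file_blocks[name] = blocks
        for i, block in enumerate(blocks):
            # Place each replica on a different Data Node (round-robin here).
            locs = [datanodes[(i + r) % len(datanodes)]
                    for r in range(self.replication)]
            self.block_locs[block] = locs

    def locate(self, name):
        # Clients ask the Name Node where blocks live, then read the
        # actual data directly from the Data Nodes.
        return {b: self.block_locs[b] for b in self.file_blocks[name]}

nn = NameNode()
nn.add_file("log.txt", ["blk_1", "blk_2"], ["dn1", "dn2", "dn3", "dn4"])
print(nn.locate("log.txt"))
```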

Memcached server

A Memcached server is a distributed memory-caching system that stores data in memory as key-value pairs.
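The usual read path against such a cache is "cache-aside": check Memcached first, fall back to the database on a miss, then populate the cache. A minimal sketch, where plain dictionaries stand in for the Memcached cluster and federated MySQL:

```python
cache = {}                      # stands in for the Memcached cluster
database = {"user:1": "Asma"}   # stands in for federated MySQL

def get(key):
    # 1. Try the in-memory cache first (fast path).
    if key in cache:
        return cache[key], "hit"
    # 2. On a miss, read the authoritative copy from the database...
    value = database[key]
    # 3. ...and populate the cache for subsequent readers.
    cache[key] = value
    return value, "miss"

print(get("user:1"))  # ('Asma', 'miss')  first read fills the cache
print(get("user:1"))  # ('Asma', 'hit')   second read is served from memory
```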

Facebook's Reliance on Hadoop Platform

New App Requirements: Facebook adopts applications needing high write throughput, cost-effective storage, and low latency.

Architecture:

"What events are occurring on the server?"

Hadoop Integration

Apache Hive

4-Tier Architecture:

Hadoop is chosen for its ability to handle unstructured and structured data.

Ideal for data discovery and processing large volumes of data.

Two Main Components:

Map-Reduce (M-R): Computation

Hadoop Distributed File System (HDFS): Storage

Challenges that were faced:

How were they addressed?

Hadoop Architecture:

Master Node:

Job Tracker

Task Tracker

Name Node (NN)

Data Node (DN)

Worker Nodes

Facebook is recognized as the largest online social network system, and it is designed as a distributed system. The datacenters behind the Facebook network are robust, scalable, reliable, and secure, allowing Facebook to be highly accessible from anywhere with high availability. The system is responsible for processing large quantities of data, known as "Big Data," and it interconnects users through friendship relations, allowing for synchronous and asynchronous communications of user-generated content.

Tier 3:

Tier 4:

Tier 2:

Scalability and Management: Scaling MySQL clusters quickly presents challenges in load balancing and management overhead.

Tier 1:

Author name:

->MySQL and Memcached Usage

->MySQL Performance Issues

Web Servers

Scribe-Hadoop Clusters (Log Aggregation)

Hive-Hadoop Clusters (Data Warehousing)

Database (Federated MySQL)

Asma Salem

University of Jordan

Components of Distributed Systems

Alternatives:

Publication date:

Stores core user data and application data.

Facebook's centralized datacenters in the US (California: Santa Clara, Palo Alto, Ashburn) pose bandwidth and latency challenges for users outside the US.

Facebook utilizes Content Delivery Networks (CDNs) geographically distributed across regions like Russia, Egypt, Sweden, and the UK.

Process client requests.

Aggregate logs.


Receive uncompressed logs from web servers.

Aggregate and store logs for analysis.

August 2017

Scalability: Ability to handle increasing workload efficiently.

Reliability: Ensuring system availability and fault tolerance.

Components:

Systems Design

Big Data Processing and Analysis

Huge Storage

Production Cluster:

High priority jobs with strict deadlines.

Stores and analyzes production data.

Ad-hoc Cluster:

Low priority batch jobs.

User-initiated ad-hoc analysis on historical data.

Affiliation:

INTERNATIONAL JOURNAL OF TECHNOLOGY ENHANCEMENTS AND EMERGING ENGINEERING RESEARCH, VOL 2, ISSUE 7, ISSN 2347-4289

  • Alternative solutions like TCP proxies and regional OSN caching servers are proposed to enhance network performance and reduce latency.

  • TCP proxies allow users to be served by their regional server or through connections with original servers and CDNs.

  • Regional OSN cache servers can serve requests entirely or with minimal interaction with original servers, improving system performance and scalability.
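The proposed routing logic can be sketched: serve a request from the user's regional cache server when possible, otherwise fall through to the original datacenters and fill the cache on the way back. Region names, keys, and function names below are illustrative assumptions, not from the case study:

```python
regional_caches = {
    "UK": {"photo:42": b"..."},   # content already cached regionally
    "Egypt": {},
}

def serve(region, key):
    # Prefer the regional OSN cache server: lowest latency for the user.
    cache = regional_caches.setdefault(region, {})
    if key in cache:
        return "regional-cache"
    # Otherwise the proxy forwards the request to the original
    # datacenters, and populates the regional cache for next time.
    cache[key] = b"..."
    return "origin-datacenter"

print(serve("UK", "photo:42"))     # regional-cache
print(serve("Egypt", "photo:42"))  # origin-datacenter
```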

Cloud Computing:

https://www.researchgate.net/publication/318946789

Cloud computing in distributed systems involves delivering computing resources and services over a network, enabling scalable and flexible access to resources without local hardware.

Scalability

Security

key points:

Hadoop RPC in Facebook's Distribution System:

Social Network:

>Virtual Org - Social networks act like online communities for sharing resources.

>Social Cloud - used to share scalable data.

>Service level agreements (SLAs)

>Cloud Hosts Social Networks - Cloud platforms can be the foundation of social networks.

>Facebook Apps - use app data to offer richer features.

>Isolated App Communication (FBJS)

MORE FEATURES:

  • Challenges with Physical Servers: Facebook faces scalability limitations and risks with redundant cluster servers.

  • Transition to Cloud Computing: Cloud computing offers scalable solutions via on-demand resources and virtualization.

  • Benefits of Cloud Computing: Provides accessibility, scalability, and secure data navigation.
  • Impact on Data Centers: Cloud computing shifts computational power, with data centers managing remote infrastructure.

  • Facebook's Approach: Exploring cloud computing for scalability and integration amid growing requests.

  • Scalability Considerations: Crucial for meeting enterprise needs, dynamic scaling adjusts resources based on real-time indicators.
  • Hadoop servers use RPC for communication within Facebook's distribution system, optimizing real-time workload management.

  • RPC improves TCP connections, allowing efficient communication between servers and reducing timeouts.

  • RPC clients send pings to servers during timeouts, enabling continued communication if servers are responsive.

  • Hadoop's flexible RPC timeouts prevent delays, ensuring timely responses for real-time applications like Facebook Messaging.

  • The system employs a fail-fast approach, trying alternate servers if needed, enhancing reliability and responsiveness. Additionally, Facebook Messaging integrates various features efficiently, leveraging Hadoop's communication capabilities.
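The ping-during-timeout and fail-fast behaviour described above can be sketched as follows. This is a simplified model under assumed server names; real Hadoop RPC runs over persistent TCP connections with configurable timeouts:

```python
def rpc_call(servers, request, is_responsive):
    """Try each server in turn; `is_responsive` models the ping an RPC
    client sends when a call is taking longer than its timeout."""
    for server in servers:
        # Ping the server: if it answers, keep the call alive instead
        # of aborting on the timeout.
        if is_responsive(server):
            return f"{server}:{request}"
        # Fail fast: an unresponsive server is skipped immediately and
        # the call is retried against an alternate server.
    raise ConnectionError("no responsive server")

servers = ["hadoop-a", "hadoop-b", "hadoop-c"]   # illustrative names
alive = lambda s: s != "hadoop-a"                # pretend hadoop-a is down
print(rpc_call(servers, "get_messages", alive))  # hadoop-b:get_messages
```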