Google File System

Juan Palacios

on 6 February 2013

Transcript of Google File System

Core Concepts

Hardware Failure is the Norm

The system is built from many inexpensive commodity components that often fail.

Files are Huge (TB/PB/EB)

We expect a few million files, each typically 100 MB or larger in size. Multi-GB files are the common case and should be managed efficiently. Small files must be supported, but we need not optimize for them.

Write Once Read Many

The workloads primarily consist of two kinds of reads: large streaming reads and small random reads. The workloads also have many large, sequential writes that append data to files. Once written, files are seldom modified again. Small writes at arbitrary positions in a file are supported but do not have to be efficient.

The system must efficiently implement well-defined semantics for multiple clients that concurrently append to the same file. Atomicity with minimal synchronization overhead is essential.

Provide a familiar file system interface:

  • Files are organized hierarchically in directories and identified by pathnames
  • Snapshot operation to create a copy of a file or a directory tree at low cost
  • Record append operation to allow multiple clients to append data to the same file concurrently while guaranteeing the atomicity of each individual client's append

Architecture

  • Expected to run on commodity Linux machines running a user-level server process
  • Single master and multiple chunkserver architecture
  • Master maintains all filesystem metadata and coordinates system-wide activities
  • Chunkservers store chunks of data on local disks as Linux files
  • Files are divided into fixed-size chunks
  • Data is replicated by a configured factor (x3 by default)
  • Clients interact with the master for metadata operations, but all data-bearing communications go directly to the chunkservers
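
The split between metadata and data traffic can be illustrated with a small in-memory sketch. This is not GFS code: the class and function names (`Master`, `Chunkserver`, `client_write`, `client_read`) are hypothetical, the round-robin replica placement is a simplification chosen for brevity, and the 64 MB chunk size comes from the published GFS paper rather than from this transcript. The sketch only shows the general shape: the client translates a byte offset into a chunk index, asks the master where that chunk lives, and then reads the bytes directly from a chunkserver.

```python
from dataclasses import dataclass, field

# 64 MB is the chunk size given in the GFS paper; the transcript does not state it.
CHUNK_SIZE = 64 * 1024 * 1024
# Default replication factor mentioned in the transcript.
REPLICATION = 3


@dataclass
class Chunkserver:
    """Holds chunk bytes keyed by chunk handle (a stand-in for local Linux files)."""
    name: str
    chunks: dict = field(default_factory=dict)

    def read(self, handle, offset, length):
        return self.chunks[handle][offset:offset + length]


@dataclass
class Master:
    """Holds metadata only: pathname -> chunk handles, handle -> replica locations."""
    servers: list
    files: dict = field(default_factory=dict)
    locations: dict = field(default_factory=dict)
    next_handle: int = 0

    def create_chunk(self, path):
        handle = self.next_handle
        self.next_handle += 1
        self.files.setdefault(path, []).append(handle)
        # Hypothetical round-robin placement; assumes at least REPLICATION servers.
        replicas = [self.servers[(handle + i) % len(self.servers)]
                    for i in range(REPLICATION)]
        self.locations[handle] = replicas
        return handle, replicas

    def lookup(self, path, chunk_index):
        handle = self.files[path][chunk_index]
        return handle, self.locations[handle]


def client_write(master, path, data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and store each chunk on every replica."""
    for start in range(0, len(data), chunk_size):
        handle, replicas = master.create_chunk(path)   # metadata: ask the master
        for server in replicas:                        # data: go to chunkservers
            server.chunks[handle] = data[start:start + chunk_size]


def client_read(master, path, offset, length, chunk_size=CHUNK_SIZE):
    """Read a byte range that lies within a single chunk."""
    chunk_index, chunk_offset = divmod(offset, chunk_size)
    handle, replicas = master.lookup(path, chunk_index)  # metadata only
    return replicas[0].read(handle, chunk_offset, length)  # data from a replica
```

For example, with four chunkservers and a toy `chunk_size` of 4, writing `b"abcdefghij"` to `/logs/a` produces three chunks, and `client_read(master, "/logs/a", 5, 2, chunk_size=4)` resolves to chunk index 1 at offset 1. Note that the master never touches file contents; it only hands out handles and locations, which is what keeps a single master from becoming a data bottleneck.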