Atzmon Hen-tov

on 1 February 2016

Transcript of Cassandra

SaaS challenges
Scalability - Support 1M+, with easy scaling in both directions
High availability - During normal operation and during upgrades
Disaster recovery
Low operational effort
Better price/TB
Data storage and access
The Cassandra Data Model
Introducing big data and Cassandra
Understand big data
Describe Cassandra
Understand common use cases
Introducing Cassandra tools
What is Cassandra?
Why not relational data?
Relational model provides
Normalized table schema
Cross table joins
ACID compliance
But at a very high cost
Big data table joins – billions of rows, or more – require massive overhead
Maintaining indices on ever-growing mutable tables hits a glass ceiling
Sharding tables across systems is complex and fragile
Modern applications have different priorities
The need for speed and availability trumps "always on" consistency
Commodity server racks trump massive high-end systems
The real-world need for transactional guarantees is limited

Understand how requests are coordinated
Understand replication
Understand and tune consistency
Introduce anti-entropy operations
Understand how nodes communicate
Understand the system keyspace
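
To make "tune consistency" concrete, here is a minimal sketch of the usual rule of thumb: a read is guaranteed to overlap the latest acknowledged write when the number of replicas read (R) plus the number of replicas written (W) exceeds the replication factor (N). The function names are hypothetical and only three consistency levels are modelled; this is an illustration, not Cassandra's implementation.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Replica acknowledgements the coordinator waits for (illustrative subset)."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when R + W > N, so every read set overlaps every write set."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        overlap = reads_see_latest_write(read_level, write_level, 3)
        print(f"read={read_level} write={write_level} RF=3 overlap={overlap}")
```

With a replication factor of 3, reading and writing at QUORUM gives overlapping replica sets, while ONE/ONE trades that guarantee for lower latency and higher availability.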

Read path
Write path
CCM - Cassandra Cluster Manager
nodetool - Cluster operations
cqlsh - CQL shell
cassandra-cli - Deprecated shell utility
cassandra-stress - Load-testing utility
Additional operation tools:
sstablekeys: output the partition keys in an SSTable
sstableloader: bulk load external data into a cluster
sstablescrub: offline scrub that rebuilds SSTables and discards corrupted data
sstable2json, json2sstable: export/import tools for data inspection
sstableupgrade: upgrade SSTables to current Cassandra version
sstablemetadata: display metadata information about an SSTable
sstablerepairedset: mark an SSTable as repaired
sstablesplit: split a large SSTable into smaller files
token-generator: generate tokens to manually assign to Cassandra nodes
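
As a rough illustration of what manually assigned tokens look like, the sketch below spaces tokens evenly around the ring. It assumes the signed 64-bit token range used by the Murmur3 partitioner (-2^63 to 2^63 - 1); the function name is hypothetical and the output format of the real token-generator tool is not reproduced here.

```python
def evenly_spaced_tokens(node_count: int) -> list[int]:
    """Return one initial token per node, evenly spaced over a 64-bit ring.

    Assumption: tokens span -2**63 .. 2**63 - 1, as with the Murmur3 partitioner.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 64
    min_token = -(2 ** 63)
    return [min_token + (i * ring_size) // node_count for i in range(node_count)]


if __name__ == "__main__":
    for index, token in enumerate(evenly_spaced_tokens(4)):
        print(f"node {index}: initial_token = {token}")
```

Each node would then own the range of tokens between its predecessor's token and its own, so even spacing gives each node an equal share of the ring.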

What strategies help manage big data?
Distribute data across nodes
Relax consistency requirements
Relax schema requirements
Relax SQL support
Optimize data to suit actual needs
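
The first strategy, distributing data across nodes, can be sketched as a toy hash ring: each partition key is hashed to a token, and the replicas are the next few nodes found walking the ring clockwise. This is a simplified illustration under stated assumptions: the class name `HashRing` is hypothetical, MD5 stands in for the real partitioner hash, and each node holds a single token.

```python
import bisect
import hashlib


class HashRing:
    """Toy token ring: one token per node, replicas chosen clockwise."""

    def __init__(self, nodes, replication_factor=3):
        if not nodes:
            raise ValueError("at least one node is required")
        self.replication_factor = min(replication_factor, len(nodes))
        # Sort nodes by token so the ring can be searched with bisect.
        self._ring = sorted((self._token(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def _token(key: str) -> int:
        # Stand-in hash (assumption): first 8 bytes of MD5 as an unsigned integer.
        return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

    def replicas(self, partition_key: str):
        """Nodes that store the key: the owner of its token, then the next ones."""
        # First node whose token is >= the key's token, wrapping past the end.
        start = bisect.bisect_left(self._tokens, self._token(partition_key)) % len(self._ring)
        return [self._ring[(start + i) % len(self._ring)][1]
                for i in range(self.replication_factor)]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
    for key in ("user:1", "user:2", "user:3"):
        print(key, "->", ring.replicas(key))
```

Because placement depends only on the key's hash, any node can work out where a row lives without consulting a central directory, which is what makes adding or removing nodes comparatively cheap.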
Immutable data persistence models
NoSQL for Time Series
Distribution models
Polyglot persistence
Read/Write path
Thank you