Transcript of Cassandra
Scalability - Support 1M + easy scale (both ways)
High availability - During run, during upgrade
Low operational effort
Data storage and access
The Cassandra Data Model
Introducing big data and Cassandra
Understand big data
Understand common use cases
Introducing Cassandra tools
What is Cassandra?
Why not relational data?
Relational model provides
Normalized table schema
Cross table joins
But at a very high cost
Big data table joins – billions of rows, or more – require massive overhead
Maintaining indices on ever-growing mutable tables hits a glass ceiling
Sharding tables across systems is complex and fragile
Modern applications have different priorities
The need for speed and availability trumps "always on" consistency
Commodity server racks trump massive high-end systems
Real world need for transactional guarantees is limited
Understand how requests are coordinated
Understand and tune consistency
Introduce anti-entropy operations
Understand how nodes communicate
Understand the system keyspace
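To make the idea of tuning consistency concrete, here is a minimal Python sketch of the usual quorum arithmetic: a quorum is a majority of the replicas, and reads are guaranteed to see the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor. The function names are hypothetical and chosen only for this illustration; they are not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Return the number of replicas that form a majority."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print(q)                                  # 2
    print(reads_see_latest_write(q, q, rf))   # True: QUORUM reads + QUORUM writes
    print(reads_see_latest_write(1, 1, rf))   # False: ONE reads + ONE writes
```

With a replication factor of 3, reading and writing at quorum tolerates one unavailable replica while keeping the read and write sets overlapping; dropping both to a single replica trades that guarantee for lower latency.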
CCM - Cluster manager
Nodetool - Cluster operations
CQLSH - CQL shell
CassandraCLI - Deprecated shell utility
Additional operation tools:
sstablekeys: output the partition keys in an SSTable
sstableloader: bulk load external data into a cluster
sstablescrub: offline counterpart of nodetool scrub, used to rebuild SSTables and drop corrupted data
sstable2json, json2sstable: export/import tools for data inspection
sstableupgrade: upgrade SSTables to current Cassandra version
sstablemetadata: display metadata information about an SSTable
sstablerepairedset: mark an SSTable as repaired
sstablesplit: split a large SSTable into smaller files
token-generator: generate tokens to manually assign to Cassandra nodes
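As an illustration of what manually assigned tokens look like, the sketch below computes evenly spaced tokens for a cluster. It assumes the Murmur3Partitioner token range of -2^63 to 2^63 - 1 and one token per node (no virtual nodes). The function name is hypothetical, and this is not the token-generator tool's own code.

```python
from typing import List

# Assumption: Murmur3Partitioner, whose tokens span -2**63 .. 2**63 - 1.
MIN_TOKEN = -2**63
RING_SIZE = 2**64


def evenly_spaced_tokens(node_count: int) -> List[int]:
    """Return one token per node, spread evenly around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [MIN_TOKEN + i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    for token in evenly_spaced_tokens(4):
        print(token)
    # -9223372036854775808
    # -4611686018427387904
    # 0
    # 4611686018427387904
```

Each value would then be assigned to one node, so that every node owns an equal share of the token range.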
What strategies help manage big data?
Distribute data across nodes
Relax consistency requirements
Relax schema requirements
Relax SQL support
Optimize data to suit actual needs
Immutable data persistence models
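The first strategy above, distributing data across nodes, can be sketched as a token ring: each partition key is hashed to a token, the first node whose token is greater than or equal to it owns the key, and further replicas are the next nodes clockwise. This is a simplified model, not Cassandra's implementation. The class and method names are hypothetical, MD5 stands in for the partitioner's hash purely to keep the example dependency-free, and each node holds a single token.

```python
import bisect
import hashlib
from typing import Dict, List


class TokenRing:
    """Toy token ring: one token per node, replicas placed clockwise."""

    def __init__(self, node_tokens: Dict[str, int]):
        # Sort nodes by token so the ring can be searched with bisect.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def token_for(partition_key: str) -> int:
        # Assumption: MD5 truncated to a signed 64-bit value stands in
        # for the real partitioner hash.
        digest = hashlib.md5(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big", signed=True)

    def replicas(self, partition_key: str, replication_factor: int) -> List[str]:
        # First node whose token is >= the key's token; wrap around at the end.
        start = bisect.bisect_left(self._tokens, self.token_for(partition_key))
        count = min(replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


if __name__ == "__main__":
    ring = TokenRing({"node1": -2**63, "node2": -2**62, "node3": 0, "node4": 2**62})
    print(ring.replicas("user:42", 3))
```

Because placement depends only on the key's hash and the node tokens, any node can work out where a row lives without consulting a central directory.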
NoSQL for Time Series