Big Data Overview
What is big data?
"big data is when the size of the data becomes part of the problem." (Roger Magoulas)
High-availability NameNode
Distributed data storage
and computing on
The Hadoop Ecosystem
Text files have no schema
MapReduce is hard to learn
Need a place for ad-hoc analysis
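To make the "MapReduce is hard to learn" point concrete, here is a minimal in-process sketch of the map, shuffle, and reduce steps for a word count. It is plain Python, not Hadoop code; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and a real job would also need input/output formats, job configuration, and a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big problem"]))
```

Even this toy version needs three separate functions for what a SQL user would write as a single GROUP BY query, which is roughly the gap the following slides address.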
Run SQL on Hadoop
(Data on HDFS)
Partitioning: each partition is a directory prefix (key=value)
Support for Indexes
Run MSCK REPAIR TABLE mytable when a partition directory is added outside Hive
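The partitioning bullets can be illustrated with a small sketch of the key=value directory layout and of the gap that MSCK REPAIR TABLE closes: directories that exist on storage but are not yet registered in the metastore. This is a plain-Python model under stated assumptions, not Hive's implementation; the helper names, the /warehouse/mytable root, and the year/month partition columns are all hypothetical.

```python
def partition_path(table_root, spec):
    """Build a Hive-style partition directory: root/key1=val1/key2=val2."""
    parts = [f"{key}={value}" for key, value in spec.items()]
    return "/".join([table_root.rstrip("/")] + parts)


def parse_partition(table_root, path):
    """Recover the partition spec from a directory path under table_root."""
    relative = path[len(table_root.rstrip("/")) + 1:]
    return dict(segment.split("=", 1) for segment in relative.split("/"))


def missing_partitions(table_root, dirs_on_storage, registered):
    """Return specs that exist as directories but are unknown to the metastore.

    This models the set of partitions a repair would need to register.
    """
    found = [parse_partition(table_root, d) for d in dirs_on_storage]
    return [spec for spec in found if spec not in registered]


if __name__ == "__main__":
    root = "/warehouse/mytable"  # hypothetical table location
    january = partition_path(root, {"year": "2014", "month": "01"})
    february = partition_path(root, {"year": "2014", "month": "02"})
    registered = [{"year": "2014", "month": "01"}]
    print(missing_partitions(root, [january, february], registered))
```

Because the partition values live in the path, a query that filters on year or month can skip whole directories instead of scanning every file.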
Write once, read often
Optimized file formats
Managed vs External tables