Statistics powered by Hadoop
We are talking about...
18 000 000 Data points
and 4 Gigabytes
... per day!
We calculate about...
12 values for each of our 2M classifieds
15 values for each of our 40K dealers
... so we have over 24.5M results
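As a quick check of the arithmetic behind that total, here is a small sketch that reads "2M" as 2,000,000 and "40K" as 40,000 (that reading of the abbreviations is an assumption; the variable names are only illustrative):

```python
# Figures taken from the slides; "M" and "K" are read as 10**6 and 10**3.
classifieds = 2_000_000
dealers = 40_000
values_per_classified = 12
values_per_dealer = 15

classified_results = classifieds * values_per_classified  # 24,000,000
dealer_results = dealers * values_per_dealer              # 600,000
total_results = classified_results + dealer_results

print(f"{total_results:,}")  # 24,600,000
```

The classifieds account for almost all of the results; the dealer values add only 0.6M on top of the 24M.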
Let's do it...
01:00:00 2012-05-13 pv 212769979 mobil.autoscout24.fr d3fc0700-caf7-4a2c-95a1-11f8b1407e7c
The last field is an anonymous user identifier
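To make the shape of such a log line concrete, here is a minimal parsing sketch. The slides only label the last field (the anonymous user identifier); the other field names (`event_type`, `object_id`, `host`), the reading of the first two fields as time and date, and the whitespace separator are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from uuid import UUID


@dataclass(frozen=True)
class TrackingEvent:
    timestamp: datetime
    event_type: str   # assumed name, e.g. "pv"
    object_id: int    # assumed name for the numeric field
    host: str         # assumed name for the site host
    user_id: UUID     # the anonymous user identifier


def parse_event(line: str) -> TrackingEvent:
    """Split one whitespace-separated log line into typed fields."""
    time_part, date_part, event_type, object_id, host, user_id = line.split()
    return TrackingEvent(
        timestamp=datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S"),
        event_type=event_type,
        object_id=int(object_id),
        host=host,
        user_id=UUID(user_id),
    )


sample = (
    "01:00:00 2012-05-13 pv 212769979 mobil.autoscout24.fr "
    "d3fc0700-caf7-4a2c-95a1-11f8b1407e7c"
)
event = parse_event(sample)
print(event.timestamp, event.event_type, event.host)
```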
The data is stored in Hive, which gives it structure, and is partitioned by date and hour for fast access
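Hive lays out partitioned tables as nested `column=value` directories, so a query restricted to one date and hour only has to read one directory. The sketch below shows how an event timestamp could map to such a directory; the partition column names `dt` and `hour` are hypothetical, since the slides do not name them.

```python
from datetime import datetime


def partition_path(ts: datetime) -> str:
    """Build a Hive-style partition directory for a timestamp.

    The column names "dt" and "hour" are assumed for illustration.
    """
    return f"dt={ts:%Y-%m-%d}/hour={ts:%H}"


print(partition_path(datetime(2012, 5, 13, 1, 0, 0)))  # dt=2012-05-13/hour=01
```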
Aggregate per classified and save the result in Hive
Create an Oracle table and export the classified results to Oracle
Aggregate per customer and save the result in Hive
Again, create an Oracle table and export
Tell the rest of the world that the job is done
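The aggregation steps above boil down to grouping events by key and computing a few values per group. The following in-memory sketch shows that idea only; it is not the Hive job from the talk. The two metrics (views and unique visitors), the reading of "pv" as a page view, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict

# (event_type, classified_id, user_id) tuples; made-up sample rows.
events = [
    ("pv", 1001, "user-a"),
    ("pv", 1001, "user-b"),
    ("pv", 1001, "user-a"),
    ("pv", 1002, "user-c"),
]


def aggregate(rows):
    """Count views and distinct visitors per classified."""
    views = defaultdict(int)
    visitors = defaultdict(set)
    for event_type, classified_id, user_id in rows:
        if event_type != "pv":  # only "pv" events are counted here
            continue
        views[classified_id] += 1
        visitors[classified_id].add(user_id)
    return {
        cid: {"views": views[cid], "unique_visitors": len(visitors[cid])}
        for cid in views
    }


result = aggregate(events)
print(result)
```

The same grouping applied to a customer key instead of a classified key would give the second aggregation step.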
This example covered only 2 hours of data
Our nightly job processes a whole day in 15 minutes