Prezentacija Big Data

An Analysis

vladimir todorovic

on 11 June 2013

Transcript of Prezentacija Big Data

Big data analytics is the process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations and other useful information. Such information can provide competitive advantages over rival organizations and result in business benefits, such as more effective marketing and increased revenue.
It has increased the demand for information management specialists: IBM, Microsoft, Facebook and HP have spent more than $15 billion on software firms specializing only in data management and analytics. In 2010, this industry on its own was worth more than $100 billion and was growing at almost 10 percent a year, about twice as fast as the software business as a whole.
The primary goal of big data analytics is to help companies make better business decisions by enabling data scientists and other users to analyze huge volumes of transaction data as well as other data sources that may be left untapped by conventional business intelligence programs.
The Big Data Landscape includes over 100 companies and Big Data vendors of all sizes, public and private market investors, and technology buyers.
Data mining is sorting through data to identify patterns and establish relationships.
Data mining parameters include:
1) Association - looking for patterns where one event is connected to another event
2) Sequence or path analysis - looking for patterns where one event leads to another later event
3) Classification - looking for new patterns
4) Clustering - finding and visually documenting groups of facts not previously known
5) Forecasting - discovering patterns in data that can lead to reasonable predictions about the future
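
As an illustration of the first parameter, association, here is a minimal sketch that scores "if A then B" rules by support (how often the items occur together) and confidence (how often B appears when A does). The baskets, item names, thresholds and function names are invented for this example and do not come from the presentation.

```python
from itertools import combinations

# Hypothetical shopping baskets; the items are made up for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of both sides divided by support of the antecedent."""
    both = support(set(antecedent) | set(consequent), transactions)
    base = support(antecedent, transactions)
    return both / base if base else 0.0

def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """List single-item rules lhs -> rhs that clear both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules

for lhs, rhs, s, c in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

Real association mining tools search larger itemsets and prune the search space; this sketch only checks pairs, which is enough to show the two measures.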

Data mining techniques are used in many research areas, including mathematics, cybernetics, genetics and marketing. Web mining takes advantage of the huge amount of information gathered by a Web site to look for patterns in user behavior.
Predictive Analytics
This is the branch of data mining concerned with the prediction of future probabilities and trends. The central element of predictive analytics is the predictor, a variable that can be measured for an individual or other entity to predict future behavior.
Multiple predictors are combined into a predictive model, which can be used to forecast future probabilities with an acceptable level of reliability. In predictive modeling, data is collected, a statistical model is formulated, predictions are made and the model is validated.
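
The four steps named above (collect data, formulate a model, predict, validate) can be sketched with a single predictor. This is only an illustration under stated assumptions: the observations are invented, the model is a plain least-squares line, and validation is done by holding back the last two observations. None of these choices come from the presentation.

```python
# Step 1: collect data. These (predictor, outcome) pairs are invented.
observations = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1),
                (6, 12.0), (7, 13.8), (8, 16.1)]

# Hold back the last two observations so the model is checked on unseen data.
train, holdout = observations[:6], observations[6:]

def fit_line(points):
    """Step 2: fit y = intercept + slope * x by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def predict(model, x):
    """Step 3: apply the fitted model to a predictor value."""
    intercept, slope = model
    return intercept + slope * x

def mean_absolute_error(model, points):
    """Step 4: validate by comparing predictions with known outcomes."""
    return sum(abs(predict(model, x) - y) for x, y in points) / len(points)

model = fit_line(train)
print("intercept, slope:", model)
print("holdout mean absolute error:", mean_absolute_error(model, holdout))
```

Whether the holdout error counts as an "acceptable level of reliability" depends on the application; the threshold is a business decision, not something the code can supply.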
Big data analytics can be done with the software tools commonly used as part of advanced analytics disciplines such as predictive analytics.
What is Big Data?
“A massive volume of both structured and unstructured data that is so large that it’s difficult to process with traditional database and software techniques.”
“Data is a new class of economic asset, like currency and gold.”
Source: World Economic Forum 2012
Wiki - In information technology, big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools.
Gartner - "Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization"
It was developed because the internet and data mining became so complex and integrated into our lifestyles that we needed innovative technologies to manage our big data.
Velocity: input
2012: 200 GB/day
2016: 28 TB/day
Sloan Digital Sky Survey
Produced more data in its first few weeks than the entire history of astronomy preceding it.
Velocity: output
150 million sensors delivering data 40 million times per second.

99.999% of data is being filtered.
Large Hadron Collider
If the unnecessary data were not filtered out, the output would be 500 exabytes per day.
Big data is any type of data
*Social Change
*User Experience
*Public Health
*International Development
Our daily lives leave traces in the digital world
Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Márton Mestyán, Taha Yasseri, János Kertész
Data: 312 movies, US market, 2010
Predictors: page views, unique users, unique editors, number of theaters
To predict: first weekend box office revenue
Model: multiple linear regression, 30 days before release
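
To show what a multiple linear regression over predictors like these involves, here is a minimal sketch that solves the least-squares normal equations in plain Python. It is not the authors' code or data: the rows, the units, and `TRUE_COEFS` are assumptions made up to generate synthetic targets, so the fit should simply recover those coefficients.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_multiple_regression(rows, targets):
    """Return [intercept, b1, b2, ...] minimising squared error."""
    design = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, row):
    """Intercept plus the weighted sum of the predictor values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

# Invented rows: (page views in thousands, unique editors, theaters in hundreds).
rows = [(10, 3, 20), (25, 5, 30), (40, 4, 35), (5, 1, 10), (60, 9, 40), (15, 6, 25)]

# Assumed coefficients, used only to build synthetic revenue targets.
TRUE_COEFS = [1.0, 0.5, 2.0, 0.1]
targets = [predict(TRUE_COEFS, row) for row in rows]

coefs = fit_multiple_regression(rows, targets)
print("fitted coefficients:", [round(c, 6) for c in coefs])
print("predicted revenue for (30, 4, 28):", round(predict(coefs, (30, 4, 28)), 3))
```

With real data the targets would contain noise, the fit would not be exact, and a numerical library would normally replace the hand-written solver.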
Microsoft PhotoDNA & Facebook
How Facebook could get you arrested - The Guardian
In 2011 Facebook began using PhotoDNA, a Microsoft service that allows it to scan every uploaded picture and compare it with child-porn images from the FBI's National Crime Information Centre. Since then it has expanded its analysis beyond pictures as well. In mid-2012 Reuters reported on how Facebook, armed with its predictive algorithms, apprehended a middle-aged man chatting about sex with a 13-year-old girl, arranging to meet her the day after. The police contacted the teen, took over her computer, and caught the man.
Oakland in California. Like many other American cities, today it is covered with hundreds of hidden microphones and sensors, part of a system known as ShotSpotter, which not only alerts the police to the sound of gunshots but also triangulates their location. On verifying that the noises are actual gunshots, a human operator then informs the police.
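
One common way to locate a sound from several microphones is to compare the differences in arrival time between sensors. The sketch below illustrates that idea with a brute-force grid search. It is an assumption-laden toy, not ShotSpotter's actual method: the sensor layout, the grid size, and the function names are all hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at about 20 degrees Celsius

# Hypothetical sensor positions in metres; not real deployment data.
sensors = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (1000.0, 1000.0)]

def arrival_times(source, emitted_at=0.0):
    """Time at which a sound emitted at `source` reaches each sensor."""
    return [emitted_at + math.dist(source, s) / SPEED_OF_SOUND for s in sensors]

def locate(times, step=10.0, size=1000.0):
    """Grid point whose arrival-time differences best match the observed ones."""
    # Differences relative to the first sensor cancel the unknown emission time.
    observed = [t - times[0] for t in times]
    best, best_err = None, float("inf")
    n = int(size / step) + 1
    for i in range(n):
        for j in range(n):
            candidate = (i * step, j * step)
            predicted = arrival_times(candidate)
            diffs = [t - predicted[0] for t in predicted]
            err = sum((o - d) ** 2 for o, d in zip(observed, diffs))
            if err < best_err:
                best, best_err = candidate, err
    return best

# Simulate a sound at a known point with an emission time the solver never sees.
times = arrival_times((320.0, 740.0), emitted_at=12.5)
print("estimated location:", locate(times))
```

A production system would also have to handle echoes, timing noise and sensors that miss the sound, none of which this sketch attempts.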

Instead of detecting gunshots, new and smarter systems can focus on detecting the sounds that have preceded gunshots in the past. This is where the techniques and ideologies of big data make another appearance, promising that a greater, deeper analysis of data about past crimes, combined with sophisticated algorithms, can predict – and prevent – future ones.
How Facebook could get you arrested - The Guardian
Relation between data mining and big data
Big Data deals with the flow of information and the ability to analyse this data in real time (or near real time). Data mining is closer to the traditional task of searching a haystack for needles.

Data Mining is an analytic process designed to explore data (usually large amounts of data - typically business or market related - also known as "big data") in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. The ultimate goal of data mining is prediction - and predictive data mining is the most common type of data mining and one that has the most direct business applications. The process of data mining consists of three stages: (1) the initial exploration, (2) model building or pattern identification with validation/verification, and (3) deployment (i.e., the application of the model to new data in order to generate predictions).
•Lower risk by detecting fraud and monitoring cyber security in real time, by collecting information from new types of sources such as social media, emails and sensors.

•Enable businesses to gain a full understanding of their customers: for example, what they buy, why they buy, what they prefer, why they switch and what they will buy next.

According to Gary Spakes, there are four ways Big Data can benefit businesses:
Detect, prevent and remediate financial fraud.
Calculate risk on large portfolios.
Execute high-value marketing campaigns.
Improve delinquent collections.
T-Mobile USA has integrated Big Data across multiple IT systems to combine customer transaction and interactions data in order to better predict customer defections. By leveraging social media data (Big Data) along with transaction data from CRM and Billing systems, T-Mobile USA has been able to “cut customer defections in half in a single quarter”.

McLaren’s Formula One racing team uses real-time car sensor data during races, identifies issues with its racing cars using predictive analytics and takes corrective action proactively, before it is too late.

Dr Kotadia, Harish (2012)
Where does Big data come in?
Big data is a concept created for processing enormous amounts of data
Big data helps to solve challenges such as:
Large portion of the data is just noise
Big data separates noise from the useful information
"Normal" database management systems aren't enough