Transcript of Third-party data
and behavior prediction depend on data available
of data is not necessarily key for prediction
Internal data is sometimes insufficient: third-party data may be useful to improve the explained variance of your model
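To make the "explained variance" point concrete, here is a minimal sketch on synthetic data. It fits a linear model once with an internal variable only and once with an added external variable, and compares the R² of both fits. The variable names (`internal`, `third_party`), the coefficients and the data are assumptions made for illustration; nothing here comes from the talk.

```python
import random

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * t for x, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def r_squared(rows, y):
    """Share of the variance of y explained by a linear fit on rows."""
    beta = fit_ols(rows, y)
    pred = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

# Synthetic data (assumption): the target depends on both variables.
random.seed(0)
n = 500
internal = [random.gauss(0, 1) for _ in range(n)]
third_party = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 * a + 2.0 * b + random.gauss(0, 1)
     for a, b in zip(internal, third_party)]

r2_internal = r_squared([[a] for a in internal], y)
r2_combined = r_squared([[a, b] for a, b in zip(internal, third_party)], y)
print(f"internal only: {r2_internal:.2f}, with external data: {r2_combined:.2f}")
```

In-sample R² can never decrease when a variable is added, so in practice the comparison is more convincing on held-out data.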
Problem & Definitions
Where can data be coming from?
High-Level map of data sources
3rd-party data: often understood as data from commercial firms
3rd-party data can also be
≠ 2nd-party data (e.g. through a DMP = Data Management Platform)
3rd party data put in perspective
Pierre-Nicolas Schwab, RTBF (Belgium)
EBU Big Data Congress 22 March 2016
The real-estate market is highly asymmetric in Belgium: sale prices are not public (monopoly of notaries)
Pricing mechanisms opaque
Question: how to predict real-estate prices? Problem addressed by Realo (Belgian startup).
Example 2: Population statistics
Think out of the box: new data can help improve your understanding of users' behaviors.
Don't be afraid: new data is potentially everywhere! Try new things with POCs.
: learn from your failures, improve. A failed POC may be the first step of a successful path.
: use the map to define your path to more and better data sources. Come after the conference (LinkedIn invitations also welcome) to get more information about our decision maps
Conclusions and recommendations
Different types of data:
Facebook Connect: popular way to log in
Broad range of information can be collected on the user
Case 1: TAM Airlines
Example 1: Facebook data
Insurers: how to adapt pricing of car insurance (good vs. bad drivers)
Old method: questionnaire (declarative data)
New method: actual data on driving behavior (observed)
Example 3: from apps to IoT
Prioritize your data collection process
Realo built an algorithm to predict real-estate prices
Precision of model greatly improved by including open data (population statistics) provided by the State (= 3rd party)
Precision of Apps (1st party) < telematic units (sold and managed by 3rd party)
Data quality more
with Telematic units
Data currently collected does NOT significantly improve predictions
differs between countries (4m units in Italy vs. 30k in Belgium)
New variables need to be included in model (e.g. health-related with wearables)
Think first about the underlying theoretical model
Identify the variables that will maximize the explained variance
Public data is also 3rd-party data
Test different granularity levels (the most granular is not always the best)
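The remark on granularity can be illustrated with a small sketch on synthetic data. It predicts a target with the group mean at three hypothetical levels (region, district, street) and compares the error on held-out observations. The geography, the sample sizes and the assumption that the true effect lives at district level are all invented for the illustration.

```python
import random
from collections import defaultdict
from statistics import mean

random.seed(1)

# Hypothetical geography: 4 regions x 5 districts x 25 streets.
units = [(r, d, s) for r in range(4) for d in range(5) for s in range(25)]

# Assumption: the real effect exists at district level only.
district_effect = {(r, d): random.gauss(0, 1)
                   for r in range(4) for d in range(5)}

def sample(unit):
    """Draw one noisy observation for a (region, district, street) unit."""
    r, d, _ = unit
    return district_effect[(r, d)] + random.gauss(0, 1)

train = [(u, sample(u)) for u in units for _ in range(2)]  # 2 obs per street
test = [(u, sample(u)) for u in units]                     # 1 held-out obs each

# Each level is the length of the key prefix used for grouping.
LEVELS = {"region": 1, "district": 2, "street": 3}

def holdout_mse(depth):
    """Predict with the training group mean; score on held-out data."""
    groups = defaultdict(list)
    for u, y in train:
        groups[u[:depth]].append(y)
    means = {k: mean(v) for k, v in groups.items()}
    return mean((y - means[u[:depth]]) ** 2 for u, y in test)

mse_by_level = {name: holdout_mse(depth) for name, depth in LEVELS.items()}
best_level = min(mse_by_level, key=mse_by_level.get)
print(mse_by_level, "-> best:", best_level)
```

With only two training observations per street, the street-level means are noisy, so the finest level is not expected to win here; the intermediate level matches the structure that generated the data.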
+32 (0) 486 42 79 42
Mapping of all data sources to support decision process
: how much does the data help improve user profiling as a function of business objectives?
: how intrusive is the data collection process?
Longevity / Obsolescence: how long will the data remain relevant?
: how much does the data enrich your prior knowledge of users?
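One possible way to turn the criteria above into a ranking is a simple weighted score per data source. The sketch below is an assumption, not the decision map presented in the talk: the field names (`value`, `intrusiveness`, `longevity`, `enrichment`), the 1-5 scale, the equal weights and the three placeholder sources are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataSource:
    name: str
    value: int          # help for profiling vs. business objectives (1-5)
    intrusiveness: int  # 1 = unobtrusive, 5 = very intrusive
    longevity: int      # 1 = quickly obsolete, 5 = long-lived
    enrichment: int     # how much it adds to prior knowledge of users (1-5)

# Hypothetical weights; intrusiveness counts against a source.
WEIGHTS = {"value": 1.0, "intrusiveness": 1.0, "longevity": 1.0, "enrichment": 1.0}

def priority(src, weights=WEIGHTS):
    """Higher is better: benefits minus the intrusiveness penalty."""
    return (weights["value"] * src.value
            + weights["longevity"] * src.longevity
            + weights["enrichment"] * src.enrichment
            - weights["intrusiveness"] * src.intrusiveness)

# Placeholder sources with made-up scores, for illustration only.
sources = [
    DataSource("source_a", value=4, intrusiveness=2, longevity=3, enrichment=4),
    DataSource("source_b", value=5, intrusiveness=5, longevity=2, enrichment=3),
    DataSource("source_c", value=2, intrusiveness=1, longevity=5, enrichment=2),
]

ranked = sorted(sources, key=priority, reverse=True)
for src in ranked:
    print(f"{src.name}: {priority(src):.1f}")
```

Changing the weights is the main design lever: a team that is more sensitive to privacy concerns would raise the intrusiveness weight and watch how the ranking shifts.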
Easy to collect (don't forget legal aspects though)
"Manual" work needed to aggregate data in higher-level exploitable categories ("centers of interest" variable recently removed from Facebook API)
Stability of Facebook API