Final Big Data strategy

Mittah Raditlhalo

on 5 December 2016

Transcript of Final Big Data strategy

Big Data from a statistical view

Enhance the reach and use of statistical information
Ensure that the privacy of individuals is protected
Maintain public trust in the integrity of the office
Continue to drive costs lower and be resource efficient through effective partnering with stakeholders
To become one of the world leaders in the utilization of Big Data for statistical production
By aligning ourselves and collaborating with the front runners in statistical modernization, we will ensure world class best practice in our Big Data research, development and application for statistical process improvement
Big data
Volume (Size)
Velocity (speed)
Structured data
(numeric in traditional databases)
Semi-structured data
(may contain tags or other meta-data to organise it)
Unstructured data
(e.g. text documents, emails, videos)

Maintain momentum on data improvements, foster regular engagement between private, public & community levels

Global Partnership for Sustainable Development Data (GPSDD) could promote several initiatives:

World forums for sustainable development
User forums to ensure feedback loops
Partnerships & coordination for data sharing
SDGs analysis & visualization platforms
To support the data revolution for sustainable development a proposal should be built on the following pillars:

Private sector participation (leverage resources and creativity of private sector)
Capacity development
Global data literacy (investment to increase global data literacy)
A comprehensive strategy towards a new Global Consensus on Data needs to be developed to enable cooperation, including:

Development & adoption of specific principles related to the data revolution
Accelerate the development and adoption of legal, technical, geospatial & statistical standards
Establishing a network of data innovation networks is recommended, including:

Leverage emerging data sources for SDG monitoring (e.g. data from the private sector)
Develop systems for global data sharing (e.g. Roambi)
Fill research gaps (i.e. engage research centres, innovators and govts. in the development of available data analytics tools)

The Revolution of Data
To assist in the exchange of information and promote and protect the rights of individuals by:
promoting & adopting specific principles related to the data revolution
Accelerate the development and adoption of legal, technical, geospatial and statistical standards (transparency in the exchange of data and metadata)

SWOT for Big Data in Stats SA
Reduce response burden for

Reduce cost of data collection

The Big Data promise to the strategy...
Increase in information = more accurate statistical results
Requires investment
The African Charter
Promote & protect human and peoples' rights through equality, freedom, justice & dignity. These cannot be dissociated from economic & social rights
Agenda 2063
Aspires to a prosperous African continent based on inclusive growth and sustainable development
Aims to build a strong, united continent that is an influential global player & partner
Promotes good governance & democracy
Why Big Data?
The MTSF...
Links to SDGs
Big Data promises to...
Provide data that is easier to access, more accurate and readily available
Data & technological infrastructure that requires investment but will be cheaper in the long run
Near-real time data for more accurate and relevant statistics
Institutional stability.
Authorised by law to collect data.
Products comply with high standards.
Being in control of large survey projects.
Trained staff members are already close to the "data science" skills.
The culture may be less tuned to Big Data era.
Long and slow programming and budget cycles.
No or little control over total budget to invest in Big Data.
Lacking methods for providing reliable official statistics based on Big Data sources.
New scalable infrastructure required.
Insufficient human capacity to work with Big Data as compared to private sector.

OSC organisations are trusted third parties.
Information based on Big Data may be certified by OSC organisations.
Big Data attracts young minds.
OSC being out-competed by other Big Data actors.
OSC organisations may be perceived to lose relevance.
Continued investment in "Legacy Systems".
The statistical community faces great challenges: to achieve the 17 SDGs, statistics need to be transformed to meet data demands.

The current Stats SA Strategy aims to:
Expand its statistical information base
Develop new and innovative statistical products & services
Revolutionize data systems
Priority to expand, modernize and increase the affordability and accessibility of information.
There is a growing demand for statistical data at all geographic levels that is specific and evidence-based, to inform sustainable planning and development in SA
The National Development Plan
SA needs to sharpen its innovative edge and contribute to global scientific & technological advancements:
Need for investment in research & development and better use of existing resources
Enhance cooperation between public science & technology institutions & private sector (to facilitate innovation)
All South Africans should be able to acquire & use information (transparency)
The information, communications & technology environment needs to be better structured and managed to ensure SA does not fall victim to the "digital divide"
International examples, local applications
The High Level Group
Price Scanner Data
For Stats SA...
After the introduction of scanner data, response increased to 20%
Mainly collected by using web questionnaires
Questionnaire response was 13%
Statistics South Africa registered a scanner data project in 2015.
Explorative project, with the aim of determining the suitability of using scanner data in South Africa.
Stats SA needs to be prepared for transmission errors that may occur when data moves from one database to another (these could affect statistical analysis and results)
Satellite Imagery for Agricultural Statistics
Agriculture statistics in Australia
Satellite imagery & drones used for data collection (Remote Sensing).
Pursuing classification of satellite data at the crop type level
...in South Africa
for Stats SA...
Detailed data on sales of consumer goods, obtained by scanning the bar codes for individual products at electronic points of sale in retail outlets.
Statistics Sweden started data collection late 2011. It was done in parallel with the traditional collection method (Paper/web questionnaire).
FTP account (site) established for data transmission channel.
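To illustrate how scanner records could feed a price index, here is a minimal sketch. It is not the method used by Statistics Sweden or Stats SA. It assumes each record holds a period, a bar code, a quantity and a sales value (a hypothetical layout), derives a unit value per product and period, and combines the matched products with an unweighted Jevons (geometric mean) index. All figures are made up for illustration.

```python
import math
from collections import defaultdict

def unit_values(transactions):
    """Return {(period, barcode): unit value}, i.e. total sales value / total quantity."""
    revenue = defaultdict(float)
    quantity = defaultdict(float)
    for period, barcode, qty, amount in transactions:
        revenue[(period, barcode)] += amount
        quantity[(period, barcode)] += qty
    return {key: revenue[key] / quantity[key] for key in revenue if quantity[key] > 0}

def jevons_index(transactions, base_period, current_period):
    """Geometric mean of price relatives for products sold in both periods (base = 100)."""
    prices = unit_values(transactions)
    base = {b: p for (t, b), p in prices.items() if t == base_period}
    current = {b: p for (t, b), p in prices.items() if t == current_period}
    matched = sorted(set(base) & set(current))
    if not matched:
        raise ValueError("no products were sold in both periods")
    log_sum = sum(math.log(current[b] / base[b]) for b in matched)
    return 100.0 * math.exp(log_sum / len(matched))

# Hypothetical records: (period, barcode, quantity sold, sales value)
sample = [
    ("2016-01", "A", 10, 100.0),
    ("2016-01", "A", 10, 100.0),
    ("2016-01", "B", 5, 100.0),
    ("2016-02", "A", 20, 220.0),
    ("2016-02", "B", 4, 88.0),
    ("2016-02", "C", 1, 9.0),  # new product: absent from the base period, so left unmatched
]
index = jevons_index(sample, "2016-01", "2016-02")
print(round(index, 2))
```

A production index would normally weight products by expenditure and handle products that appear or disappear. The unweighted form is used here only to keep the sketch short.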
Tourism & Migration Statistics
Mobile positioning data for travel, tourism and population statistics
In South Africa...
Simseek is an application that tracks any device that uses a sim card.
How Stats SA can use this data
Statistics Estonia uses location-based services to:
Study the human movements in, out and around Estonia
Get real time access to data in the field
Track inbound & overnight travels at smaller scale
Estimate the population
Social Media
How Social Media can be used:
To keep track of current affairs through trend analysis.
Sentiments/ people's behaviour (i.e. towards economic changes).
9 in 10 people access the internet every day (94% of the 16.9 million population); there are 8.8 million active social media users.
Twitter in the Netherlands is used by 3.5 million active users and 1.5 million daily users. (2013)
In 2014 Statistics Netherlands started to investigate the potential of using Twitter messages for official statistics.
Dutch twitter messages were studied from two perspectives: Content and Sentiment.
The sentiment in messages was found to be highly correlated with consumer confidence. (Sentiments regarding the economic situation).
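The relationship described above can be checked with a simple correlation. The sketch below is not the Statistics Netherlands method. It assumes messages have already been labelled positive or negative, builds a monthly net sentiment score, and computes the Pearson correlation with a consumer confidence series. The function names and all numbers are hypothetical.

```python
import math

def net_sentiment(positive, negative):
    """Net sentiment in percent: (positive - negative) / all labelled messages."""
    total = positive + negative
    if total == 0:
        raise ValueError("no labelled messages in this period")
    return 100.0 * (positive - negative) / total

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical monthly counts of positive and negative messages
positive = [520, 480, 610, 700]
negative = [480, 520, 390, 300]
# Hypothetical consumer confidence values for the same months
confidence = [-10, -14, -3, 5]

sentiment = [net_sentiment(p, n) for p, n in zip(positive, negative)]
r = pearson(sentiment, confidence)
print(round(r, 3))
```

A high coefficient on a short series like this one shows only that the two series move together. It does not show that sentiment can replace the survey.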
Paper written by Sangita Dubey & Pietro Gennari (2014): Now-casting Food Consumer Price Indexes With Big Data: Public-Private Complementarities
The Billion Prices Project
Started as an academic research project in 2006 by MIT (Massachusetts Institute of Technology).
To study inflation and pricing behaviour of online items globally.
Collect online prices by the use of web scraping.
Research findings were presented at the 2014 UN conference on Big Data for official statistics
Retailers that trade online
The data is potentially useful; however, only a small share of the population trades online.
The possibilities...
BPP Explained...
Online and official prices indexes
The results....
To oversee the development of frameworks and the sharing of information, tools & methods, and to coordinate work relating to the use of Big Data for official statistics.
Was set up by the Bureau of the Conference of European Statisticians in 2010.
Scanner data for calculating Consumer Price Index
In Estonia, mobile positioning data has been used since 2006.
By tracking the location of mobile devices geographically.
By the use of built-in GPS and phone data (Transactions, Call & sms activities, mobile antennas).
Research project, started in 2014
Monitors the movement of mobile phones.
Potential source of census data, migration patterns, tourism and travel statistics.
Census data
Travel and tourism
Migration data
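As a rough illustration of how mobile positioning records could be turned into a tourism indicator, the sketch below counts distinct inbound (roaming) devices per day. It is not the Statistics Estonia or Simseek method. The record layout, the country codes and the device identifiers are hypothetical, and the identifiers are assumed to be pseudonymised before analysis to protect privacy.

```python
from collections import defaultdict

def inbound_visitors_per_day(events, local_country):
    """Count distinct foreign-registered devices seen on each day.

    events: iterable of (device_id, home_country, day, antenna_area) tuples.
    A device seen by several antennas on the same day is counted once.
    """
    seen = defaultdict(set)
    for device_id, home_country, day, _antenna_area in events:
        if home_country != local_country:
            seen[day].add(device_id)
    return {day: len(devices) for day, devices in sorted(seen.items())}

# Hypothetical, pseudonymised records
events = [
    ("d1", "FI", "2016-07-01", "Tallinn"),
    ("d1", "FI", "2016-07-01", "Tartu"),    # same device, second antenna area
    ("d2", "EE", "2016-07-01", "Tallinn"),  # local device, not an inbound visitor
    ("d3", "LV", "2016-07-01", "Tallinn"),
    ("d1", "FI", "2016-07-02", "Tallinn"),
]
visitors = inbound_visitors_per_day(events, local_country="EE")
print(visitors)
```

Counting devices is not the same as counting people. Published estimates would need adjustments for market share, multiple SIM cards and people without phones.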
Ongoing research, started in 2013
To explore the possibility of using satellite imagery data for agriculture and crop area statistics.
Social Media as a potential data source for official statistics
National Landcover dataset 2014.
72 classes (landcover & landuse).
Cultivation type analysis of agricultural fields and the area for each type.
Time series analysis (2 seasonal mosaics).
Geography division, EA polygon boundaries (Area size)
Landcover change detection.
Agricultural statistics to include small-scale & subsistence agriculture.
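The area-per-type and change-detection items above amount to counting classified pixels. The sketch below assumes a classified land cover grid is already available as rows of class labels, with square pixels of a known size. The 30 m pixel size, the class names and the grids are assumptions for illustration, not properties of the National Landcover dataset.

```python
from collections import Counter

def area_by_class(grid, pixel_size_m):
    """Return {class label: area in hectares} for a classified grid."""
    pixel_ha = (pixel_size_m ** 2) / 10_000  # 1 hectare = 10,000 square metres
    counts = Counter(label for row in grid for label in row)
    return {label: n * pixel_ha for label, n in counts.items()}

def changed_area(grid_before, grid_after, pixel_size_m):
    """Return the area in hectares whose class differs between two same-shaped grids."""
    if len(grid_before) != len(grid_after) or any(
        len(a) != len(b) for a, b in zip(grid_before, grid_after)
    ):
        raise ValueError("grids must have the same shape")
    pixel_ha = (pixel_size_m ** 2) / 10_000
    changed = sum(
        1
        for row_a, row_b in zip(grid_before, grid_after)
        for a, b in zip(row_a, row_b)
        if a != b
    )
    return changed * pixel_ha

# Hypothetical 3 x 3 seasonal mosaics with made-up class names
season_1 = [
    ["maize", "maize", "wheat"],
    ["maize", "fallow", "wheat"],
    ["maize", "fallow", "fallow"],
]
season_2 = [
    ["maize", "maize", "wheat"],
    ["wheat", "fallow", "wheat"],
    ["maize", "maize", "fallow"],
]
areas = area_by_class(season_1, pixel_size_m=30)
change = changed_area(season_1, season_2, pixel_size_m=30)
print(areas, change)
```

Restricting the count to the pixels inside one EA polygon would give the same figures per enumeration area.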
Advanced methods, tools & infrastructure to represent, store, manipulate, integrate & analyse complex data
Specific Actions:
Enhance data integration methods, tools & infrastructure
Modernise statistical practice for non-traditional data sources
Introduce new approaches to data modelling & analysis
Evaluate & deploy high performance computing platforms
A diverse pool of government, private & open data sources available for statistical purposes
Specific actions:
Facilitate the sharing of private data for public good
Trial the targeted use of external data provision services
Safe and appropriate public access to microdata sets & statistical solutions derived from an array of data sources
Specific action:
Develop microdata access solutions & lead national adoption of privacy preserving data analytics
A skilled workforce able to interpret information needs & communicate the insights gathered from rich data
Specific action:
Build & Share competency in data science
Strong multidisciplinary partnerships across government, industry, academia & the statistical community
Specific actions:
Support & leverage external system development initiatives
Establish a broad collaboration network to advance specific Big Data initiatives
2014 Initiatives
Transactional sources (from banks/ telecommunication providers/ retail outlets)
Sensor data sources.
Social network source, image/video-based sources.
2015 Initiatives
Satellite images
Trade data
Social data from Twitter
Enterprise websites
Mobile data
Motive for exploring Scanner Data
Focus on low response burden
Improving data quality
Rapidly changing prices and more complex pricing structures.
Consumer behaviour is changing
Increasing availability of electronic data
Scanner data has been introduced step-by-step in the Norwegian CPI.
Statistics Norway has a policy not to pay for any data used in official statistics in accordance with the Statistics Act of 1989.