Big Data and Social Media

A presentation about using unstructured data from social sources for business
by

Sarah Smith-Robbins

on 23 June 2015

Transcript of Big Data and Social Media

BIG DATA from SOCIAL MEDIA!

Sarah Smith-Robbins, PhD
sabsmith@indiana.edu
WHY IS SOCIAL DATA IMPORTANT?
USES IN BUSINESS
DATA FORMS

TEXT
METADATA
NETWORK MECHANICS
OTHER MEDIA FORMS

INTERACTIVE MODELS
Event Detection
Sentiment Analysis
Evaluating Reviews
Predictive Analysis
Competitive Intelligence
Open Innovation
COMPLICATIONS
ANALYSIS PROCESSES
Representation
Knowledge Discovery
Bag of Words (BOW)
Vector Space Modeling
Classification/Categorization
Clustering
Joint Mining
It's massive
It's readily available
But it's unstructured!
3 Waves of Data
1. Interconnected data about individuals
2. User-generated content, listening and clipping services
3. Data Analytics: Deep and Broad
WARNING:
Correlation vs Causation
1) Strong association/correlation
2) Consistent directionality
3) No spurious variables or competing hypotheses
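The first criterion above, strong association, can be checked numerically. The following is a minimal sketch of a Pearson correlation coefficient; the function name `pearson_r` and the mention and sales figures are made up for illustration and do not come from the presentation. A high value satisfies only criterion 1; directionality and competing explanations still have to be examined separately.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical weekly figures: brand mentions and units sold.
mentions = [10, 20, 30, 40, 50]
sales = [12, 24, 33, 46, 55]
r = pearson_r(mentions, sales)
print(f"r = {r:.3f}")  # strong association only; says nothing about cause
```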
NEWS
POLITICAL EVENT
EMERGING URGENT TREND
MOOD
DISPOSITION
SATISFACTION
ACCURACY
SUMMARIES
INFLUENCE
TREND FORECASTING
SALES MODELING
ANTICIPATING SUPPLY
Strategic Intelligence: Long Term
Tactical Intelligence: Short Term
Marketing Information
Pricing
Promotion
Competitor news events
CROWDSOURCING
INSIGHT GATHERING
Lingo
Burstiness
Lack of context: Conversation, intent, network mechanics
Lack of stable conventions
Language
Semantic gaps: "Dark Knight" vs Batman
word frequency
exclusion
Wordle.com
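Word frequency with exclusion, the basis of a bag-of-words representation, might be sketched as follows. The exclusion list `STOP_WORDS`, the function name `bag_of_words`, and the sample posts are assumptions for illustration; a real exclusion list would be much longer.

```python
import re
from collections import Counter

# Hypothetical exclusion list of very common words.
STOP_WORDS = {"a", "an", "and", "is", "it", "of", "the", "to"}

def bag_of_words(text, exclude=STOP_WORDS):
    """Count word frequencies, ignoring case, word order, and excluded words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(t for t in tokens if t not in exclude)

# Made-up sample posts.
posts = [
    "The battery is great and the screen is great",
    "Battery died, screen cracked",
]
counts = Counter()
for post in posts:
    counts.update(bag_of_words(post))
print(counts.most_common(3))
```

Because word order is discarded, "great battery" and "battery great" produce the same counts, which is both the simplicity and the limitation of this representation.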
collocation
common phrases
grammatical relationships
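Common phrases can be approximated by counting adjacent word pairs. This is a minimal sketch only: the function name `bigram_counts` and the sample posts are hypothetical, and raw pair counts are a much cruder signal than a proper collocation measure or grammatical parsing.

```python
import re
from collections import Counter

def bigram_counts(texts):
    """Count adjacent word pairs within each text (never across texts)."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(zip(tokens, tokens[1:]))
    return counts

# Made-up sample posts.
posts = [
    "Dark Knight was amazing",
    "Saw the Dark Knight again",
    "A dark room",
]
pairs = bigram_counts(posts)
top_pair, top_count = pairs.most_common(1)[0]
print(top_pair, top_count)
```

Treating a frequent pair such as "dark knight" as a single unit is one way to keep a phrase from being split into unrelated words.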
types of content
purpose of conversation
political/entertainment/lifestyle
sentiment: http://www.csc.ncsu.edu/faculty/healey/tweet_viz/tweet_app/
Nodes by topic, theme, or meme
truthy.indiana.edu
Comparing known/structured data or proven models to unstructured sets
Time: rate, speed, discrete time stamps
Device/platform info
Tags
Links: destination, user/spread/share
User: account, identity ("real" vs "signaled")
Location: RFID, GPS
Attributes:
multi-dimensional networks
user and information representation
dynamic network structures
Connections:
degree
velocity
reach
Mechanic markers: RT, @ etc
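Mechanic markers such as RT and @ can be pulled out of post text with simple patterns. The sketch below assumes Twitter-style conventions (a leading "RT @user" for a retweet, "@user" for a mention); the pattern names, the function `mechanic_markers`, and the sample post are hypothetical.

```python
import re

# Assumed conventions: "RT @user" at the start marks a retweet,
# "@user" anywhere marks a mention. Not preceded by a word character,
# so e-mail addresses are not mistaken for mentions.
RETWEET = re.compile(r"^RT\s+@(\w+)")
MENTION = re.compile(r"(?<!\w)@(\w+)")

def mechanic_markers(post):
    """Extract the retweeted account (if any) and all mentioned accounts."""
    rt = RETWEET.match(post)
    return {
        "retweet_of": rt.group(1) if rt else None,
        "mentions": MENTION.findall(post),
    }

markers = mechanic_markers("RT @alice: loved the keynote with @bob")
print(markers)
```

Each extracted mention or retweet is a directed connection between accounts, which is the raw material for measures such as degree and reach.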
Photo, video, audio
inaccurate tags
inability to index
Ex. Social Games
cost/benefit analysis
network relationships
lifestyle, use of free time
Sampling Errors: sampled data isn't representative of population
Echo: multiple posts of same content by same people, repeated content
Coverage: sampling in the wrong places or not enough places
Nonresponse bias: only studying those willing to respond
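Echo, as described above, can be reduced before analysis by keeping only the first copy of each user's repeated content. This is a rough sketch under stated assumptions: posts arrive as (user, text) pairs, and two posts count as the same content when they match after lowercasing, stripping a leading retweet marker, and removing punctuation. The names `normalize` and `drop_echo` are hypothetical.

```python
import re

def normalize(text):
    """Reduce a post to a comparable form (assumed definition of 'same content')."""
    text = re.sub(r"^rt\s+@\w+:?\s*", "", text.lower())  # drop leading retweet marker
    text = re.sub(r"[^a-z0-9\s]", "", text)              # drop punctuation
    return " ".join(text.split())                        # collapse whitespace

def drop_echo(posts):
    """Keep the first occurrence of each (user, normalized text) pair."""
    seen = set()
    kept = []
    for user, text in posts:
        key = (user, normalize(text))
        if key not in seen:
            seen.add(key)
            kept.append((user, text))
    return kept

# Made-up sample data.
posts = [
    ("ann", "Great phone!"),
    ("ann", "great phone"),
    ("bob", "Great phone!"),
    ("ann", "RT @bob: Great phone!"),
]
unique_posts = drop_echo(posts)
print(unique_posts)
```

The same text from a different user is kept, since agreement across people is signal rather than echo under this definition.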