Recommender systems (S523)
Issues and Concerns
Our World is an over-crowded place
Recommender systems have become
extremely common in recent years.
A few examples of such systems:
Pandora Radio takes an initial input of a song or musician and plays music with similar characteristics (based on a series of keywords attributed to the given artist or music). The stations created by Pandora can be refined through user feedback (emphasizing or deemphasizing certain characteristics).
When viewing a product on Amazon.com, the store will recommend additional items based on what other shoppers bought along with the currently selected item.
35% of sales at Amazon
come from recommendations
Netflix offers predictions of movies that a user might like to watch based on the user's previous ratings and watching habits (as compared to the behavior of other users), also taking into account the characteristics (such as the genre) of the film.
75% of the videos watched by Netflix
users are suggested by the recommender system
Different types of RecSys
Matrix Factorization (
You may have watched thousands of movies, but perhaps I can describe these movies using
are enough to describe
Likewise, "Stargate" has been watched by millions of people, but perhaps
are enough to describe
Magic: the hidden aspects (both for item and user) can be discovered automatically by Matrix Factorization
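The idea of discovering hidden aspects automatically can be sketched in a few lines. The following is a minimal illustration, not the method of any particular system: it assumes plain stochastic gradient descent on squared error, and the function names, learning rate, regularization weight, and toy ratings are all hypothetical.

```python
import random

def factorize(ratings, n_users, n_items, k=2, epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Learn k hidden aspects per user and per item.

    ratings is a list of (user_index, item_index, rating) triples.
    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Start from small random aspect values for every user and item.
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Move both vectors to reduce the error, with a small shrinkage term.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating = dot product of user aspects and item aspects."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

# Hypothetical toy data: users 0-1 like items 0-1, users 2-3 like items 2-3.
# The ratings (0, 1) and (3, 3) are deliberately left unobserved.
observed = [
    (0, 0, 5), (0, 2, 1), (0, 3, 1),
    (1, 0, 5), (1, 1, 5), (1, 2, 1), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 2, 5), (2, 3, 5),
    (3, 0, 1), (3, 1, 1), (3, 2, 5),
]
P, Q = factorize(observed, n_users=4, n_items=4)
held_out = predict(P, Q, 0, 1)  # user 0 never rated item 1
```

Nobody tells the model what the two aspects mean; they emerge only from the pattern of ratings, which is why the unobserved rating for user 0 on item 1 comes out high in this toy example.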
It does not require large user groups to achieve reasonable recommendation accuracy.
New items can be immediately recommended once item attributes are available.
Pandora Radio is a popular example of a content-based recommender system that plays music with similar characteristics to that of a song provided by the user as an initial seed.
There are also many content-based recommender systems aimed at providing movie recommendations; examples include Rotten Tomatoes, Internet Movie Database, Jinni, Rovi Corporation and See This Next.
- User cold start: new users
- Item cold start: new items
How to recommend items to new users?
Give non-personalized recommendations
- Most popular items
- Highly Rated items
Use user's register profile (Age, Gender, …)
User Cold Start
Item Cold Start
1) Registration of profiles
2) Detection and Analysis
Privacy is considered a very important aspect of e-commerce, as there is legislation on the distribution of users' private information to third parties.
To protect customers' privacy and to shield the company from lawsuits, steps need to be taken to ensure that privacy-related material cannot be extracted from the statistics and other material used in the recommender system.
The total number of items is extremely large; however, even the most active users will have consumed only a small subset of the overall database.
E.g., 1% of the ratings are known in the Netflix dataset
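The sparsity figure is simply the share of user-item pairs that have a known rating. A minimal sketch of that arithmetic, with a hypothetical function name and made-up toy numbers chosen so that the density comes out to 1%:

```python
def density(known_ratings, n_users, n_items):
    """Fraction of all user-item cells for which a rating is known."""
    return len(known_ratings) / (n_users * n_items)

# Hypothetical toy data: 3 known (user, item) pairs out of 10 x 30 = 300 cells.
known = {(0, 3), (4, 17), (9, 29)}
d = density(known, n_users=10, n_items=30)
sparsity = 1.0 - d  # share of cells with no rating at all
```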
In real-world systems, there are millions of users and products. Thus, a large amount of computing power is often necessary to calculate recommendations.
Jingjing Zhang, PhD
We are overloaded
But we really need and consume only a few of them!
Can Google Help?
Can Facebook Help?
Can Experts Help?
Yes, but only when we really know what we are looking for
What if I just want some interesting music tracks?
- Btw, what does “interesting” mean?
Yes, I tend to find my friends’ stuff interesting
What if I have only a few friends, and what they like does not always appeal to me?
Yes, but it won’t scale well
Everyone receives the same advice!
It is what they like, not what I like!
- As with movies, expert approval does not guarantee the attention of the masses
We need Help!
What do recommender
systems do, exactly?
RS are software agents that elicit the interests and preferences of individual consumers […] and make recommendations accordingly. They have the potential to support and improve the quality of the decisions consumers make while searching for and selecting products online.
(Xiao & Benbasat, 2007)
Use the opinions of a community of users to help individuals in that community to identify more effectively content of interest from a potentially overwhelming set of choices.
(Resnick & Varian, 1997)
Any system that produces
as outputs or has the effect of guiding the user in a personalized way to
objects in a large space of possible options.
Find good items: presenting a ranked list of recommendations
Find all good items: identifying all items that might be interesting
Recommend sequence of items: recommending a sequence of related items
Annotation in context: predicting the usefulness of an item that the user is currently viewing
and many more...
and Many (!) More applications
Online mates (Dating services)
Future friends (Social network sites)
Courses in e-learning
Taxonomy of Recommender Systems
User-based method (
Many people liked "Music and Lyrics"
Can you tell how much I like it?
The idea is to find my "friends" who share
with me; how much I like it then depends on how much THEY liked it.
User similarity: Pearson correlation, Cosine coefficient, etc.
Basic idea: you may like it because your “friends” liked it
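The two similarity measures named above can be written out directly. This is a minimal sketch that assumes each user is a dict mapping item names to ratings and that similarity is computed only over the items both users have rated; the function names and toy ratings are hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two users over their co-rated items."""
    common = [i for i in a if i in b]
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def pearson(a, b):
    """Pearson correlation of two users over their co-rated items."""
    common = [i for i in a if i in b]
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / n
    mean_b = sum(b[i] for i in common) / n
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

# Hypothetical ratings: "twin" agrees with "me", "opposite" disagrees.
me = {"x": 1, "y": 2, "z": 3}
twin = {"x": 2, "y": 4, "z": 6}
opposite = {"x": 3, "y": 2, "z": 1}
```

Pearson subtracts each user's mean first, so it treats a generous rater and a strict rater with the same preferences as similar, and it can go negative for users with opposite tastes. Cosine on raw positive ratings never goes below zero.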
Item-based method (
Based on so many (good & bad) movies that I watched in the past, would you recommend that I watch “Taken”?
The idea is to find my previously watched movies that share a similar audience with “Taken”; how much I will like it then depends on how much I liked them.
Item similarity: Pearson correlation, Cosine, etc.
Basic idea: I tend to like this movie because I have liked those similar movies
because people who have watched those movies also liked this movie (Amazon implementation)
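The item-based idea can be sketched as follows. This is an illustration under stated assumptions, not Amazon's implementation: it assumes cosine similarity between items over the users who rated both, and a similarity-weighted average of my own past ratings. The function names, the other movie titles, and all ratings are hypothetical.

```python
from math import sqrt

def item_cosine(ratings, item_a, item_b):
    """Cosine similarity between two items over the users who rated both."""
    pairs = [(r[item_a], r[item_b]) for r in ratings.values()
             if item_a in r and item_b in r]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_a = sqrt(sum(a * a for a, _ in pairs))
    norm_b = sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_item_based(ratings, user, target):
    """Weight the user's own past ratings by each item's similarity to target."""
    num = den = 0.0
    for item, rating in ratings[user].items():
        if item == target:
            continue
        sim = item_cosine(ratings, target, item)
        if sim > 0:
            num += sim * rating
            den += sim
    return num / den if den else None

# Hypothetical ratings on a 1-5 scale; "me" has not rated "Taken" yet.
ratings = {
    "u1": {"Taken": 5, "Bourne": 5, "Notebook": 1},
    "u2": {"Taken": 4, "Bourne": 5, "Notebook": 2},
    "u3": {"Taken": 1, "Bourne": 2, "Notebook": 5},
    "me": {"Bourne": 5, "Notebook": 1},
}
prediction = predict_item_based(ratings, "me", "Taken")
```

Because the audience of "Taken" overlaps far more with "Bourne" than with "Notebook" in this toy data, my high rating for "Bourne" dominates the prediction.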
User-based Collaborative Method
Item-based Collaborative Method
Non-Personalized Summary Statistics
External community data
Best seller list, most popular, trending items
Summary of community ratings
Highest rated, most liked
Non-personalized recommendations are fast to compute and easy to understand. These methods can also be used to overcome some of the common problems in recommender systems, such as the cold-start and sparsity problems.
Yelp restaurant ratings
Billboard music rankings
iTunes featured mobile apps
TripAdvisor hotel ratings
Amazon best seller rankings
The New York Times best sellers
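Summary statistics such as "most popular" and "highest rated" need only one pass over the ratings. A minimal sketch follows; the function names and toy data are hypothetical, and the min_count threshold is an added assumption to keep a single enthusiastic rating from topping the list.

```python
from collections import defaultdict

def summary_stats(ratings):
    """Return {item: (number_of_ratings, mean_rating)} from (user, item, rating) triples."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for _user, item, rating in ratings:
        counts[item] += 1
        totals[item] += rating
    return {item: (counts[item], totals[item] / counts[item]) for item in counts}

def most_popular(ratings, n=3):
    """Items with the most ratings; ties broken by item name."""
    stats = summary_stats(ratings)
    return sorted(stats, key=lambda it: (-stats[it][0], it))[:n]

def highest_rated(ratings, n=3, min_count=2):
    """Items with the best mean rating among those rated at least min_count times."""
    stats = summary_stats(ratings)
    eligible = [it for it, (count, _mean) in stats.items() if count >= min_count]
    return sorted(eligible, key=lambda it: (-stats[it][1], it))[:n]

# Hypothetical toy ratings.
toy = [
    ("u1", "a", 5), ("u2", "a", 3), ("u3", "a", 4),
    ("u1", "b", 5), ("u2", "b", 5),
    ("u3", "c", 5),
]
popular = most_popular(toy, n=2)
best = highest_rated(toy, n=2)
```

Every user gets the same list, so no rating history is needed, which is what makes this a common fallback for brand-new users.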
It will predict the rating of “Music and Lyrics” for Alice as 9, because that is how Christine and David (the users most similar to Alice) rated this movie.
Predict the rating of “Music and Lyrics” for Alice using the ratings of the most similar users
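The prediction for Alice can be reproduced with a small nearest-neighbor sketch. The original rating table is not part of this transcript, so every rating below is made up, as are the fourth user "Bob", the movie labels "A" to "C", and the function names. The sketch assumes Pearson similarity and a similarity-weighted average over the k most similar users.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over the items both users rated."""
    common = [i for i in a if i in b]
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / n
    mean_b = sum(b[i] for i in common) / n
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict_user_based(ratings, user, target, k=2):
    """Average the target ratings of the k most similar users, weighted by similarity."""
    neighbors = []
    for other, theirs in ratings.items():
        if other != user and target in theirs:
            sim = pearson(ratings[user], theirs)
            if sim > 0:
                neighbors.append((sim, theirs[target]))
    top = sorted(neighbors, reverse=True)[:k]
    if not top:
        return None
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Hypothetical ratings on a 1-10 scale; Alice has not rated the target movie.
ratings = {
    "Alice": {"A": 8, "B": 2, "C": 7},
    "Bob": {"A": 2, "B": 9, "C": 3, "Music and Lyrics": 3},
    "Christine": {"A": 9, "B": 3, "C": 8, "Music and Lyrics": 9},
    "David": {"A": 7, "B": 1, "C": 6, "Music and Lyrics": 9},
}
alice_prediction = predict_user_based(ratings, "Alice", "Music and Lyrics")
```

In this toy data Christine and David rank the shared movies the same way Alice does, while Bob's tastes run opposite, so only Christine's and David's ratings feed the prediction.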
Have you ever used
GroupLens? Amazon? Netflix? Google News? CNN? Spotify? TiVo? YouTube? Pandora?
words, hyperlinks, images, tags, comments, titles, URL, topic
genre, rhythm, melody, harmony, lyrics, meta data, artists, bands, press releases, expert reviews, loudness, energy, time, spectrum, duration, frequency, pitch, key, mode, mood, style, tempo
age, sex, job, location, time, income, education, language, family status, hobbies, general interests, Web usage, computer usage, fan club membership, opinion, comments, tags, mobile usage
Item Descriptions & User Profiles
What is Content?
Can we acquire content pieces automatically?
Fairly easy for text
Difficult for music and video, except for digital signals, e.g., music genre classification reaches 60-80% accuracy
A lot of noise, e.g. misplaced tags
What can we do with these contents?
Compute similarity between items or users
Query items that are similar to a given item
Match item’s content and user’s profile
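The three uses listed above can be sketched with keyword counts. This is a minimal illustration: it assumes each item is described by a list of keywords, that a user's profile is the keyword counts of the items they liked, and that matching uses cosine similarity. The function names, track names, and keywords are hypothetical.

```python
from collections import Counter
from math import sqrt

def cosine(c1, c2):
    """Cosine similarity between two keyword-count vectors."""
    dot = sum(c1[t] * c2[t] for t in c1 if t in c2)
    norm1 = sqrt(sum(v * v for v in c1.values()))
    norm2 = sqrt(sum(v * v for v in c2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0

def build_profile(liked_items, item_features):
    """User profile = summed keyword counts of the items the user liked."""
    profile = Counter()
    for item in liked_items:
        profile.update(item_features[item])
    return profile

def recommend(liked_items, item_features, n=1):
    """Rank unseen items by how well their content matches the user's profile."""
    profile = build_profile(liked_items, item_features)
    scored = [(cosine(profile, Counter(features)), item)
              for item, features in item_features.items()
              if item not in liked_items]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _score, item in scored[:n]]

# Hypothetical item descriptions.
item_features = {
    "track1": ["rock", "guitar", "fast"],
    "track2": ["rock", "guitar", "slow"],
    "track3": ["jazz", "piano", "slow"],
    "track4": ["rock", "drums", "fast"],
}
picks = recommend(["track1", "track4"], item_features, n=1)
```

No other users appear anywhere in this sketch, which is why a content-based approach can recommend a new item as soon as its attributes are available.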
How do we know the recommendation is good?
Evaluation criteria (Herlocker et al., 2004)
Mostly measured by Accuracy
Closeness between predicted rating and actual user rating
e.g., MAE, RMSE, Precision, Recall
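The four accuracy metrics named above are short enough to write out. A minimal sketch with hypothetical function names and toy numbers: MAE and RMSE compare predicted ratings with actual ratings, while precision and recall compare a recommended list with the set of items the user actually found relevant.

```python
from math import sqrt

def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE does."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def precision_recall(recommended, relevant):
    """Precision: share of recommendations that are relevant.
    Recall: share of relevant items that were recommended."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical toy numbers.
actual = [4, 3, 5]
predicted = [3, 3, 3]
error_mae = mae(actual, predicted)      # (1 + 0 + 2) / 3
error_rmse = rmse(actual, predicted)    # sqrt((1 + 0 + 4) / 3)
p, r = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
```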
Predictive accuracy isn't enough
Other important metrics:
Coverage, diversity, novelty, robustness, profitability, etc
Consistency/Stability (Zhang, 2012)
Recommender Systems in Academia
Early days: 3 papers by HCI researchers (1995)
ACM RecSys Conference: one of the top conferences in machine learning
Involves many disciplines: CS/IS, marketing, statistics, MS/OR
Netflix $1M Prize Competition
Yelp dataset challenge
Spotify RecSys challenge
YouTube video understanding challenge
To recommend to us something we may like
- It may not be popular
- Based on our history of using services
- Based on other people like us
OK, here is the idea called Recommender Systems!
How to recommend new items to users?
Do not recommend:
Using content information:
Nuke attack: inserted to decrease the number of times an item is recommended to different users.
Push attack: increases the number of times an item is recommended.
Profile injection attack (or shilling attack): the goal is to create a bias in the recommender system by inserting fake user ratings.
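A small arithmetic sketch shows why inserting fake ratings works. It assumes the simplest possible target, an item's average rating, and uses hypothetical function names and made-up numbers. Real attacks on personalized systems use more elaborate fake profiles.

```python
def mean_rating(ratings):
    """Average rating of an item."""
    return sum(ratings) / len(ratings)

def inject_fake_ratings(ratings, n_fake, fake_value):
    """Return the ratings list with n_fake identical fake ratings appended."""
    return ratings + [fake_value] * n_fake

# Hypothetical genuine ratings on a 1-5 scale.
genuine = [2, 2, 3, 1]
before = mean_rating(genuine)

# Fake top ratings raise the item's score, so it is recommended more often.
raised = mean_rating(inject_fake_ratings(genuine, n_fake=4, fake_value=5))

# Fake bottom ratings lower the score, so it is recommended less often.
lowered = mean_rating(inject_fake_ratings(genuine, n_fake=4, fake_value=1))
```

With only four genuine ratings, four fake profiles are enough to move the average substantially, so sparsely rated items are the easiest targets.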
Why Recommender System?
Objective of Recommender Systems
Taxonomy of Recommender Systems
Evaluation of Recommendation
Issues and Concerns
About This Module
Introduce the concepts of recommender systems
Provide a high-level overview of recommendation technologies used in real-world applications (e.g., Amazon, Netflix, Pandora)
Discuss the issues and challenges in recommender systems
Session 1 presents an introduction to recommender systems
Session 2 discusses popular recommendation algorithms and emerging research issues
Agenda for Session I
Jingjing Zhang, Ph.D.
Assistant Professor of Information Systems, Indiana University
Undergraduate (2005): Information Systems
Master (2007): Computer Science
PhD (2012): Information Systems
Personalization and recommender systems
Data mining, machine learning, knowledge discovery
Human-computer interaction, behavioral decision making
Teaching experience (courses taught at IU)
: data management, business analytics
: data mining, big data technologies
: data mining and personalization