i290_280I_Lecture_6c

Recommender Systems
Recommender Systems


• In many cases, users face a wealth of products and information to choose from.
• To alleviate this overload, many web sites help users with recommender systems:
  ▪ The user is shown a list of items or pages that are likely to interest them
  ▪ Once the user makes a choice, a new list can be presented
What data is used to make the recommendations?
• Explicit feedback
  ▪ Ratings
  ▪ Reviews
  ▪ Auctions
• Implicit feedback
  ▪ Page visits
  ▪ Purchase data
  ▪ Browsing paths
What are the types of recommendations?
• Item-to-item associations
  ▪ Pages similar to this one
  ▪ “Users who bought this book also bought X”
• User-to-user associations
  ▪ Which other users have similar interests?
• User-to-item associations
  ▪ The rating history describes the user
  ▪ Items are described by attributes
  ▪ Items are described by the ratings of other users
Classification of Recommender Systems
• Content-based approach
  ▪ Each item is described by a set of attributes
    - Movies: e.g. director, genre, year, actors
    - Documents: bag-of-words
  ▪ A similarity metric defines the relationship between items
    - e.g. cosine similarity
• Examples
  ▪ “Related pages” in a search engine
  ▪ Google News
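As a minimal sketch of the content-based idea above, the following represents each document as a bag of words and ranks items by cosine similarity. The item descriptions and the helper names (`bag_of_words`, `cosine_similarity`, `most_similar`) are assumptions made up for illustration; a real system would likely add stop-word removal and term weighting.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count term occurrences."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical item descriptions, made up for illustration.
docs = {
    "d1": "space opera with rebels and an empire",
    "d2": "rebels fight an empire in space",
    "d3": "romantic comedy set in paris",
}
bags = {name: bag_of_words(text) for name, text in docs.items()}


def most_similar(name):
    """Return the other item whose description is closest to `name`."""
    others = (n for n in bags if n != name)
    return max(others, key=lambda n: cosine_similarity(bags[name], bags[n]))
```

Calling `most_similar("d1")` picks the item that shares the most vocabulary with "d1", which is the "related pages" behaviour in miniature.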
Related Approaches
• Mooney and Roy (2000)
  ▪ Their approach comes from the Information Retrieval (IR) field
  ▪ They rely on the content of the items and use a similarity score to match items based on that content
• Burke (2000)
  ▪ Also uses content-based recommendation
  ▪ However, the user can supply explicit information about their preferences
Types of Recommender Systems
• Collaborative filtering
  ▪ Each item is described by user interactions
  ▪ Matrix V has n rows (number of users) and m columns (number of items)
  ▪ The elements of V are the user feedback
  ▪ Examples:
    - Rating given to the item by each user
    - Users who viewed this item
  ▪ Similarity metric between items
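A small sketch of how the matrix V might be assembled from raw feedback, assuming the feedback arrives as (user, item, rating) triples. The user names, item names, and the choice of NaN to mark missing feedback are assumptions for illustration only.

```python
import numpy as np

# Hypothetical (user, item, rating) feedback, made up for illustration.
feedback = [
    ("ann", "book_a", 5), ("ann", "book_b", 3),
    ("bob", "book_a", 4), ("bob", "book_c", 2),
    ("cat", "book_b", 1), ("cat", "book_c", 5),
]

users = sorted({u for u, _, _ in feedback})
items = sorted({i for _, i, _ in feedback})
user_index = {u: r for r, u in enumerate(users)}
item_index = {i: c for c, i in enumerate(items)}

# V has one row per user and one column per item; NaN marks "no feedback".
V = np.full((len(users), len(items)), np.nan)
for user, item, rating in feedback:
    V[user_index[user], item_index[item]] = rating


def users_who_rated(item):
    """Names of the users with any feedback on `item` (one column of V)."""
    column = V[:, item_index[item]]
    return [u for u in users if not np.isnan(column[user_index[u]])]
```

Each column of V is then the description of one item in terms of user interactions, which is what item-to-item similarity metrics operate on.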
Related Approaches
Collaborative Filtering
• Uses historical data gathered from other users to make the recommendation
  ▪ Example: a user who wants to rent a movie tends to rely on friends to recommend items that they liked
• The goal is to identify users whose tastes are predictive of the tastes of a given person, and to use their recommendations to build a list of interesting items for that person
Collaborative Filtering Models
• Memory Based
  ▪ Neighborhood Models
  ▪ Latent Factors
• Model Based
  ▪ Classification
  ▪ Bayesian Networks
  ▪ Association Rules
Memory Based Approaches
• Work directly with the user data
• Given a user, the system finds the most similar users to make a recommendation
• There are two approaches:
  ▪ Neighborhood
  ▪ Latent Factor
Neighborhood Approach
• An item-oriented approach: it estimates a user’s preference for an item from that same user’s ratings of similar items.
• Users are transformed to item space by viewing them as baskets of rated items. Instead of comparing users to items, the method relates items directly to items.
• Pros: relies on a few significant neighborhood relations; effective at detecting very localized relationships.
• Cons: ignores the vast majority of a user’s ratings; unable to capture the totality of weak signals across all of a user’s ratings.
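The item-oriented prediction described above can be sketched as a similarity-weighted average over the user's own ratings. This is only one possible variant: treating 0 as "not rated", using plain cosine similarity between item columns, and the neighbourhood size `k` are all assumptions, and the ratings are hypothetical.

```python
import numpy as np


def item_similarity(R):
    """Cosine similarity between item columns; 0 means 'not rated'."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for unrated items
    return (R.T @ R) / np.outer(norms, norms)


def predict(R, user, item, k=2):
    """Predict R[user, item] from the user's ratings of the k most similar items."""
    sim = item_similarity(R)[item]
    rated = [j for j in np.flatnonzero(R[user]) if j != item]
    # Keep only the k rated items most similar to the target item.
    neighbors = sorted(rated, key=lambda j: sim[j], reverse=True)[:k]
    weight = sum(sim[j] for j in neighbors)
    if weight == 0:
        return 0.0
    return sum(sim[j] * R[user, j] for j in neighbors) / weight


# Hypothetical ratings (rows = users, columns = items), 0 = missing.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
    [5, 0, 0, 1],   # user 4 has not rated items 1 and 2
], dtype=float)
```

For user 4, who liked item 0 and disliked item 3, `predict(R, 4, 1)` comes out high because item 1 is rated like item 0, while `predict(R, 4, 2)` comes out low because item 2 is rated like item 3. Only two neighbours drive each prediction, which illustrates both the "localized" strength and the "ignores most ratings" weakness.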
Latent Factor Models
• Transform both items and users to the same latent factor space, thus making them directly comparable.
• The latent space tries to explain ratings by characterizing both products and users on factors automatically inferred from user feedback.
• Pros: effective at estimating the overall structure that relates simultaneously to most or all items.
• Cons: poor at detecting strong associations among a small set of closely related items.
[Figure: a ratings matrix approximated (~) by the product of two factor matrices]
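One common way to learn such a factor space is to fit user and item vectors by stochastic gradient descent on the observed ratings. The sketch below is not taken from the slides: the learning rate, regularization strength, number of factors, and the ratings themselves are assumptions chosen for illustration.

```python
import numpy as np


def factorize(R, mask, n_factors=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit R[u, i] ~ P[u] . Q[i] on observed entries by stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, n_factors))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, n_factors))  # item factors
    observed = np.argwhere(mask)
    for _ in range(epochs):
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[i]
            p_old = P[u].copy()
            # Gradient step on squared error with L2 regularization.
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q


def rmse(R, mask, P, Q):
    """Root-mean-square error over the observed entries only."""
    diff = (R - P @ Q.T)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Hypothetical ratings (rows = users, columns = items); 0 marks a missing rating.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)
mask = R > 0
P, Q = factorize(R, mask)
predicted = P @ Q.T   # includes estimates for the missing entries
```

Because users (rows of `P`) and items (rows of `Q`) live in the same space, a predicted rating is just their dot product, and every observed rating influences every factor.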
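Singular Value Decomposition
• Decompose the ratings matrix R (written D in the objective) into a coefficients matrix $U_K S_K$ and a factors matrix $V_K$ such that

$$J = \left\| D - U_K S_K V_K^T \right\|_F^2 = \sum_{i=1}^{N} \sum_{j=1}^{M} \left( D_{ij} - \left[ U_K S_K V_K^T \right]_{ij} \right)^2$$

  is minimized.
• U = eigenvectors of $R R^T$ (N×N)
• V = eigenvectors of $R^T R$ (M×M)
• S = $\mathrm{diag}(\sigma_1, \ldots, \sigma_M)$, where the singular values $\sigma_i$ are the square roots of the eigenvalues of $R^T R$ (which shares its nonzero eigenvalues with $R R^T$)
• Keeping the first k factors gives an N×k coefficients matrix times a k×M factors matrix:

$$R = \begin{pmatrix} r_{11} & \cdots & r_{1M} \\ \vdots & & \vdots \\ r_{N1} & \cdots & r_{NM} \end{pmatrix} \approx \begin{pmatrix} w_{11} & \cdots & w_{1k} \\ \vdots & & \vdots \\ w_{N1} & \cdots & w_{Nk} \end{pmatrix} \begin{pmatrix} v_{11} & \cdots & v_{1M} \\ \vdots & & \vdots \\ v_{k1} & \cdots & v_{kM} \end{pmatrix}$$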
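For a fully observed matrix, the rank-k minimizer of J can be read off a standard SVD. The sketch below uses NumPy's `np.linalg.svd`; the small ratings matrix and the helper name `truncated_svd` are assumptions for illustration.

```python
import numpy as np


def truncated_svd(R, k):
    """Return the best rank-k approximation of R in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    coefficients = U[:, :k] * s[:k]      # N x k, plays the role of U_K S_K
    factors = Vt[:k, :]                  # k x M, plays the role of V_K^T
    return coefficients @ factors, s


# Hypothetical fully observed ratings matrix (N = 4 users, M = 3 items).
R = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 4.0],
])
R2, singular_values = truncated_svd(R, k=2)
J = np.sum((R - R2) ** 2)  # the squared-error objective J
```

The remaining error J equals the sum of the squared singular values that were discarded, which is why truncating the smallest ones gives the best low-rank fit.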
Challenges of Collaborative Filtering
• User cold-start problem: not enough is known about a new user to decide who is similar (and perhaps there are no other users yet)
• Sparsity: when recommending from a large item set, users will have rated only some of the items, which makes it hard to find similar users
• Scalability: with millions of users and items, computations become slow
• Item cold-start problem: ratings cannot be predicted for a new item until some similar users have rated it (not a problem for content-based approaches)
Related Approaches
Srebro & Jaakkola (2003)
Weighted SVD

$$J = \sum_{i=1}^{N} \sum_{j=1}^{M} W_{ij} \left( D_{ij} - \left[ U_K S_K V_K^T \right]_{ij} \right)^2$$

• Binary weights
  ▪ $W_{ij} = 1$ means the element is observed
  ▪ $W_{ij} = 0$ means the element is missing
• Positive weights
  ▪ Weights are inversely proportional to the noise variance
  ▪ Allow for sampling density, e.g. elements are actually sample averages from counties or districts
Related Approaches
SVD with Missing Values
• Uses expectation maximization (EM) to compute the approximating matrix
  ▪ The E step fills in the missing values of the rating matrix with the current low-rank approximation
  ▪ The M step computes the best low-rank approximation of the filled-in matrix in the Frobenius norm
• Local minima exist for weighted SVD
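A minimal sketch of the E step / M step loop for the binary-weight case, assuming a boolean mask marks the observed entries. The initial fill with the observed mean, the fixed iteration count, and the toy rank-1 matrix are assumptions for illustration; they are not prescribed by the slides.

```python
import numpy as np


def em_svd(D, observed, k, n_iters=200):
    """Low-rank fit with missing entries via alternating E and M steps."""
    # Start by filling the missing entries with the mean of the observed ones.
    X = np.where(observed, D, D[observed].mean())
    for _ in range(n_iters):
        # M step: best rank-k approximation of the filled-in matrix.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low_rank = (U[:, :k] * s[:k]) @ Vt[:k, :]
        # E step: keep observed entries, refill missing ones from the fit.
        X = np.where(observed, D, low_rank)
    return low_rank


# Hypothetical rank-1 matrix with two entries hidden, made up for illustration.
truth = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
observed = np.ones_like(truth, dtype=bool)
observed[0, 2] = observed[3, 0] = False
D = np.where(observed, truth, 0.0)   # the 0.0 placeholders are never used
fit = em_svd(D, observed, k=1)
```

On this easy example the hidden entries are recovered closely. On harder problems the loop can stop in a local minimum, so the starting fill matters.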
Related Approaches
Agarwal (2009)
Regression-Based Latent Factor Models
• Presents a regression-based factor model that regularizes and handles both cold-start and warm-start in a single framework.
• It uses other users’ ratings together with item and user features to predict the missing ratings.
Model Based Approaches
• User data is compressed into a predictive model
• Instead of using ratings directly, develop a model of user ratings
• Use the model to predict ratings for new items
• To build the model:
  ▪ Bayesian network (probabilistic)
  ▪ Clustering (classification)
  ▪ Rule-based approaches (e.g., association rules between co-purchased items)
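To make the rule-based option concrete, here is a small sketch that mines pairwise association rules from purchase baskets using support and confidence. The baskets, the thresholds, and the function name `pair_rules` are assumptions for illustration; production systems typically use a full frequent-itemset algorithm instead.

```python
from itertools import combinations
from collections import Counter


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return rules (a, b, support, confidence) meaning 'a bought => b bought'."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in set(basket))
    pair_counts = Counter(
        pair for basket in baskets
        for pair in combinations(sorted(set(basket)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in both directions.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical purchase baskets, made up for illustration.
baskets = [
    ["camera", "tripod", "sd_card"],
    ["camera", "sd_card"],
    ["camera", "tripod"],
    ["vcr"],
    ["sd_card"],
]
rules = pair_rules(baskets)
```

Once mined, the rules form the compressed model: recommending for a new basket only requires looking up rules whose left-hand side is in the basket, not rescanning all user data.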

Related Approaches
Stern (2009)
Large Scale Online Bayesian Recommender
• Integrates collaborative filtering with content information.
• Users and items are compared in the same space.
• Flexible feedback model.
• Bayesian probabilistic approach.
Value of the Recommendation
Many considerations are taken into account when building the list of recommendations:
• The likelihood of a recommendation being accepted by the user
• The immediate value to the site
• The long-term implications of the recommendations for the user’s future choices
Example: suggest a video camera that the user accepts with probability 0.5, or a VCR that the user accepts with probability 0.6
• In the short term, recommending the video camera is less profitable than recommending the VCR
• In the long term it might be more profitable (the camera has accessories that are likely to be purchased, whereas the VCR does not)
Sequential Nature of Recommendation Process
• The recommender system suggests items to the user
• The user can accept one of the items offered, or none of them
• A new list of items is calculated based on the user’s past ratings
Markov Decision Process (MDP)
• An MDP is a model for stochastic decision problems
  ▪ An MDP is a four-tuple (S, A, Rwd, tr), where S is a set of states, A is a set of actions, Rwd is the reward associated with each state/action pair, and tr is the transition function giving the probability of each next state for a given state and action.
• The goal is to behave so as to maximize the total reward
• The optimal solution π is a policy specifying which action to perform in each state.
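• The value function $V^{\pi}$ of a policy π is defined as:

$$V^{\pi}(s) = Rwd(s, \pi(s)) + \gamma \sum_{s' \in S} tr(s, \pi(s), s') \, V^{\pi}(s')$$

  where γ is a discount factor
• The optimal value function V* is defined as:

$$V^{*}(s) = \max_{a \in A} \left[ Rwd(s, a) + \gamma \sum_{s' \in S} tr(s, a, s') \, V^{*}(s') \right]$$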
• To find the optimal policy π* and its corresponding value function V*:
  ▪ We search the space of possible policies, starting with an initial policy π0(s)
  ▪ At each step we compute the value function of the current policy and then update the policy based on the new value function
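The evaluate-then-improve loop just described is commonly known as policy iteration. Below is a minimal sketch on a tiny MDP loosely modelled on the camera/VCR example. The two states, the transition probabilities, and the reward values are hypothetical assumptions chosen for illustration; only the acceptance probabilities 0.5 and 0.6 echo the slides.

```python
import numpy as np


def policy_iteration(tr, rwd, gamma=0.9):
    """tr[a, s, s2] = P(s2 | s, a); rwd[s, a] = immediate reward."""
    n_actions, n_states, _ = tr.shape
    policy = np.zeros(n_states, dtype=int)          # initial policy pi_0
    while True:
        # Policy evaluation: solve V = R_pi + gamma * T_pi V exactly.
        T_pi = tr[policy, np.arange(n_states)]
        R_pi = rwd[np.arange(n_states), policy]
        V = np.linalg.solve(np.eye(n_states) - gamma * T_pi, R_pi)
        # Policy improvement: act greedily with respect to V.
        Q = rwd + gamma * np.einsum("ast,t->sa", tr, V)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy


# Hypothetical two-state example: state 0 = no camera yet, state 1 = owns a camera.
# Action 0 = recommend the VCR, action 1 = recommend the camera line.
tr = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # action 0: the state never changes
    [[0.5, 0.5], [0.0, 1.0]],   # action 1: camera accepted with probability 0.5
])
rwd = np.array([
    [0.6, 0.5],   # state 0: VCR pays more right away
    [0.6, 1.0],   # state 1: camera accessories pay more
])
policy, V = policy_iteration(tr, rwd)
```

With these numbers the loop settles on recommending the camera line in both states: the lower immediate reward in state 0 is outweighed by the discounted accessory revenue that follows, which is the long-term effect the example describes.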
Temporal Dynamics in the Recommendations
• Item-side effects:
  ▪ Product perception and popularity are constantly changing
  ▪ Seasonal patterns influence items’ popularity
• User-side effects:
  ▪ Customers continually redefine their tastes
  ▪ Transient, short-term bias; anchoring
  ▪ Drifting rating scale
  ▪ Change of rater within a household
Temporal dynamics - challenges
• Multiple sources: both items and users are changing over time
• Multiple targets: each user/item forms a unique time series
  ▪ Scarce data per target
• Inter-related targets: signal needs to be shared among users, which is the foundation of collaborative filtering
  ▪ Cannot isolate multiple problems
Time Sensitive Recommenders
Koren (2009)
Collaborative Filtering with Temporal Dynamics
• Uses factor models to separate different aspects of the ratings and observe changes in:
  1. Rating scale of individual users
  2. Popularity of individual items
  3. User preferences
Recommender Systems with Social Networks
• Use the user’s interactions with others to make recommendations
• Motivation:
  ▪ Social influence: users adopt the behavior of their friends
• Challenges:
  ▪ How do we define influence between users?
Recommender Systems with Social Networks
Preliminary Approaches
Jamali & Ester (2009)
TrustWalker: A Random Walk Model for Combining Trust-based and Item-based Recommendation
• Explores the trust network to find raters.
• Aggregates the ratings from the raters for prediction.
• Uses different weights for different users.
Open Challenges
• Transparency
  ▪ Convince a user to accept a recommendation
  ▪ Help a user make a good decision
  ▪ Help a user fit a goal or mood
• Exploration versus Exploitation
  ▪ Cold start problems (for new items and for new users)
  ▪ Choosing what questions to ask users
  ▪ Trade-off between optimizing for this user vs. for all users
  ▪ How can meta-data on user or item help?
• Guided Navigation
  ▪ Providing a guide over a vast body of content
  ▪ User's intent detection
• Time Value
  ▪ Does the value of user input decay with time?
  ▪ Do items change in relevance with time?
  ▪ How to adjust for recent user experience?
• Evaluation of recommender performance
• Scalability
• Combining content and collaborative recommenders efficiently
