### slides

```
Collaborative Data Analysis and Multi-Agent Systems
Robert W. Thomas
CSCE 824
15 APR 2013
Agenda
• Problem Description
• Existing Research Overview
• Limitation of Existing Results
• Future Research Suggestions
Problem Description
• Divide and Conquer; Reconcile
• Recommender Systems and Social Media
– Content Filtering
– Collaborative Filtering
– Collaborative Data Analysis through Agents
Content Filtering
• Recommendations based on items similar to
what has been preferred previously
Collaborative Filtering (CF)
• Recommendations based on what others in a
network prefer
• Different Techniques
– Memory-Based
– Model-Based
– Hybrid
Memory-Based CF
• Similarity Computation
• Prediction and Recommendation Computation
• Top-N Recommendations
Similarity Computation
• Compares Users or Items
• Correlation-Based (Pearson correlation)
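• Users: w_{u,v} = \frac{\sum_{i \in I} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^2} \sqrt{\sum_{i \in I} (r_{v,i} - \bar{r}_v)^2}}
• Items: w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2} \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}
• Vector Cosine-Based
• w_{i,j} = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\|_2 * \|\vec{j}\|_2}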
Two users: u, v
Two items: i, j
i \in I = items both u and v have rated
\bar{r}_u = avg rating of the co-rated items of the u-th user
u \in U = users who rated both i and j
\bar{r}_i = avg rating of the i-th item by those users
R = m x n user-item matrix
\vec{i}, \vec{j} are m-dimensional vectors corresponding to the i-th and j-th columns of R
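Prediction and Recommendation Computation
• Weighted Sum of Others’ Ratings
– P_{a,i} = \bar{r}_a + \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u) \cdot w_{a,u}}{\sum_{u \in U} |w_{a,u}|}
P_{a,i} = prediction for active user a on item i
\bar{r}_u = avg rating of user u (\bar{r}_a likewise for user a)
w_{a,u} = weight between user a and user u
u \in U = users who rated item i
• Simple Weighted Average
– P_{u,i} = \frac{\sum_{n \in N} r_{u,n} w_{i,n}}{\sum_{n \in N} |w_{i,n}|}
P_{u,i} = prediction for user u on item i
n \in N = all other rated items for user u
w_{i,n} = weight between items i and n
r_{u,n} = rating for user u on item n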
Top-N Recommendations
• Item-Based
• User-Based
Model-Based CF
• Bayesian Belief Net
• Clustering
• Regression-Based
• Markov Decision Process (MDP)-Based
• Latent Semantic
Bayesian Belief Net
• Bayesian logic – decision making and inferential statistics
• Simple Bayesian
– Memory-Based
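– class = \arg\max_{j \in classSet} P(class_j) \prod_o P(X_o = x_o | class_j)
– Laplace Estimator to avoid a conditional probability of 0
– P(X_i = x_i | Y = y) = \frac{\#(X_i = x_i, Y = y) + 1}{\#(Y = y) + |X_i|}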
• Tree Augmented naïve Bayes and naïve Bayes optimized by
Extended Logistic Regression (ELR)
– Require extended training periods to produce results beyond
simple Bayesian and Pearson correlation
Clustering
• Cluster: collection of similar objects, dissimilar
to objects in other clusters
– Pearson correlation can be used
• Three Categories
– Partitioning
– Density-based
– Hierarchical
• Often an Intermediate Step
Regression-Based
• Use approximation of ratings to make
predictions against a regression model
• Apply to situations where rating vectors have
large Euclidean distances but very high
Similarity Computation scores
MDP-Based
• Sequential Optimization Problem
• <S,A,R,Pr>
– S = {states}
– A = {actions}
– R = {rewards} for r(s,a,s’)
– Pr = {transition probabilities} for pr(s,a,s’)
• Partially Observable MDP (POMDP)
Latent Semantic
• Uses statistical modeling to discover
additional communities or profiles
Network Trust
• Opinions of different contacts are valued more
than others under certain conditions
• Accounting for this can increase CF accuracy
• Semantic Knowledge
• Social Tie-Strength
Hybrid CF
• CF + Content-Based
• CF + CF
• CF + CF and/or Content-Based
Limitations of Existing Solutions
• Time / Accuracy Trade-Offs
• Noisy Data
• Data Sparsity (New User)
• Scalability
• Synonymy
• Gray Sheep
• Shilling Attacks
• Privacy
Future Research Suggestions
• Hybrids
• Semantics
• Trust
• Parallel Processing
– Multi-Agent Systems
BACKUP
References
• Su, Xiaoyuan, and Taghi M. Khoshgoftaar. "A survey of
collaborative filtering techniques." Advances in
Artificial Intelligence 2009 (2009): 4.
• Chen, Wei, and Simon Fong. "Social network
collaborative filtering framework and online trust
factors: a case study on Facebook." Digital Information
Management (ICDIM), 2010 Fifth International
Conference on. IEEE, 2010.
• O'Donovan, John, and Barry Smyth. "Trust in
recommender systems." Proceedings of the 10th
international conference on Intelligent user interfaces.
ACM, 2005.
```
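
The memory-based CF slides (Pearson similarity between users, then the weighted sum of others' ratings) can be illustrated with a minimal sketch. The `ratings` dictionary, the user and item names, and the function names are all hypothetical and not from the slides. As on the slides, the similarity step averages only co-rated items, while the prediction step uses each user's overall average.

```python
from math import sqrt

# Hypothetical toy data (an assumption for illustration): user -> {item: rating}
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 3, "d": 5},
    "carol": {"a": 1, "b": 5, "d": 2},
}

def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

def pearson(ratings, u, v):
    """Pearson correlation between users u and v over their co-rated items."""
    common = ratings[u].keys() & ratings[v].keys()
    if len(common) < 2:
        return 0.0  # assumption: treat too little overlap as no correlation
    mean_u = mean(ratings[u][i] for i in common)
    mean_v = mean(ratings[v][i] for i in common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common)) * \
          sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, a, item):
    """Weighted-sum prediction for active user a on an item a has not rated."""
    mean_a = mean(ratings[a].values())
    num = den = 0.0
    for u, user_ratings in ratings.items():
        if u == a or item not in user_ratings:
            continue  # only other users who rated the item contribute
        w = pearson(ratings, a, u)
        num += (user_ratings[item] - mean(user_ratings.values())) * w
        den += abs(w)
    # assumption: fall back to the active user's average when no weights exist
    return mean_a + num / den if den else mean_a

print(pearson(ratings, "alice", "bob"))
print(predict(ratings, "alice", "d"))
```

Note that the weighted sum is not clipped to the rating scale, so a prediction can land slightly outside it; a real system would likely clamp the result.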
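
The item-based side of the same slides (vector cosine similarity between item columns of R, then the simple weighted average) might look like the sketch below. The matrix `R`, the convention that 0 means "unrated", and the function names are assumptions made for this example only.

```python
from math import sqrt

# Hypothetical user-item matrix R (rows = users, columns = items).
# Assumption: 0 marks an unrated item.
R = [
    [5, 3, 0],
    [4, 2, 4],
    [1, 5, 2],
]

def column(R, j):
    """Return the j-th column of R, i.e. every user's rating of item j."""
    return [row[j] for row in R]

def cosine(x, y):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def predict(R, u, i):
    """Simple weighted average of user u's other ratings, weighted by item similarity."""
    num = den = 0.0
    for n, r_un in enumerate(R[u]):
        if n == i or r_un == 0:
            continue  # skip the target item and items user u has not rated
        w = cosine(column(R, i), column(R, n))
        num += r_un * w
        den += abs(w)
    return num / den if den else 0.0

print(predict(R, 0, 2))
```

Because the weights here are non-negative, the prediction always falls between the lowest and highest rating the user has already given.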
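
The Simple Bayesian slide combines an arg-max over classes with a Laplace estimator so that an unseen feature value never produces a zero probability. The following is a rough sketch of that idea, not the method of any cited paper. The training rows, the "like"/"dislike" labels, and the function names are hypothetical, and |X_i| is taken here to be the number of distinct values observed for feature i.

```python
from collections import Counter, defaultdict

def train(rows):
    """Count classes and per-class feature values from (features, label) rows."""
    class_counts = Counter(label for _, label in rows)
    feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    feature_values = defaultdict(set)      # feature index -> distinct values seen
    for features, label in rows:
        for idx, value in enumerate(features):
            feature_counts[(idx, label)][value] += 1
            feature_values[idx].add(value)
    return class_counts, feature_counts, feature_values

def laplace(model, idx, value, label):
    """Laplace-smoothed estimate of P(X_idx = value | Y = label)."""
    class_counts, feature_counts, feature_values = model
    return (feature_counts[(idx, label)][value] + 1) / \
           (class_counts[label] + len(feature_values[idx]))

def classify(model, features):
    """Return the class maximizing P(class) * product of P(X_o = x_o | class)."""
    class_counts = model[0]
    total = sum(class_counts.values())

    def score(label):
        p = class_counts[label] / total
        for idx, value in enumerate(features):
            p *= laplace(model, idx, value, label)
        return p

    return max(class_counts, key=score)

# Hypothetical rows: two other users' ratings as features, active user's rating as label.
rows = [
    (("like", "like"), "like"),
    (("like", "dislike"), "like"),
    (("dislike", "dislike"), "dislike"),
    (("dislike", "like"), "dislike"),
]
model = train(rows)
print(classify(model, ("like", "dislike")))
```

In this toy data the first feature never takes the value "like" in the "dislike" class, yet the smoothed estimate is 1/4 rather than 0, which is the purpose of the estimator.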