Location-Aware News Feed - Department of Computer Science and

Report
University of Minnesota
GeoFeed: A Location-Aware
News Feed System
Jie Bao
Mohamed F. Mokbel
Chi-Yin Chow
Department of Computer Science and Engineering
University of Minnesota – Twin Cities
Department of Computer Science
City University of Hong Kong
Background
Social Networking
Services
(e.g., Facebook & Twitter)
Become one of the most popular Web services!!!
2
What is News Feeds?
■
News Feed function

■
Display a set of messages/news from friends / subscribed news agents
Examples:

Social networking system, i.e., Facebook, Twitter

News Aggregators, i.e., My Yahoo!, iGoogle
3
Motivation
■
Traditional News Feed
 Organized by either message issuing time, e.g., Twitter, or
some user requirements, e.g., Facebook
 Spatial relevance is overlooked, user gets the same news feed
from different log on locations
■
Motivating Scenarios
 Travelling user is more interested in the news/messages that
are close to her current location to explore the new place
 Stationary users may NOT be interested in the news/messages
that are issued very far from their locations
If the news feed functionality is aware of the inherent
locations of users and messages, more relevant news
feed will be delivered
4
“Locations” in Existing Social
Networking Systems
Google Latitude
Facebook Place
Twitter Nearby
“Real” Location-Aware News Feed
1. Social Relevance
Messages from friends/ subscribed news agents
2. Spatial Relevance
■
Message relevant
the user’s
locatioawareness
n
Unfortunately
notto“real”
location
currently
 Share only user’s current location, e.g., Google Latitude
 Use location information as a tag , e.g., Facebook Place
 View all the messages in a spatial range, e.g., Twitter Nearby
5
Location-Aware News Feeds
■
Location-Based Messages



Issuer: user/ news agent
Spatial extent: point/range
Location-Aware News Feeds

Recent k spatial relevant
messages from each of my
friends
Example:
Carol wants her news feed from friends (Alice and Bob)
Alice’s Messages
Message
Content
Spatial
Timestamp
M5
Raining
S5
14:30
M3
A nice bar
S3
14:10
M2
Eating at bar
S2
14:04
A location-based query is issued to retrieve the
most recent k=2 relevant messages from Alice
M2
M6 M
5
Carol
M3
M4
M1
Bob’s Messages
Message
Content
Spatial
Timestamp
M6
Local Sale
S6
15:30
M4
An accident
S4
14:21
M1
Work finished
S1
11:40
A location-based query is issued to retrieve the
most recent k=2 relevant messages from Bob
6
An Overview of GeoFeed
■
For a user U with N friends, GeoFeed abstracts location-aware
news feed to a set of N location-based queries, such that:


■
GeoFeed employs three approaches for each location-based query



■
The N location-based queries are fired upon U logging on to the system
Each location- based query is directed to one friend to retrieve the set of k relevant
messages
Spatial Pull approach
Spatial Push approach
Shared Push approach
GeoFeed employs a decision model that decides upon the best
approach to evaluate each query such that:


The system computational overhead is minimized
Each user U will get the required news feed in TU time units
7
GeoFeed Preliminary :
Problem Formulation
■ Given:




■
User location
User friend list
User response time requirement
User activity patterns, i.e., offline time and update frequency
Find:
 Best approach among spatial pull, spatial push, and shared
push approaches, to evaluate q once u logs on to the system
next time
■
Objective:
 Provide location-aware news feed for the user
 Guarantee a the response time that u will encounter to get all
the requested location-aware news feeds
 Minimize the computational overhead for all queries in the
system
8
The Spatial Pull Approach in GeoFeed
■
Spatial Pull approach
 Do nothing when the user offline
 Once the user logs on, compute al the queries for the user
Bob
Alice
1. location-based
query
3. Get cell
2. Alice’s
location
5. Relevant
messages
Spatial
Filter
Messages
Grid Index
4. Messages in the cell
 Advantage: No extra overhead during offline period
 Disadvantages: High user response time and not efficient for the
user with short offline time
9
The Spatial Push Approach in GeoFeed
■
Spatial Push approach
 Maintain a materialized view for the pre-computed messages
 Once the user logs on, the answer is ready
1. location-based
query
Alice
Other
Friends
2. Relevant
messages
Bob
3. Range
query
Materialized
view
New
message
Other
Materialized
views
4.Update
Grid Index
 Advantage: Users are very happy with very low response time
 Disadvantages: System is overwhelmed with maintaining large
number of views that may not be necessary
10
The Shard Push Approach in GeoFeed
■
Shared Push approach
 Share one view among queries for the nearby friends
 Once the user logs on, the answer is ready
Alice
1. location-based
query
Filter
Nearby
Friends
Bob
2. Relevant
messages
3. Range
query
Shared
materialized
view
4.Update
New
message
Grid Index
 Advantages: Users are still very happy with very low response
time, and system overhead could be significantly lower
 Disadvantages: Users need to be close enough, continuously
check if views can be shared
11
GeoFeed Cost Model
■
Spatial pull approach (based on per user-friend evaluation)

■
Spatial push approach (based on per user-friend evaluation)


■
Response time
 Evaluating the location query
Response time/Query processing cost
 Return messages from materialized view
System overhead
 Cost to update the materialized view with the user’s the offline
time and the friend’s update frequency
Shared push approach (based on per cell evaluation)


Response time
 Return messages from the shared view with filtering
System overhead
 Cost to update the shared view with the user’s update frequency
and friends’ minimum offline time
12
Challenges in Decision Model
■
Main Challenges:
 Guarantee a response time requirement for the user
 Do not overwhelm the system
 Consider the wide diversity of the user activity patterns in
social networking systems, e.g., offline times, update
frequencies
■
To favor user response time
 More spatial push approaches will be adapted
 System is overkilled to maintain a large number of materialized
views and continuous queries
■
To favor system overhead
 More spatial pull approaches may be adapted
 Users suffer significant delays to get their news feeds
13
Which is the Best Approach for a Query
 System-wide decision  Per-User decision
A
B
D
E
A
B
 Per-Query decision
(GeoFeed)
D
F
C
F
A
D
Users
Friends
OR
E
C
F
Users
Friends
D
B
E
C
F
Users
Friends
E
C
B
A
■
Consider the wide
diversity in user
activities in social
networking systems
e.g., offline times and
update frequencies
14
GeoFeed Decision Algorithm
■
Step 1. Response Time Guarantee
 For each user, this step uses our cost model to decide the
MAX number queries (N) to be evaluated by the spatial pull
approach
■
Step 2. Spatial Pull & Push Selection
 For each user, this step selects N queries to be evaluated by
the spatial pull approach based on our cost model
■
Step 3. Shared Push Refinement
 For each user, this step attempts to share the execution of
his/her friends’ queries that are selected to be evaluated by
the spatial push approach.
15
Experiments (1/4)
■
Data Sets
 Get final 646,697 tweets issued in State of Minnesota
 Use location information in tweets
 Coordinate locations
 Semantic location, e.g., a city name (use Google Geocoder)
■
Experimental Settings
 Based on a Postgresql database
 Based on the statistics from Facebook
 A set of evaluation experiments to get the parameters to build
the cost model and decision algorithm
16
Experiments (2/4)
■
Inside GeoFeed Decision model
(a) Offline time = 1 hour
■
(b) Offline time = 8 hours
Insights:




With the increase of Tu, more spatial pull approaches are selected.
When Tu=0 no spatial pull approaches are applied
When Tu=∞, GeoFeed aims to only minimize the system overhead through
employing much of the spatial pull approach.
Comparing two figures shows that with a smaller offline time, more spatial
push approaches are applied.
17
Experiments (3/4)
■
Compare with traditional approaches
■
Insights:
 Pure spatial pull has bad response time
 Pure spatial push had bad system overhead
18
Experiments (4/4)
■
System overall overhead
Insight:
 GeoFeed with shared push refinement has the similar response
time but saves significant in system overhead
19
System Prototype
■
Sindbad: A Location-Aware Social Networking
System (SIGMOD 2012 demo)
20
Conclusion
■
Location-Aware News Feeds
 Social relevance, i.e., a user’s friends/subscribed news
agents
 Spatial relevance, i.e., messages overlap user’s location
■
GeoFeed is an efficient system equipped with a smart
decision algorithm, which chooses the best approach
among spatial pull, spatial push and shared push to
evaluate location-aware news feed:
 Guarantee the user’s required response time
 Minimize the system overhead
21
Thanks
22

similar documents