Task Planning and Incentives in Ubiquitous Crowdsourcing

Uichin Lee
Recruitment Framework for
Participatory Sensing Data Collections
Uichin Lee
Participatory Sensing
• Allowing people to investigate processes with
mobile phones
• Community based data collection and citizen
science; offering automation, scalability, and
real-time processing and feedback
• Examples: taking photos of assets that
document recycling behavior, flora variety,
and green resources in a university
Participatory Sensing: Challenges
• Diverse users and participatory sensing
• How to identify suitable participants for projects?
• Goal: devise a new recruitment framework
using availability and reputation
– Spatio-temporal availability based on mobility and
transport mode
– Reputation of data collection performance
Sustainability Campaigns
• GarbageWatch: The campus needs to divert 75% of its waste
stream from landfills, and effective recycling can help reach this
goal. By analyzing photos, one can determine if recyclables (paper,
plastic, glass, or aluminum) are being disposed of in waste bins, and
then identify regions and time periods with low recycling rates.
• What's Bloomin: Water conservation is a high priority issue for the
campus and efficient landscaping can help. By collecting (geotagged) photos of blooming flora, facilities could later replace high
water usage plants with ones that are drought tolerant.
• AssetLog: For sustainable practices to thrive on a campus, the
existence and locations of up-to-date “green” resources need to be
documented (e.g., bicycle racks, recycle bins, and charge stations).
System Overview
Recruitment Framework
• Qualifier: minimum requirements
– Availability: destinations and routes within space, time, and
mode of transport constraints
– Reputation: sampling likelihood, quality, and validity over
several campaigns or by campaign-specific calibration exercises
• Assessment: participant selection
– Identify a subset of individuals who could maximize coverage
over a campaign area and time period while adhering to the
required mode of transport
– Cost may be considered when selecting participants
• Progress review: checking “consistency”
– Review coverage and data collection performance periodically
– If participants' coverage or performance falls below a certain
threshold, provide feedback or recruit more participants
Related Work
• Mobility models
– Location summarization for personal analytics: from location
traces to places (e.g., spatio-temporal clustering, density-based
clustering, reverse geo-coding)
– Location prediction to adapt applications: mostly for location-based services (LBS); prediction methods include Markov
models, time-series analysis, etc.
• Reputation systems:
– Summation and average (e.g., Amazon review)
– Bayesian systems (e.g., Beta reputation system)
• Selection services:
– Online labor markets: M-Turk, GURU.com
– Sensor systems: traditional sensor networks focused on
coverage (or sensing in a predefined zone)
Coverage Based Recruitment
• Mobility traces (e.g., sampled every 30 seconds)
• Density-based clustering to find “destinations” (or places)
• Routes are the points between destinations
• Mode of transport is inferred (e.g., still, walking, running,
biking, or driving)
• Qualifier filters: e.g., selecting individuals with at least 5
destinations in a certain area in a week or individuals with
at least 7 unique walking routes during day time weekday
• Assessment:
– Given (1) a set of participants with associated costs and
spatial blocks w/ mode of transport over time, and (2) blocks
with certain utilities
– Maximize the utility under budget constraints (NP-hard)
• Greedy algorithm is known to achieve at least 63% of the optimal utility
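The assessment step above can be sketched as a cost-aware greedy selection. This is a minimal illustration, not the authors' implementation: the function name `greedy_recruit`, the data layout (a block can be any hashable key, such as a (spatial block, time slot, mode) tuple), and the "utility gain per unit cost" rule are all assumptions, and costs are assumed to be positive. This plain version does not by itself establish the approximation bound.

```python
def greedy_recruit(candidates, block_utility, budget):
    """Pick participants by best marginal utility per unit cost within a budget.

    candidates:    dict name -> (cost, set of blocks the participant covers)
    block_utility: dict block -> utility gained once the block is covered
    budget:        total cost that may be spent
    All names and the data layout are hypothetical, for illustration only.
    """
    selected, covered, spent = [], set(), 0.0
    remaining = dict(candidates)
    while remaining:
        best_name, best_ratio = None, 0.0
        for name, (cost, blocks) in remaining.items():
            if spent + cost > budget:
                continue  # this participant no longer fits in the budget
            # Only blocks not yet covered by earlier picks add utility.
            gain = sum(block_utility[b] for b in blocks - covered)
            ratio = gain / cost
            if ratio > best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            break  # nobody affordable adds any new utility
        cost, blocks = remaining.pop(best_name)
        selected.append(best_name)
        covered |= blocks
        spent += cost
    total_utility = sum(block_utility[b] for b in covered)
    return selected, total_utility, spent


candidates = {
    "ann": (2.0, {"b1", "b2", "b3"}),
    "bob": (1.0, {"b1", "b2"}),
    "cat": (1.0, {"b4"}),
}
block_utility = {"b1": 1, "b2": 1, "b3": 1, "b4": 1}
result = greedy_recruit(candidates, block_utility, budget=3.0)
```

In this toy input the greedy rule picks bob and then cat (utility 3, cost 2), while ann plus cat would reach utility 4 within the same budget, which shows why greedy gives an approximation rather than the optimum.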
Coverage Based Recruitment
• Reviewing M*N spatio-temporal association matrix
– M rows: spatial blocks (100m*100m)
– N columns: distinct time slots in a day (accumulated over a week)
– Entry: the proportion of time spent in a spatial block (that
satisfies mode of transport and monitoring period)
• Comparing two consecutive weeks (to check deviation)
– Singular Value Decomposition (SVD): U*Σ*V^T
• U: patterns common across different time periods (days)
• Σ: singular values (σ1…σrank) show variance represented by each
Participation and Performance Based Recruitment
• Cross-campaign vs. campaign-specific
• Focus on campaign-specific indicators
– Timeliness (latency)
– Relevancy (whether a sample falls within the phenomenon of interest)
– Participation likelihood: whether an individual took a
sample when given the opportunity
• Beta reputation model: a Beta distribution parameterized by
α (# of successes) and β (# of failures)
– Expected reputation: E = α/(α+β)
– Exponential averaging over time (w/ some aging factor w)
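The Beta reputation update with aging can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `BetaReputation` is hypothetical, the counts start from a uniform Beta(1, 1) prior so that the expectation is defined before any observation, and aging is applied by multiplying the old counts by the factor w before adding the new ones. The slides do not specify these details.

```python
class BetaReputation:
    """Beta reputation with exponential aging (illustrative sketch)."""

    def __init__(self, aging=1.0):
        # Assumption: start from a uniform Beta(1, 1) prior.
        self.alpha = 1.0
        self.beta = 1.0
        # aging = 1.0 means no forgetting; smaller values discount old evidence.
        self.aging = aging

    def update(self, successes, failures):
        # Discount the old evidence by the aging factor, then add the new counts.
        self.alpha = self.aging * self.alpha + successes
        self.beta = self.aging * self.beta + failures

    def expected(self):
        # Mean of Beta(alpha, beta): E = alpha / (alpha + beta).
        return self.alpha / (self.alpha + self.beta)


with_aging = BetaReputation(aging=0.5)
without_aging = BetaReputation(aging=1.0)
for rep in (with_aging, without_aging):
    rep.update(successes=8, failures=2)   # a good first week
    rep.update(successes=1, failures=9)   # a poor second week
```

With aging, the poor second week weighs more heavily, so the expected reputation ends lower than it does without aging.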
• Campaign deployment information:
• Ground-truth: experts traversed the routes
Coverage Based Recruitment
• Evaluated assessment methods:
– Random: select users from campaigns arbitrarily
– Naïve: select users who cover the most blocks overall
without considering coverage of existing participants
– Greedy: select users who maximize utility by
considering coverage of existing participants
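The difference between the naïve and greedy methods can be sketched as below. This is a simplified illustration: every block is assumed to have unit utility, costs are ignored, a fixed number k of users is selected, and the function names are hypothetical.

```python
def covered_blocks(selected, user_blocks):
    """Union of the blocks covered by the selected users."""
    covered = set()
    for user in selected:
        covered |= user_blocks[user]
    return covered


def naive_select(user_blocks, k):
    """Rank users by their own block count, ignoring overlap with others."""
    ranked = sorted(user_blocks, key=lambda u: len(user_blocks[u]), reverse=True)
    return ranked[:k]


def greedy_select(user_blocks, k):
    """Repeatedly add the user who covers the most not-yet-covered blocks."""
    selected, covered = [], set()
    for _ in range(k):
        unselected = (u for u in user_blocks if u not in selected)
        best = max(unselected, key=lambda u: len(user_blocks[u] - covered),
                   default=None)
        if best is None:
            break  # fewer than k users available
        selected.append(best)
        covered |= user_blocks[best]
    return selected


user_blocks = {"u1": {1, 2, 3, 4}, "u2": {1, 2, 3}, "u3": {5, 6}}
naive = naive_select(user_blocks, k=2)
greedy = greedy_select(user_blocks, k=2)
```

Here the naïve method picks u1 and u2, whose blocks overlap heavily (4 blocks covered), while the greedy method picks u1 and u3 (6 blocks covered).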
Coverage Based Recruitment
• Consistency check for campaign coverage
(progress review)
(changed mode of transport:
from walking to driving)
Participation and Performance Based Recruitment
• Evaluated participation likelihood
– Other metrics (e.g., timeliness, relevancy, quality) were not considered
due to the nature of the projects (i.e., automatic uploading)
• A user’s reputation after “AssetLog calibration exercise”
Participation and Performance Based Recruitment
• Re-evaluating reputation over two weeks
– With (c) and without (d) exponential aging
• Greedy vs. naïve: the more users' coverage overlaps,
the larger the difference between the two methods
• Cross-campaign consideration: performance may differ
across campaigns due to individual preferences
• Participants grew tired of collecting samples
• Participants reported that the act of data capture
should be streamlined so that it can be repeated
• Participants wanted visualization (e.g., map)
• Participants were generally OK with “minor” deviation
from their routes, but drastic change may require some
Dynamic Pricing Incentive for
Participatory Sensing
Juong-Sik Lee and Baik Hoh
Nokia Research
Pervasive and Mobile Computing 2010
• Dynamic return on investment (ROI) of
participatory sensing applications (different data
types, users' context, etc.)
• Fixed price incentives may not work well; further,
it’s hard to come up with an optimal price
• Reverse auction: users bid to sell their data,
and the buyer selects a predefined number of
users with the lowest bid prices
– Selling price dynamically changes
Reverse Auction
• A user’s utility: U(b) = (b-t)*p(b)
– b: received credit
– t: base value of the data (as perceived by the user)
– p(b): winning probability
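The utility formula U(b) = (b-t)*p(b) can be explored numerically. The sketch below is illustrative only: the slides do not give a form for p(b), so the linearly decreasing `winning_probability` and its `max_bid` parameter are assumptions, as are the function names.

```python
def winning_probability(bid, max_bid=10.0):
    # Assumption: the chance of winning falls linearly as the bid rises,
    # reaching zero at max_bid. The real p(b) depends on the other bidders.
    return max(0.0, 1.0 - bid / max_bid)


def utility(bid, base_value, p=winning_probability):
    # U(b) = (b - t) * p(b): profit over the base value, weighted by
    # the probability of actually winning the auction.
    return (bid - base_value) * p(bid)


def best_bid(base_value, candidate_bids, p=winning_probability):
    # Pick the candidate bid with the highest expected utility.
    return max(candidate_bids, key=lambda b: utility(b, base_value, p))


choice = best_bid(base_value=2.0, candidate_bids=range(0, 11))
```

Under this assumed p(b), a user with base value 2 does best by bidding 6: a higher bid earns more per win but wins too rarely, and a lower bid wins more often but earns less.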
Problems with Reverse Auction
• Users who keep losing may drop out of the system
• Incentive cost explosion happens when a system has
fewer users than a threshold number (here, m)
– Those users can increase their bid as much as possible
• Solution: for each loss, buyer gives virtual participation
credit (of fixed amount α); credit cumulates over time
– Seller can use the credit to lower its bid (thus, increasing
winning probability)
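One auction round with virtual participation credit can be sketched as follows. The function name `run_round` is hypothetical, and two details are assumptions not stated on the slides: winners are paid their own bid price, and a winner's accumulated credit is reset to zero.

```python
def run_round(bids, credits, m, alpha):
    """Run one reverse-auction round with virtual participation credit.

    bids:    dict user -> bid price for this round
    credits: dict user -> accumulated virtual credit (updated in place)
    m:       number of winners the buyer selects
    alpha:   fixed credit granted to each loser
    """
    # Rank by bid minus credit; the user name breaks ties deterministically.
    ranked = sorted(bids, key=lambda u: (bids[u] - credits.get(u, 0.0), u))
    winners = ranked[:m]
    for user in bids:
        if user in winners:
            credits[user] = 0.0  # assumption: credit is consumed on a win
        else:
            credits[user] = credits.get(user, 0.0) + alpha
    # Assumption: each winner is paid the bid it actually submitted.
    payments = {user: bids[user] for user in winners}
    return winners, payments


bids = {"a": 3.0, "b": 4.0, "c": 5.0}
credits = {}
first = run_round(bids, credits, m=1, alpha=1.5)
second = run_round(bids, credits, m=1, alpha=1.5)
```

With unchanged bids, user a wins the first round, but the credit earned by losing lets user b win the second round, so higher-valuation users are not shut out permanently.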
(Figure: incentive cost explosion, and giving “virtual credit” α to losers)
Credit-based Incentives
• Random Selection based Fixed Pricing (RSFP):
(+) Simple to implement
(+) Easy to predict total incentive cost
(-) Difficult to decide the optimal incentive price
(-) Unable to adapt to dynamic environments
• Reverse auction dynamic pricing w/ virtual credit (RADP-VPC):
(+) Eliminate complexity of incentive price decision
(+) Able to adapt to dynamic environments
(+) Minimize incentive cost
(+) Better fairness of incentive distribution
(+) Higher social welfare
(-) Relatively harder to implement than RSFP
Evaluation Model
• ROI up to round r = [earnings so far] / ([# of
participations up to round r] × [min reward])
• If ROI(r) drops below 0.5, a user drops out of the system
• A user’s valuation is randomly generated based
on some distribution
• Evaluation items
– Incentive cost reduction
– Fairness against true valuation
– Service quality
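The ROI-based dropout rule in the evaluation model can be sketched as below. The sketch reads the denominator as the number of participations multiplied by the minimum reward, which is an interpretation of the slide's formula. The function names and the handling of a user with no participations yet are also assumptions.

```python
def roi(earnings, participations, min_reward):
    """ROI up to the current round: earnings / (participations * min_reward)."""
    if participations == 0:
        # Assumption: a user who has not participated yet has no reason to leave.
        return 1.0
    return earnings / (participations * min_reward)


def stays_in_system(earnings, participations, min_reward, threshold=0.5):
    # The user drops out once ROI falls below the threshold (0.5 on the slide).
    return roi(earnings, participations, min_reward) >= threshold


satisfied = stays_in_system(earnings=6.0, participations=4, min_reward=2.0)
frustrated = stays_in_system(earnings=3.0, participations=4, min_reward=2.0)
```

The first user has earned 6 against an expected 8 (ROI 0.75) and stays; the second has earned only 3 (ROI 0.375) and drops out.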
Incentive Cost Comparison
Incentive Cost Reduction
Reverse auction dynamic pricing
with virtual participation credit
Fairness Against True Valuation
Service Quality Guarantee
• Privacy leak: one has to send data with bid
– Data encryption prevents the buyer from validating
the quality (how about using homomorphic crypto?)
• Data broker in between seller and buyer
– Data collection, maintenance, processing/mining
• Handling different types of apps (e.g., real-time
vs. asynchronous)
• How to guarantee data integrity and to maintain
seller’s reputation?
