
DCIM: Distributed Cache Invalidation Method
Authors: Kassem Fawaz and Hassan Artail
Presented by: Alex Buchholz
CS 6204
April 25th, 2013
Presentation Overview
1. Paper overview
   – Main concepts/definitions
2. Related Works
3. DCIM Architecture and Operations
4. Experimental Results
DCIM Overview
Distributed Cache Invalidation Method
• Client-Based Cache Consistency Scheme
• Pull Based
• Implements adaptive TTL
• Provides “near strong consistency”
“this is the first complete client side approach employing
adaptive TTL and achieving superior availability, delay, and
traffic performance”
Cache Consistency
• Strong: cached items are identical to the server's copy
• Weak: client queries might get stale data items
• Delta: data may be stale for at most a bounded period of time (delta)
• Probabilistic: data is consistent with a probability p
• Probabilistic-delta: data item is at most delta units of time stale with a probability not less than p
Three Main Cache Mechanisms
1. Push: Server informs clients of changes
2. Pull: Client asks the server for changes
3. Hybrid: Both
Time to Live (TTL)
• Scheme: TTL stored along with value
– Data considered good until TTL has passed
• Popular because:
a) Simplicity
b) Good performance
c) Flexibility to assign TTL values
• Completely client based
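As a small illustration of the basic TTL scheme (not code from the paper; the class and field names are hypothetical):

    import time

    class TTLCache:
        """Stores each value together with the time it expires."""
        def __init__(self):
            self.entries = {}  # key -> (value, expiry_time)

        def put(self, key, value, ttl_seconds):
            # Data is considered good until the TTL has passed.
            self.entries[key] = (value, time.time() + ttl_seconds)

        def get(self, key):
            item = self.entries.get(key)
            if item is None:
                return None       # cache miss
            value, expiry = item
            if time.time() >= expiry:
                return None       # expired: must be revalidated
            return value          # fresh hit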
Piggybacking – Two Ways:
1. Cache piggybacks a list of invalidated documents when communicating with the server
2. Server piggybacks a list of updated documents when it communicates with the cache
Prefetching:
• Client pulls data before it is needed
  – based on that item's request rate
• Helps increase availability and reduce waiting time
Pull Based Approaches
1. Client Polling
2. Time to Live
Client Polling
• Cache validation initiated on client’s schedule
• To achieve strong consistency:
 Validate each item before being served
*** Low bandwidth, high latency ***
Two TTL Approaches
• Fixed: A set TTL value for all data items
• Adaptive: meets higher consistency requirements with lower traffic
Adaptive TTL Approaches
Five are discussed in the paper:
1. TTL = factor * time difference between the query time and the last update
2. TTL = factor * last update interval
3. TTL = time difference between the query time and the kth most recent update, divided by a factor; the server relays the k most recent update times to the cache
4. Consider the complete update history and predict future updates
5. TTL derived from a probability describing the staleness of the cache
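For illustration only, the first two rules could look like this in Python; the variable names and default factors are assumptions, not values from the paper:

    def ttl_from_last_update(query_time, last_update_time, factor=0.2):
        # Approach 1: TTL proportional to how long the item has gone unchanged.
        return factor * (query_time - last_update_time)

    def ttl_from_last_interval(last_update_time, previous_update_time, factor=1.0):
        # Approach 2: TTL proportional to the most recent update interval.
        return factor * (last_update_time - previous_update_time)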
COACS
• DCIM builds on top of COACS
  – Cooperative caching architecture
• Two types of nodes:
  – Caching Nodes (CNs): cache previously requested data items
  – Query Directories (QDs): index queries along with the addresses of the caching nodes
• Either can be a Requesting Node (RN)
  – Node that requests data items
COACS Operation
1. RN wants data – scans QDs sequentially from nearest to farthest for the same query
   – Far fewer QDs than CNs
2. When the same query is found at a QD, it forwards the request to the CN that caches the corresponding item
3. If all QDs are traversed and no query is found (cache miss), the last QD requests the item from the server
4. The RN then caches the returned item and becomes a CN for it (RN -> CN, sketched below)
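A rough Python sketch of this COACS request flow; the node objects and method names (distance_to, lookup, serve, fetch, cache) are hypothetical and only illustrate the control flow:

    def request_data(rn, query, query_directories, server):
        """RN-side view of a COACS request (illustrative only)."""
        # 1. Scan QDs sequentially, nearest to farthest, for the same query.
        for qd in sorted(query_directories, key=lambda qd: qd.distance_to(rn)):
            cn = qd.lookup(query)      # address of the CN caching this query
            if cn is not None:
                # 2. The QD forwards the request to that CN, which replies to the RN.
                return cn.serve(query, requester=rn)
        # 3. Cache miss: the last QD requests the item from the server.
        data = server.fetch(query)
        # 4. The RN caches the item and becomes a CN for it.
        rn.cache(query, data)
        return data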
Design Methodology
GOAL: improve efficiency of cache updating in
MANETs without requiring mobile devices to maintain
cache state information
• Pull-based: CNs monitor the TTLs of cached items
• Scalable: any node can become a CN if what it requested has not already been cached
• Adaptive TTL values: each CN estimates the interupdate interval of each item and uses it to set the TTL
Piggybacking in DCIM
• The CN polls the server often to learn the update times of the items it has cached
• It piggybacks refresh requests for cached items each time it contacts the server
• To avoid unnecessary packets, a two-phase approach is used (sketched below):
  1. After each polling interval (T_poll), CNs revalidate the data items whose TTL has expired (items with high update rates)
  2. After a configurable number of polling intervals (N_poll), CNs revalidate all cached items, even those whose TTL has not yet passed; this ensures that a cached item is at most one piggybacking interval stale
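A minimal sketch of this two-phase monitoring loop, assuming hypothetical values for T_POLL and N_POLL and a simple cache structure (none of the names below come from the paper):

    import time

    T_POLL = 10   # seconds between inner-loop runs (assumed value)
    N_POLL = 6    # inner-loop runs per outer-loop run (assumed value)

    def monitoring_loop(cache, send_curp):
        rounds = 0
        while True:
            rounds += 1
            if rounds % N_POLL == 0:
                # Outer phase: revalidate everything, so no item is more than
                # one piggybacking interval (N_POLL * T_POLL) stale.
                item_ids = list(cache.keys())
            else:
                # Inner phase: only items whose TTL has already expired.
                item_ids = [k for k, e in cache.items() if e.expiry <= time.time()]
            if item_ids:
                send_curp(item_ids)   # piggyback all ids in one CURP packet
            time.sleep(T_POLL)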
DCIM Basic System Overview
Two CNs send cache validation requests to the server (dotted arrows) via gateway nodes and through the access point.
The server replies with lists of valid and changed data items (short-dashed arrows) to the CNs,
which in turn update the corresponding QDs asynchronously about the items they cache (long-dashed arrows).
Basic Data Request
• Each CN needs insight into the update patterns of the items it caches:
  1. Learn the last update times of its data items from the server
  2. Use this info to predict the next update time
• HOWEVER, CNs are resource constrained, so they cannot keep full update histories
• DCIM instead estimates the average interupdate interval with an exponentially weighted moving average (sketched below):

  IUI(t) = α * LUI + (1 - α) * IUI(t-1)

  – IUI(t): estimated interupdate interval at time t
  – LUI: last update interval (time between the two most recent known updates)
  – α between 0.1 and 0.2 (0.125 in this paper)
• A CN only needs to store the estimated interval and the last update time
  – simplicity and ease of computation
  – minimum amount of data required
  – diminishing weights assigned to older data
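The estimate above is a standard exponentially weighted moving average (the same form TCP uses for RTT estimation, with α = 0.125); a small Python sketch with illustrative names:

    ALPHA = 0.125  # weight used in this paper (between 0.1 and 0.2)

    class ItemEstimate:
        def __init__(self):
            self.last_update_time = None   # last known update time at the server
            self.iui = None                # estimated interupdate interval

        def record_update(self, update_time):
            if self.last_update_time is not None:
                lui = update_time - self.last_update_time   # last update interval
                if self.iui is None:
                    self.iui = lui                          # first sample
                else:
                    # IUI(t) = alpha * LUI + (1 - alpha) * IUI(t-1)
                    self.iui = ALPHA * lui + (1 - ALPHA) * self.iui
            self.last_update_time = update_time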
Server Operations
• The server only reacts to received CURP messages -> it does not need to be aware of MANET dynamics
• DCIM could operate in an Internet environment using HTTP header fields
Message types:
CURP: Cache Update Request
SVRP: Server Validation Reply
SUDP: Server Update Data
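A hedged sketch of the server-side reaction to a CURP, assuming a simple in-memory table of items and their last update times; the data structures and function name are illustrative, not from the paper:

    def handle_curp(requested, store):
        """Split a CURP's item ids into an SVRP (still valid) and an SUDP (changed).

        requested: {item_id: last_update_time_known_to_the_CN}
        store:     {item_id: (current_data, last_update_time_at_server)}
        """
        svrp_ids = []      # items unchanged since the CN last saw them
        sudp_items = {}    # items that changed: send fresh data + update time
        for item_id, known_time in requested.items():
            data, server_time = store[item_id]
            if server_time <= known_time:
                svrp_ids.append(item_id)
            else:
                sudp_items[item_id] = (data, server_time)
        return svrp_ids, sudp_items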
QD Operations (COACS)
• QDs are elected based on resource capability
• Their number is bounded by two limits:
  1. Lower bound: having enough QDs such that an additional QD would not reduce the workload enough to be worth it
  2. Upper bound: corresponds to a delay threshold
• The number can change dynamically (7 <-> 100 in the simulations)
• A QD typically experiences about 1.5x the load of a CN
CN Processing
The process that runs on the CN includes two threads:
1. A monitoring thread
2. A processing thread
(Tables: elements of the General Cache Information table and of the Query-Specific Cache Information table)
CN Monitoring Thread
1. Checks for expired data items
2. Issues validation requests
3. Issues requests for data items
It performs these in two functions:
1. Inner loop function: runs every polling interval (T_poll)
2. Outer loop function: runs once every N_poll polling intervals
(Figure: flowcharts of the inner and outer loop functions, including the "N_poll passed?" check)
CN Processing Thread
1. Processes data requests (DRPs) from RNs
2. Processes SUDP and SVRP messages in
response to CURP messages
3. Also computes TTL values
Processing Data Requests
• The CN checks the state of the item requested in the DRP
• If INVALID -> it issues an update request directly to the server
  – Changes the state to TEMP_INVALID and places the query in a waiting list
• Otherwise, the query is processed by sending the requested item to the RN via a DREP (sketched below)
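A minimal sketch of this DRP handling, assuming hypothetical names for the cache entry fields, packet attributes, and send helpers:

    VALID, INVALID, TEMP_INVALID = "VALID", "INVALID", "TEMP_INVALID"

    def handle_drp(cn, drp):
        """Process a data request (DRP) from an RN, as described above."""
        entry = cn.cache[drp.item_id]
        if entry.state == INVALID:
            # Ask the server for an update and remember who is waiting.
            entry.state = TEMP_INVALID
            cn.waiting_list.setdefault(drp.item_id, []).append(drp.requester)
            cn.send_update_request(drp.item_id)      # goes to the server
        else:
            # Item is usable: reply to the RN directly with a DREP.
            cn.send_drep(drp.requester, entry.value)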
Processing SVRP and SUDP
• SUDP – if received, it must be for an item that has changed at the server
  – The CN calculates a new TTL. If the SUDP references items that have requests placed in the waiting list, those items are sent to the corresponding requesting nodes
• SVRP – sent from the server in response to a CURP packet
  – Items that were specified in the CURP packet but are not part of the SVRP were instead sent to the CN as updated data items in an SUDP message
SVRP: Server Validation Reply
SUDP: Server Update Data
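A companion sketch of how a CN might consume these replies. The TTL recomputation shown here (reusing the interupdate-interval estimate from the earlier sketch) is an assumption for illustration, not the paper's exact rule; all names are hypothetical:

    import time

    VALID = "VALID"

    def handle_svrp(cn, valid_ids):
        # Items the server confirmed as unchanged: extend their TTL using the
        # current interupdate-interval estimate (assumed rule, for illustration).
        for item_id in valid_ids:
            entry = cn.cache[item_id]
            entry.expiry = time.time() + entry.estimate.iui
            entry.state = VALID

    def handle_sudp(cn, updated_items):
        # Items that changed at the server: store the new data, refresh the
        # estimate, recompute the TTL, and answer queries parked in the waiting list.
        for item_id, (data, server_update_time) in updated_items.items():
            entry = cn.cache[item_id]
            entry.value = data
            entry.estimate.record_update(server_update_time)  # EWMA from earlier sketch
            entry.expiry = time.time() + entry.estimate.iui
            entry.state = VALID
            for rn in cn.waiting_list.pop(item_id, []):
                cn.send_drep(rn, entry.value)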
TTL Calculation
• Exact TTL calculation depends on
whether the item was expired at
the server or not
(Figure: average TTL versus the inverse of the update rate, in seconds)
The figure shows that at very low update rates (less than 1 update per 1,000 s), the estimated TTL does not adapt well. In practice, however, time extends beyond the 2,000 s considered in this simulation, meaning that more item updates would occur at the server over the longer interval.
Analysis Overview
• DCIM analyzed to assess:
– Bandwidth gain
– Query response time gain
• Compared to two consistency schemes:
– Poll-every-time (PET)
– Push-based stateful (PBS)
Analysis Defined
• Bandwidth gain: difference in traffic between DCIM and PET, PBS
• Query response time gain: difference between the times it takes to answer queries
• Data requests are modeled as homogeneous Poisson processes:
  – λR = rate of requests, λU = item update rate
Response Time Gain
• Equations (2) and (3) in the paper derive the response time gain of DCIM over PET and PBS
• Example values: HD = 5, HR = 5.21, HC = 5.21, Tin = 5 ms, Tout = 70 ms, λU = 1/500
Bandwidth Gain
• D: the disconnection interval
• N: the number of cached items
• E[Ndisc]: the CN disconnection rate
• Example values: λR = λU = 1/500, N = 4,000, M = 20, SR = 0.5 KB, SD = 10 KB, SU = 0.15 KB, q = 20, E[Ndisc] = 15, and D = 20 s
Analysis Results
• Response time gain mainly depends on update rate
– In DCIM the majority of requests are answered from within the MANET, which is why the response time difference is less than 10 ms
• The traffic resulting from large piggybacking intervals is lower
than that of small piggybacking intervals
• The traffic demands for DCIM decrease exponentially for small
polling intervals
• DCIM is less costly than PBS
*** The polling interval is the main parameter that affects performance ***
The optimal polling interval is the value that results in the lowest traffic consumption – equivalent to the highest bandwidth gain, since poll every time does not depend on the polling interval
Experimental Results Overview
• Implemented using ns2
• Two additional schemes were implemented for comparison:
  – Poll every time: validate each item each time it is requested
  – Fixed-TTL: each item has the same TTL value
• Three versions of DCIM were implemented:
  – Full DCIM
  – DCIM with piggybacking disabled ("prefetch" always set)
  – DCIM with only the update rate adaptation mechanism (items are validated only when they expire)
• Setup: 400x400 m2 area, 100 nodes, 2 Mbps node bitrate, 2 m/s node mobility, etc. (rest shown on the next slide)
Experimental Results Overview
(Table: remaining simulation parameters)
Experimental Results Overview
• The reported results are from 5 experiments that involve varying:
request rate
update rate
item popularity
maximum velocity
polling interval
data item size
• The results are:
Consistency ratio
Query delay
Cached data query delay
Uplink traffic
Downlink traffic
Average overhead traffic
Varying Request Rate
• Request interval varied between 5 s and 120 s
• Piggybacking -> appropriate TTL values -> high consistency ratio
• Prefetching -> high hit ratio -> lower delays
• Query delay gets smaller after the item is cached
  – the cached-data query delay increases by a small margin as requests become less frequent, due to less prefetching
• DCIM consumes more traffic on the server side due to the piggybacked validation and prefetch requests
• As for node traffic, by piggybacking large numbers of items, DCIM consumes more traffic than the other approaches
• However, as the request rate decreases, prefetching does not happen as often, which leads to lower traffic as shown in the graph; this is how DCIM adapts prefetching to the request rate of items
Varying Update Rate
• A TTL < 100 s is less than the interupdate intervals in all of the scenarios simulated -> provides the best consistency level
• Fixed-TTL approaches have higher hit rates than poll every time, but lower than DCIM
• Delay after caching for DCIM is lower than for poll every time and fixed-100, but it may exceed that of fixed-500, which keeps elements longer
• Traffic is not high at the server and very low in the network
  – less than 10 kbps, while the bandwidth is 2 Mbps
• Piggybacking of requests -> traffic increase
  – increases in frequency as update rates increase
• Without this traffic, though, the CNs would not be able to infer the update rate and calculate reliable TTL estimates
Varying Zipf (Popularity)
• Effectively varies the popularity of the data items
  – analogous to varying the items' request rates
• As the Zipf parameter increases, the diversity of requested items decreases
  – a smaller subset of the items is requested more often
• TTL estimates are kept for all items regardless of their request rates
  – consistency stays constant at 98 percent
• A higher Zipf parameter -> a smaller set of requested items -> a higher probability of hits
• Through prefetching, DCIM maintains a roughly constant hit ratio
Varying Node Velocity
• Maximum velocity varied between 0 m/s and 20 m/s
• No special outcome
• Only a mild increase in the delay
Varying Polling Interval
• Polling interval increased from 1 s to 50 s
• Mild decrease in the consistency ratio and hit ratio
  – and an increase in the delay
• The piggybacking interval increases as well, contributing to the decrease in hit rate
• By increasing the polling interval, the validation requests from the inner loop function become farther apart in time
• With a large piggybacking interval, the CN predicts that more items will be requested before the end of the interval
  – leads to more prefetching and consequently more traffic
Varying Data Item Size
• Item size varied between 5 KB and 1,500 KB
• Even at high data sizes, DCIM still demands less traffic than the network bandwidth
• Traffic per node increases linearly with the data item size
Effectiveness of DCIM
• Better consistency than its variants
• Piggybacking offers added value by providing a better estimate of the TTL
  – and consequently higher data consistency
• Piggybacking induces more overhead
  – validation requests and prefetching
• Prefetching highly requested items saves on traffic and provides acceptable query delay
• Piggybacking increases data consistency
• Prefetching adapted to the request rate controls the overhead traffic
Comparison between DCIM and
Other Approaches
• DCIM is a client-based cache consistency scheme for MANETs
• Relies on estimating the interupdate intervals of data items to set their expiry times
• Uses piggybacking and prefetching
  – increases the accuracy of the estimation
  – reduces both traffic and query delays
• The DCIM approach was compared to fixed TTL and client polling, and to two server-based approaches, SSUM and UIR
• DCIM was shown to provide better overall performance than the other client-based schemes and comparable performance to SSUM
