Report

INTERVAL TREE & SEGMENTATION TREE Juho Lee, UiTae Kim UiTae Kim INTERVAL TREE Windowing Query Query for finding objects in rectangular region Let’s consider roads as line segments Let’s make the example even easier… Orthogonal Only two possible orientations Axis-parallel Data structure S: set of n axis-parallel line segments Query: find all line segments in S, intersecting W. W:=[x:x’] X [y:y’] We need a data structure for solving this window query efficiently Check segment as marked Query time: (log + ) Storage: ( log ) 1. 2. 3. 4. Entirely inside Intersects once Intersects twice (partially) overlaps boundary Lemma 10.1 Let S be a set of n axis-parallel line segments in the plane. The segments that have at least one endpoint inside an axisparallel query window W can be reported in O(logn+k) time with a data structure that uses O(nlogn) storage and preprocessing time, where k is the number of reported segments. ≔ ( = ) Now problem is reduced to… Checking whether the given line segments intersects ℓ: ≔ , ( ′ , ) intersects ℓ iff ≤ ≤ ′ Now the problem becomes 1-D intersection test Finding intersection in 1-D ≔ { 1 : 1′ , 2 : 2′ , … , : ′ } closed set of intervals median of 2n endpoints If < , we don’t need to consider about the segments placed to the right of . Finding intersection in 1-D Construct binary tree Root(or interim node): Left child: Tree made of line segments placed to the left of Right child: Tree made of line segments placed to the right of Finding intersection in 1-D Construct binary tree Make tree recursively Finding intersection in 1-D Construct binary tree is in interval : ′ ∈ iff ≤ can be done easily if we have sorted list Interval tree If = ∅, then the interval tree is leaf Otherwise, let root be ≔ : ′ ∈ : ′ < forms left subtree ≔ : ′ ∈ : ≤ ≤ ′ : stored twice ℒ (): sorted from left to right ℒℎ (): sorted from right to left ℎ ≔ { : ′ ∈ : < } forms right subtree Interval tree Lemma 10.2 An interval tree on a set of n intervals uses O(n) storage and has depth O(logn). Lemma 10.2 pf) Depth part is trivial, since we are storing 2n endpoints in a binary tree. O(log n) Lemma 10.2 pf) For storage bound, note that , , ℎ are all disjoint. Thus, each segments are stored only twice, one for ℒ (), and the other for ℒℎ () Hence, the storage bound is O(n). The tree uses O(n) also. Algorithm: ConstructIntervalTree ConstructIntervalTree(I) Input: A set I of intervals on the real line Output: The root of an interval tree for I Algorithm: ConstructIntervalTree 1. if = ∅ 2. then return an empty leaf 3. else Create a node ν. Compute , the median of the set of interval endpoints, and store with ν. 4. Compute and construct two sorted lists for : a list ℒ () sorted on left endpoint and a list ℒ ℎ () sorted on right endpoint. Store these two lists at ν. 5. lc(ν)← ConstructIntervalTree( ) 6. rc(ν)← ConstructIntervalTree(ℎ ) 7. return v Algorithm: ConstructIntervalTree Finding median: O(n), but better to keep sorted list. O(nlogn) Let ≔ ( ), then creating ℒ , ℒℎ takes ( log ) Each step takes ( + log ) Using similar arguments in Lemma 10.2, ( log ). Lemma 10.3 An interval tree on a set of n intervals can be built in O(nlogn) time. Algorithm: QueryIntervalTree QueryIntervalTree(v, ) Input: The root v of a interval tree and a query point Output: All intervals containing Algorithm: QueryIntervalTree 1. if ν is not a leaf 2. then if < () 3. then Walk along the list (), starting at the interval with the leftmost endpoint, reporting all the intervals that contain . Stop as soon as an interval does not contain . 4. QueryIntervalTree(lc(ν), ) 5. else Walk along the list ℎ (), starting at the interval with the rightmost endpoint, reporting all the intervals that contain . Stop as soon as an interval does not contain . 6. QueryIntervalTree(rc(ν), ) Algorithm: QueryIntervalTree For each step, (1 + ), where is the # of pts reporting We go through (log ) steps, and Query time is (log + ) = . Theorem 10.4 An interval tree for a set I of n intervals uses O(n) storage and can be built in O(nlogn) time. Using the interval tree we can report all intervals that contain a query point in O(logn+k) time, where k is the number of reported intervals. Query time: (log + ) Storage: () Query time: (log + ) Storage: ( log ) 1. 2. 3. 4. Entirely inside Intersects once Intersects twice (partially) overlaps boundary Extend query to segment Let ⊆ be the set of horizontal segments in . Query becomes ≔ × [ : ′ ]. ∈ looks like ≔ [ : ′ ] × . Extend query to segment Recursively traveling through the left(right) subtree is still correct Extend query to segment However, treat for the is no longer correct [ : +∞)× [ : ′ ] if < Extend query to segment However, treat for the is no longer correct Perform range search for the left(right) endpoints Query is (−∞: ]× [ : ′ ] (or [ : +∞)× [ : ′ ]) Uses ( log ) storage, (log + ) query time Data structure Main structure: interval tree Τ on x-intervals of segments Instead of ℒ (), ℒℎ (), we have range search tree Τ (), Τℎ (). Data structure Replaced to range tree Data structure Range tree takes (log + ), instead of ( ). Total time for each step becomes (log 2 + ) Storage complexity becomes ( log ), since range tree dominates the main interval tree Preprocessing time remains ( log ) Theorem 10.5 Let S be a set of n horizontal segments in the plane. The segments intersecting a vertical query segment can be reported in (log 2 + ) time with a data structure that uses ( log ) storage, where k is the number of reported segments. The structure can be built in ( log ) time. Lemma 10.1 Let S be a set of n axis-parallel line segments in the plane. The segments that have at least one endpoint inside an axisparallel query window W can be reported in O(logn+k) time with a data structure that uses O(nlogn) storage and preprocessing time, where k is the number of reported segments. Corollary 10.6 Let S be a set of n axis-parallel segments in the plane. The segments intersecting an axis-parallel rectangular query window can be reported in (log 2 + ) time with a data structure that uses ( log ) storage, where k is the number of reported segments. The structure can be built in ( log ) time. Summary Query Time: (log 2 + ) Storage: ( log ) Preprocessing:( log ) Jooho Lee SEGMENT TREE Problem Given n segments, find intervals which contain qx. Let solve it with BST. qx Elementary Intervals (): Open Interval []: Closed Interval At most 4n+1 elementary Intervals. 2n(end points)+2n(segments)+1(right side) { (-∞:p1) (p1:p2) [p1:p1] (p2:p3) [p2:p2] (p3:p4) [p3:p3] (p4:p5) (p5:p6) [p4:p4] [p5:p5] (p6:+∞) [p6:p6] =Elementary Intervals I } BST of Elementary Intervals Height of BST: O(log(4n+1))=O(log n) (-∞:p1) (p1:p2) [p1:p1] (p2:p3) [p2:p2] (p3:p4) [p3:p3] (p4:∞) [p4:p4] u (-∞:p1) [p1:p1] (p1:p2) [p2:p2] (p2:p3) [p3:p3] (p3:p4) Let Int(u) corresponding Interval with u. Int(u) is (p1:p2) interval. [p4:p4] (p4:∞) BST of Elementary Intervals At leaf node u, Store interval I containing Int(u). Query time: log(n)+k 1.find the leaf interval containing qx 2.report all intervals stored at the corresponding leaf node. q x Int (v) I (v) u (-∞:p1) [p1:p1] (p1:p2) [p2:p2] (p2:p3) [p3:p3](p3:p4) But Storage can be quadratic..O(n2) [p4:p4] (p4:∞) Qaudratic Storage Example storages for open leaf intervals :0 1 2 . . . . . . . . .N-1 N N-1 . . . . . . . . 2 1 0 Sum of storages for open leaf intervals is more than N(N+1)/2! Idea Internal node w = union of elemental intervals of leaves in the sub-tree(w)! v Segment Tree u Int(u) Int(v) (-∞:p1) [p1:p1] (p1:p2) [p2:p2] (p2:p3) [p3:p3](p3:p4) (-∞:p1) (p1:p2) [p1:p1] (p2:p3) [p2:p2] (p3:p4) [p3:p3] (p4:∞) [p4:p4] [p4:p4] (p4:∞) Which interval is contained at node v? Each node v store the Interval [ x : x`] I (v ) Int (v) [ x : x`] Int ( parent (v)) [ x : x`] If Int ( parent (v)) [ x : x`] , [x:x`] is stored at parent(v) or parentn(v) I such that Storage in Segment Tree Construction of Segment Tree Construction There is a node to split a path. Once the paths diverge, as we follow the left path, whenever the path goes to the left child, the right child must store [x:x`]. Similarly, as we follow the right path, whenever the path goes to the right child, the left child must store [x:x`]. So, For one interval insertion, we just travel only at most two pat hs. Since Height is O(log N), time complexity to build segment tree will be O(N log N) Reporting Query with Segment Tree qx v q x Int (v) I (v) lc(v) rc(v) Int(lc(v)) O(log n+k) for a query! Int(lc(v)) Int(v) Storage Complexity A segment tree on a set of n intervals uses O(n log n) storages. Height of Tree H: O(log n) For any [ x : x`] I , contained by at most two nodes at the same depth. Therefore, (number of segments)*(2*H) = O(n log n)! If v1,v3 contain [x:x`], pa rent(v2) contain [x:x`]. Contradiction! N log N storage Example N/2 segments for filling points. N/2 segments for storing log n times. Then N/2* log N=O(NlogN) for storing! Summary Query Time: O(log n + k) Storage: O(n log n) Preprocessing: O(n log n) Quiz For given n integers, give a algorithm to handle queries whic h is to calculate sum of interval (a, b). Hint Use segment tree. Solution Get rid of closed interval For each node v, calculate total of Int(v). Similar way with previous method.