Report

Chernoff Bounds, and etc. Presented by Kwak, Nam-ju Topics A General Form of Chernoff Bounds Brief Idea of Proof for General Form of Chernoff Bounds More Tight form of Chernoff Bounds Application of Chernoff Bounds: Amplification Lemma of Randomized Algorithm Studies Chebyshev’s Inequality Application of Chebyshev’s Inequality Other Considerations A General Form of Chernoff Bounds Assumption Xi’s: random variables where Xi∈{0, 1} and 1≤i≤n. P(Xi=1)=pi and therefore E[Xi]=pi. X: a sum of n independent random variables, that is, μ: the mean of X A General Form of Chernoff Bounds When δ >0 Brief Idea of Proof for General Form of Chernoff Bounds Necessary Backgrounds Marcov’s Inequality For any random variable X≥0, When f is a non-decreasing function, Brief Idea of Proof for General Form of Chernoff Bounds Necessary Backgrounds Upper Bound of M.G.F. Brief Idea of Proof for General Form of Chernoff Bounds Proof of One General Case (proof) Brief Idea of Proof for General Form of Chernoff Bounds Proof of One General Case Here, put a value of t which minimize the above expression as follows: Brief Idea of Proof for General Form of Chernoff Bounds Proof of One General Case As a result, More Tight form of Chernoff Bounds The form just introduced has no limitation in choosing the value of δ other than that it should be positive. When we restrict the range of the value δ can have, we can have tight versions of Chernoff Bounds. More Tight form of Chernoff Bounds When 0<δ<1 Compare these results with the upper bo und of the general case: Application of Chernoff Bounds: Amplification Lemma of Randomized Algorithm Studies A probabilistic Turing machine is a nondeterministic Turing machine in which each nondeterministic step has two choices. (coin-flip step) Error probability: The probability that a certain probabilistic TM produces a wrong answer for each trial. Class BPP: a set of languages which can be recognized by polynomial time probabilistic Turing Machines with an error probability of 1/3. Application of Chernoff Bounds: Amplification Lemma of Randomized Algorithm Studies However, even though the error probability is over 1/3, if it is between 0 and 1/2 (exclusively), it belongs to BPP. By the amplification lemma, we can construct an alternative probabilistic Turing machine recognizing the same language with an error probability 2-a where a is any desired value. By adjusting the value of a, the error probability would be less than or equal to 1/3. Application of Chernoff Bounds: Amplification Lemma of Randomized Algorithm Studies How to construct the alternative TM? (For a given input x) 1. Select the value of k. 2. Simulate the original TM 2k times. 3. If more than k simulations result in accept, accept; otherwise, reject. Now, prove how it works. Application of Chernoff Bounds: Amplification Lemma of Randomized Algorithm Studies Xi’s: 1 if the i-th simulation produces a wrong answer; otherwise, 0. X: the summation of 2k Xi’s, which means the number of wrongly answered simulations among 2k ones. ε: the error probability X~B(2k, ε) μ=E[X]=2k ε Application of Chernoff Bounds: Amplification Lemma of Randomized Algorithm Studies P(X>k): the probability that more than hal f of the 2k simulations get a wrong answe r. We will show that P(X>k) can be less tha n 2-a for any a, by choosing k appropriatel y. Application of Chernoff Bounds: Amplification Lemma of Randomized Algorithm Studies Here we set δ as follows: Therefore, by the Chernoff Bounds, Application of Chernoff Bounds: Amplification Lemma of Randomized Algorithm Studies To make the upper bound less than or eq ual to 2-a, Application of Chernoff Bounds: Amplification Lemma of Randomized Algorithm Studies Here, we can guarantee the right term is po sitive when 0<ε<1/2. Chebyshev’s Inequality For a random variable X of any probabilistic distribution with mean μ and standard deviation σ, To derive the inequality, utilize Marcov’s in equality. Application of Chebyshev’s Inequality Use of the Chebyshev Inequality To Calculate 95% Upper Confidence Limits for DDT Contaminated Soil Concentrations at a Using Chebyshev’s Inequality to Determine Sample Size in Biometric Evaluation of Fingerprint Data Superfund Site. Application of Chebyshev’s Inequality For illustration, assume we have a large body of text, for example articles from a publication. Assume we know that the articles are on average 1000 characters long with a standard deviation of 200 characters. From Chebyshev's inequality we can then deduce that at least 75% of the articles have a length between 600 and 1400 characters (k = 2). Other Considerations The only restriction Markov’s Inequality impose is that X should be non-negative. It even doesn’t matter whether the standard deviation is infinite or not. e.g. a random variable X with P.D.F. it has a finite mean but a infinite standard deviation. Other Considerations P.D.F. E[X] Var(x) Conclusion Chernoff’s Bounds provide relatively nice upper bounds without too much restrictions. With known mean and standard deviation, Chebyshev’s Inequality gives tight upper bounds for the probability that a certain random variable is within a fixed distance from the mean of it. Conclusion Any question?