S1: Chapter 8 Discrete Random Variables
Dr J Frost ([email protected])
Last modified: 25th October 2013

Variables and Random Variables

In Chapter 2, we saw that, just as in algebra, we can use a variable to represent some quantity, such as height. A random variable represents a single experiment/trial, and has 2 ingredients:

1. THE OUTCOMES / SUPPORT VECTOR
   We tend to use a lowercase letter (e.g. x) to represent the value of the outcome.
2. PROBABILITY FUNCTION
   e.g. P(X = 3) = 0.1. It is just a function that maps an outcome to its probability.

A random variable representing the throw of an unfair die:

x         1     2     3     4     5     6
P(X = x)  0.3   0.2   0.1   0.25  0.05  0.1

The probability function for a discrete random variable has two obvious constraints:
(a) The outputs have to be between 0 and 1, i.e. 0 ≤ P(X = x) ≤ 1.
(b) The sum of the outputs is 1, i.e. Σ P(X = x) = 1.

Is it a discrete random variable?

- The height of a person randomly chosen. No: this is a continuous random variable.
- The number of cars that pass in the next hour. Yes.
- The number of countries in the world. No: it does not vary, so is not a variable!

Notation and Terminology

There are two equivalent ways of writing a probability:

Full notation: P(X = x), e.g. P(C = blue)
This says “the probability that the outcome of an experiment, represented by the random variable X, is the value x”.

Shorthand: p(x), e.g. p(blue)
Note the lowercase p instead of uppercase P.

Because a probability function is ultimately just a function, on the rare occasion it’s written as f(x), where x is a particular outcome. Don’t be upset by this.

Example

The random variable X represents the number of heads when three coins are tossed.

Underlying sample space:
{ HHH, HHT, HTT, HTH, THH, THT, TTH, TTT }

Written as a table:

Num heads x  0     1     2     3
P(X = x)     1/8   3/8   3/8   1/8

Written as a function:

P(X = x) = 1/8 for x = 0, 3
P(X = x) = 3/8 for x = 1, 2

The second way of writing it allows us to conflate outcomes with the same probability. In the Edexcel syllabus, we call the table a probability distribution and the latter form a probability function.
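The three-coin example can be checked by brute force. The sketch below is only an illustration, not part of the original slides: the function names `heads_distribution` and `is_valid_probability_function` are made up for this example. It enumerates the sample space, counts heads in each outcome, and confirms the two constraints on a probability function using exact fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_coins):
    """Return {number of heads: probability} for n_coins fair coins.

    Hypothetical helper for illustration: it lists every equally likely
    outcome (e.g. ('H', 'H', 'T')) and counts the heads in each.
    """
    outcomes = list(product("HT", repeat=n_coins))
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


def is_valid_probability_function(dist):
    """Check constraint (a) 0 <= p(x) <= 1 and constraint (b) sum of p(x) = 1."""
    probabilities = dist.values()
    return all(0 <= p <= 1 for p in probabilities) and sum(probabilities) == 1


dist = heads_distribution(3)
for x, p in dist.items():
    print(f"P(X = {x}) = {p}")
print("Valid probability function:", is_valid_probability_function(dist))
```

Exact fractions are used rather than floats so that the "sums to 1" check is not affected by rounding.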
The true distinction is slightly abstract/subtle: don’t worry about it for now!

Exam Question: Edexcel S1 May 2012

(Hint: use your knowledge that Σ p(x) = 1.)

p(-1) = 4k, p(0) = k, p(1) = 0, p(2) = k
Since Σ p(x) = 1: 4k + k + 0 + k = 6k = 1
Therefore k = 1/6

Exercise 8A

5. The random variable X has a probability function P(X = x) = kx, x = 1, 2, 3, 4. Show that k = 1/10.
   (k + 2k + 3k + 4k = 10k = 1, so k = 1/10.)

7. The random variable X has a probability function: ... where k is a constant.
   a) Find the value of k.
   b) Construct a table giving the probability distribution of X.

   7a) k = 0.125
   7b)
   x         1      2      3      4
   P(X = x)  0.125  0.125  0.375  0.375

Probabilities of ranges of values

x         1     2     3     4     5     6
P(X = x)  0.1   0.2   0.3   0.25  0.1   0.05

P(1 < X < 5) = 0.75
P(2 ≤ X ≤ 4) = 0.75
P(3 < X ≤ 6) = 0.4
P(X ≤ 3) = 0.6

Cumulative Distribution Function (CDF)

How could we express “the probability that the age of someone is at most 40”?

P(X ≤ 40) = F(40)

F is known as the cumulative distribution function, where F(x) = P(X ≤ x) (note the capital F).

If X is the number of heads thrown in 2 throws...

x         0     1     2
P(X = x)  0.25  0.5   0.25
F(x)      0.25  0.75  1

Example

The discrete random variable X has a cumulative distribution function F(x) defined by:

F(x) = (x + k)/8;  x = 1, 2 and 3

a) Find the value of k.
   F(3) = 1. Thus (3 + k)/8 = 1, so k = 5.

b) Draw the distribution table for the cumulative distribution function.
   x     1     2     3
   F(x)  3/4   7/8   1

c) Write down F(2.6).
   F(2.6) = F(2) = 7/8

d) Find the probability distribution of X.
   x         1     2     3
   P(X = x)  3/4   1/8   1/8

[Figure: p(x) plotted against shoe size (x), alongside the CDF F(x) plotted against shoe size (x).]

It’s just like how we’d turn a frequency graph into a cumulative frequency graph.

Exam Questions

Edexcel S1 May 2013 (Retracted)

... = 0.4

x         1     2     3
P(X = x)  0.4   0.25  0.35

Edexcel S1 Jan 2013

F(3) = 1, so (27 + k)/40 = 1, ...

x         1     2      3
P(X = x)  0.35  0.175  0.475

Expected Value, E[X]

Suppose that we throw a single fair die 60 times, and see the following outcomes:

x          1    2    3    4    5    6
Frequency  9    11   10   8    12   10

What is the mean outcome based on our sample?
Mean = 213/60 = 3.55

But using the actual probabilities of each outcome (i.e. 1/6 for each), what would we expect the average outcome to be?
3.5

If X is the random variable, E[X] is known as the expected value of X.

E[X] = Σ x p(x)

You could think of it as the weighted sum of the outcomes (where the weights are the probabilities).

Quickfire E[X]

Find the expected value of the following distributions (in your head!).

x         1     2     3
P(X = x)  0.1   0.6   0.3
E[X] = 2.2

x         4     6     8
P(X = x)  0.5   0.25  0.25
E[X] = 5.5

x         10    20    30
P(X = x)  1/3   1/3   1/3
E[X] = 20

Harder Example

x         1     2     3     4     5
P(X = x)  0.1   p     0.3   q     0.2

Given that E[X] = 3, find the values of p and q.

p + q + 0.1 + 0.3 + 0.2 = 1
(1 x 0.1) + (2 x p) + (3 x 0.3) + (4 x q) + (5 x 0.2) = 3
Thus q = 0.1, p = 0.3

To E[X²] and beyond

Remember that with the mean for a sample, we could find the “mean of the squares” when finding variance, e.g. Σfx² / Σf. We just replaced each value with its square.

Unsurprisingly the same applies for the expected value of a random variable. Just replace x with whatever is in the square brackets. Sorted!

x         1     2     3
P(X = x)  0.1   0.5   0.4

E[X²] = (1² x 0.1) + (2² x 0.5) + (3² x 0.4) = 5.7
E[2X] = (2 x 0.1) + (4 x 0.5) + (6 x 0.4) = 4.6
E[1 - X] = (0 x 0.1) + (-1 x 0.5) + (-2 x 0.4) = -1.3

Variance

We know how to find it for experimental data. How about for a random variable?

Var[X] = E[X²] - E[X]²
(Mean of the Squares Minus Square of the Mean)

x         1     2     3
P(X = x)  0.1   0.5   0.4

Var[X] = 5.7 - 2.3² = 0.41
(We already worked out that E[X²] = 5.7)

Exam Questions

Edexcel S1 May 2010

a = 1/4
E[X] = 1
E[X²] = 3.1
So Var[X] = 3.1 - 1² = 2.1

Edexcel S1 Jan 2009

E[X] = 1
F(1.5) = P(X ≤ 1.5) = P(X ≤ 1) = 0.7
E[X²] = 2. So Var[X] = 2 - 1² = 1

Coding!

Oh dear god, not again...

Recap

Suppose that we have a list of people’s heights x. The mean height is 1.5 m and the variance is 0.2 m². We use the coding y = 3x - 10:

Mean of y = 3(1.5) - 10 = -5.5
Variance of y = 3² x 0.2 = 1.8

It’s no different with expected values. What do we expect these to be in terms of the original expected value E[X] and the original variance Var[X]?

E[X + 10] = E[X] + 10
E[3X] = 3E[X]
Var[3X] = 9Var[X]

Adding 10 to all values adds 10 to the expected value.

Quickfire Coding

Express these in terms of the original E[X] and Var[X].

E[4X + 1] = 4E[X] + 1
E[1 - X] = 1 - E[X]
Var[4X] = 16Var[X]
Var[X + 1] = Var[X]
Var[3X + 2] = 9Var[X]
E[(X - 1)/2] = (E[X] - 1)/2
Var[(X - 1)/2] = ¼ Var[X]

Exercise 8E

2. E[X] = 2, Var[X] = 6. Find:
   a) E[3X] = 3E[X] = 6
   d) E[4 - 2X] = 4 - 2E[X] = 0
   f) Var[3X + 1] = 9Var[X] = 54

5. The random variable Y has mean 2 and variance 9. Find:
   a) E[3Y + 1] = 3E[Y] + 1 = 7
   c) Var[3Y + 1] = 9Var[Y] = 81
   e) E[Y²] = Var[Y] + E[Y]² = 13
   f) E[(Y - 1)(Y + 1)] = E[Y² - 1] = E[Y²] - 1 = 12

Discrete Uniform distribution

If X is the throw of a fair die, this obviously is its distribution...

x         1     2     3     4     5     6
P(X = x)  1/6   1/6   1/6   1/6   1/6   1/6

We call this a discrete uniform distribution.

If we had, say, an n-sided fair die, then:

E[X] = (n + 1)/2
Var[X] = (1/12)(n + 1)(n - 1)

You won’t have exam questions on these, but they’re useful to know.

Example

Digits are selected at random from a table of random numbers.
a) Find the mean and standard deviation of a single digit.
b) Find the probability that a particular digit lies within one standard deviation of the mean.

a) Our digits are 0 to 9. We have useful formulae when the numbers start from 1 rather than 0. If the digit is R, let X = R + 1.
   Then E[R] = E[X - 1] = E[X] - 1 = 11/2 - 1 = 4.5
   Var[R] = Var[X - 1] = Var[X] = (1/12)(10 + 1)(10 - 1) = 8.25
   So σ = 2.87 (to 3sf)

b) We want P(4.5 - 2.87 < R < 4.5 + 2.87) = P(1.63 < R < 7.37) = P(2 ≤ R ≤ 7) = 6/10
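The expected value and variance results above can be verified numerically. This is a minimal sketch rather than anything from the original slides: `expected_value` and `variance` are hypothetical helper names, and a distribution is assumed to be stored as a dictionary mapping each outcome x to P(X = x). It reworks the random digits example directly from the definitions E[X] = Σ x p(x) and Var[X] = E[X²] - E[X]², without relying on the discrete uniform formulae.

```python
from fractions import Fraction
from math import sqrt


def expected_value(dist, g=lambda x: x):
    """E[g(X)]: sum of g(x) * p(x) over every outcome x (g defaults to x itself)."""
    return sum(g(x) * p for x, p in dist.items())


def variance(dist):
    """Var[X] = E[X^2] - E[X]^2 (mean of the squares minus square of the mean)."""
    mean = expected_value(dist)
    return expected_value(dist, lambda x: x * x) - mean ** 2


# Random digits 0 to 9, each with probability 1/10.
digits = {r: Fraction(1, 10) for r in range(10)}

mean = expected_value(digits)
var = variance(digits)
sd = sqrt(var)  # standard deviation as a float

# Probability that a digit lies within one standard deviation of the mean.
within_one_sd = sum(p for r, p in digits.items() if abs(r - mean) < sd)

print("E[R] =", mean)
print("Var[R] =", var)
print("sd =", round(sd, 2))
print("P(within one sd) =", within_one_sd)
```

Passing a function as `g` mirrors the "replace x with whatever is in the square brackets" rule, e.g. `expected_value(dist, lambda x: 1 - x)` for E[1 - X].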