### Moment Generating Functions

```Generating Functions
The Moments of Y
• We have referred to E(Y) and E(Y^2) as the first and
second moments of Y, respectively. In general,
E(Y^k) is the kth moment of Y.
• Consider the power series in t whose coefficients
incorporate the moments of Y:

    \sum_{k=0}^{\infty} \frac{t^k E(Y^k)}{k!} = 1 + t E(Y) + \frac{t^2 E(Y^2)}{2!} + \frac{t^3 E(Y^3)}{3!} + \cdots
Moment Generating Function
• If the sum converges for all t in some interval |t| < b,
the series is called the moment-generating
function, m(t), for the random variable Y.

    m(t) = 1 + t E(Y) + \frac{t^2 E(Y^2)}{2!} + \frac{t^3 E(Y^3)}{3!} + \cdots

• And we may note that for each k,

    \frac{t^k E(Y^k)}{k!} = \frac{t^k \sum_y y^k p(y)}{k!} = \sum_y \frac{(ty)^k}{k!} p(y)
• Hence, the moment-generating function is given by

    m(t) = 1 + t E(Y) + \frac{t^2 E(Y^2)}{2!} + \frac{t^3 E(Y^3)}{3!} + \cdots
         = \sum_{k=0}^{\infty} \sum_y \frac{(ty)^k}{k!} p(y)
         = \sum_y \left[ \sum_{k=0}^{\infty} \frac{(ty)^k}{k!} \right] p(y)
         = \sum_y e^{ty} p(y) = E[e^{tY}]

  (We may rearrange the order of summation, since the sum is finite for |t| < b.)
• That is,

    m(t) = E[e^{tY}] = 1 + t E(Y) + \frac{t^2 E(Y^2)}{2!} + \frac{t^3 E(Y^3)}{3!} + \cdots

is the power series whose coefficients involve the
moments of Y.
The kth moment
• To retrieve the kth moment from the MGF,
evaluate the kth derivative at t = 0.

    \frac{d^k [m(t)]}{dt^k} = \frac{k! \, t^0 E(Y^k)}{k!} + \frac{(k+1)! \, t^1 E(Y^{k+1})}{(k+1)!} + \frac{t^2 E(Y^{k+2})}{2!} + \cdots

• And so, letting t = 0:

    \left. \frac{d^k [m(t)]}{dt^k} \right|_{t=0} = E(Y^k)
Common MGFs
• The MGFs for some of the discrete distributions
we’ve seen include:
    binomial:  m(t) = (p e^t + q)^n
    geometric: m(t) = \frac{p e^t}{1 - q e^t}
    Poisson:   m(t) = e^{\lambda (e^t - 1)}
Geometric MGF
• Consider the MGF

    m(t) = \frac{\frac{1}{3} e^t}{1 - \frac{2}{3} e^t} = \frac{e^t}{3 - 2e^t}

• Use derivatives to determine the first and second
moments.

    m'(t) = \frac{3e^t}{(3 - 2e^t)^2}

And so,

    E(Y) = m'(0) = \frac{3e^0}{(3 - 2e^0)^2} = \frac{3}{1} = 3
• Since m(t ) 
3et
 3  2e 
t 2
V (Y )  E (Y 2 )  [ E (Y )]2
• We have
m(t ) 
t
 3  2e 
t 3
And so,
E (Y )  m(t ) 
2
 15  (3)2  6
3e (3  2e )
t
3e (3  2e )
0
0
 3  2e 
0 3
 15
• Since

    m(t) = \frac{\frac{1}{3} e^t}{1 - \frac{2}{3} e^t} = \frac{e^t}{3 - 2e^t}

is for a geometric random variable with p = 1/3,
our prior results tell us
E(Y) = 1/p and V(Y) = (1 - p)/p^2.

    E(Y) = \frac{1}{1/3} = 3  and  V(Y) = \frac{1 - 1/3}{(1/3)^2} = \frac{2}{3} \cdot \frac{9}{1} = 6

which do agree with our current results.
All the moments
• Although the mean and variance help to describe a
distribution, they alone do not uniquely describe a
distribution.
• All the moments are necessary to uniquely
describe a probability distribution.
• That is, if two random variables have equal MGFs
(i.e., m_Y(t) = m_Z(t) for |t| < b),
then they have the same probability distribution.
m(aY+b)?
• For the random variable Y with MGF m(t),
consider W = aY + b.
    m(t) = m_Y(t) = E[e^{tY}]

    m_W(t) = E[e^{t(aY + b)}]
           = E[e^{atY} e^{bt}]
           = e^{bt} E[e^{atY}]
           = e^{bt} m_Y(at)
E(aY+b)
• Now, based on the MGF, we could again
consider E(W) = E(aY + b).
    m_W'(t) = \frac{d}{dt} \left[ e^{bt} m_Y(at) \right] = e^{bt} m_Y'(at)(a) + m_Y(at) \, b e^{bt}
            = e^{bt} \left[ a \, m_Y'(at) + b \, m_Y(at) \right]

And so, letting t = 0 and using m_Y(0) = 1,

    E(W) = m_W'(0) = e^0 \left[ a \, m_Y'(0) + b \, m_Y(0) \right]
         = a E(Y) + b  as expected.
V(aY+b)
• Now, based on the MGF, can you again
consider V(W) = V(aY + b)?

    m_W''(0) - [E(W)]^2 = ?

• …and so V(W) = V(aY + b) = a^2 V(Y).
Tchebysheff’s Theorem
• For “bell-shaped” distributions, the empirical rule
gave us a 68-95-99.7% rule for the probability that a value
falls within 1, 2, or 3 standard deviations of the
mean, respectively.
• When the distribution is not so bell-shaped,
Tchebysheff tells us the probability of being
within k standard deviations of the mean is
at least 1 - 1/k^2, for k > 0.

    P(|Y - \mu| < k\sigma) \ge 1 - \frac{1}{k^2}

Remember, it’s just a lower bound.
A Skewed Distribution
• Consider a binomial experiment with n = 10
and p = 0.1, so that \mu = np = 1 and
\sigma = \sqrt{npq} = \sqrt{0.9} \approx 0.95.

    P(|Y - 1| < 2(0.95)) \ge 1 - \frac{1}{2^2} = 0.75
• Verify Tchebysheff’s lower bound for k = 2:

    P(|Y - 1| < 2(0.95)) = P(-0.9 < Y < 2.9) \ge 1 - \frac{1}{2^2} = 0.75

    P(-0.9 < Y < 2.9) = p(0) + p(1) + p(2) = 0.34868 + 0.38742 + 0.19371 \approx 0.93
```
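
The two worked examples above (the geometric MGF with p = 1/3 and the Tchebysheff check for the binomial with n = 10, p = 0.1) can be sanity-checked numerically. The sketch below is only an illustration, not part of the original slides: the function names, the central finite-difference approximation of m'(0) and m''(0), and the step size `h` are all assumptions made for this example.

```python
import math


def geometric_mgf(t, p):
    """Geometric MGF p*e^t / (1 - q*e^t); only valid while q*e^t < 1."""
    q = 1.0 - p
    return p * math.exp(t) / (1.0 - q * math.exp(t))


def first_two_moments(mgf, h=1e-4):
    """Approximate m'(0) and m''(0) by central finite differences.

    The step size h is an assumed value chosen for illustration.
    """
    first = (mgf(h) - mgf(-h)) / (2.0 * h)
    second = (mgf(h) - 2.0 * mgf(0.0) + mgf(-h)) / (h * h)
    return first, second


def binomial_pmf(y, n, p):
    """P(Y = y) for Y ~ binomial(n, p)."""
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)


def prob_within_k_sd(n, p, k):
    """Exact P(|Y - mu| < k*sigma) for Y ~ binomial(n, p)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    return sum(
        binomial_pmf(y, n, p) for y in range(n + 1) if abs(y - mu) < k * sigma
    )


# Geometric example: p = 1/3, so m(t) = e^t / (3 - 2e^t).
ey, ey2 = first_two_moments(lambda t: geometric_mgf(t, 1.0 / 3.0))
variance = ey2 - ey**2

# Tchebysheff example: binomial with n = 10, p = 0.1, k = 2.
exact = prob_within_k_sd(10, 0.1, 2)
bound = 1.0 - 1.0 / 2**2

print(f"E(Y) ~ {ey:.4f}, E(Y^2) ~ {ey2:.4f}, V(Y) ~ {variance:.4f}")
print(f"P(|Y - 1| < 2*sigma) = {exact:.5f}, Tchebysheff bound = {bound}")
```

The finite-difference values should land close to the exact moments E(Y) = 3, E(Y^2) = 15, and V(Y) = 6. The exact binomial probability of about 0.93 sits well above the 0.75 lower bound, as Tchebysheff's theorem guarantees.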