### Applied Econometrics: Maximum Likelihood Estimation and Discrete Choice Modelling

Nguyen Ngoc Anh, Nguyen Ha Trang

#### Contents

- Basic introduction to the principle of maximum likelihood estimation
- Binary choice models
- Random utility models (RUM)
- Extending the binary choice model
  - Multinomial
  - Ordinal

#### Maximum Likelihood Estimation: Back to Square One

- Population model: Y = α + βX + ε (assume the true slope is positive, so β > 0)
- Sample model: Y = a + bX + e
- Least squares (LS) estimator of β:

  b_LS = (X′X)⁻¹X′Y = Cov(X, Y) / Var(X),

  or, written out for the single-regressor case,

  b̂_LS = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)²

- Key assumptions:
  - E(ε|x) = E(ε) = 0, which implies Cov(x, ε) = E(xε) = 0
  - Added error normality assumption: the errors εᵢ are i.i.d. with a normal distribution

#### What Maximum Likelihood Estimation Is

- Joint estimation of all the unknown parameters of a statistical model.
- It requires that the model in question be completely specified.
- Complete specification of the model includes specifying the specific form of the probability distribution of the model's random variables.
- In the regression context: joint estimation of the regression coefficient vector β and the scalar error variance σ².

#### The Two Steps of ML Estimation

- Step 1: Formulation of the sample likelihood function.
- Step 2: Maximization of the sample likelihood function with respect to the unknown parameters β and σ².

#### Step 1: The Sample Likelihood Function

- Normal distribution: from our assumption on the error e (or u), uᵢ ~ N(0, σ²), so each error has density

  f(uᵢ) = (1 / √(2πσ²)) exp(−uᵢ² / (2σ²))

- Substituting uᵢ = Yᵢ − α − βXᵢ gives the density of each observation Yᵢ:

  f(Yᵢ) = (1 / √(2πσ²)) exp(−(Yᵢ − α − βXᵢ)² / (2σ²))

- By random sampling, we have N independent observations, each with this pdf, so the joint pdf of all N sample values of Yᵢ can be written as the product

  f(Y₁, …, Y_N) = Πᵢ f(Yᵢ)

- This joint pdf f(y) is the sample likelihood function L(α, β, σ²) for the sample of N independent observations.
- The key difference between the joint pdf and the sample likelihood function is their interpretation, not their form:
  - The joint pdf is interpreted as a function of the observable random variables Y₁, …, Y_N for given values of the parameters β and σ².
  - The sample likelihood function is interpreted as a function of the parameters β and σ² for given values of the observable variables.

#### Step 2: Maximization of the Sample Likelihood Function

- Equivalence of maximizing the likelihood and log-likelihood functions: because the natural logarithm is a monotonically increasing transformation, the values of β and σ² that maximize the likelihood function are the same as those that maximize the log-likelihood function.
- Taking the natural logarithm of the sample likelihood function gives the sample log-likelihood function

  ln L = −(N/2) ln(2πσ²) − (1/(2σ²)) Σᵢ (Yᵢ − α − βXᵢ)²

- Differentiating with respect to the coefficients and setting the derivatives to zero shows that the ML estimates of the regression coefficients are the same as OLS.

#### Statistical Properties of the ML Parameter Estimators

1. Consistency
2. Asymptotic efficiency
3. Asymptotic normality

The ML coefficient estimator also shares the small-sample properties of the OLS coefficient estimator.

#### Binary Response Models: Linear Probability Model, Logit, and Probit

Many economic phenomena of interest, however, concern variables that are not continuous, or perhaps not even quantitative:

- What characteristics (e.g. parental) affect the likelihood that an individual obtains a higher degree?
- What determines labour force participation (employed vs. not employed)?
- What factors drive the incidence of civil war?

#### The Linear Probability Model

- Consider the linear regression model with a binary dependent variable.
- Quantity of interest: βⱼ, the change in the probability that Yᵢ = 1 associated with a one-unit increase in Xⱼ, holding constant the values of all the other explanatory variables.
- Two major limitations of OLS estimation of binary-dependent-variable (BDV) models:
  - Predictions can fall outside the unit interval [0, 1].
  - The error terms uᵢ are heteroskedastic, i.e. they have non-constant variances.

#### Binary Response Models: Logit and Probit

- Latent variable approach: the model is written in terms of an unobserved index y*ᵢ. The problem is that we do not observe y*ᵢ; instead, we observe the binary variable yᵢ = 1 if y*ᵢ > 0 and yᵢ = 0 otherwise.
- Random utility model: the latent index can be interpreted as the difference in the utilities of the two alternatives, so yᵢ = 1 is chosen when that option yields the higher utility.
- Maximum likelihood estimation of the model parameters.
- Measuring the goodness of fit.
- Interpreting the results via marginal effects: in a binary outcome model, a given marginal effect is the ceteris paribus effect of changing one individual characteristic upon an individual's probability of 'success'.

#### STATA Example

- Data: Logitprobit.dta (see the Logitprobit description)
- STATA commands:
  - `probit inlf nwifeinc ed exp expsq age kidslt6 kidsge6`
  - `logit inlf nwifeinc ed exp expsq age kidslt6 kidsge6`
  - `dprobit inlf nwifeinc ed exp expsq age kidslt6 kidsge6` (the legacy `dprobit` command reports marginal effects rather than index coefficients)
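The Step 2 result, that maximizing the Gaussian log-likelihood reproduces the least-squares estimates, can be checked numerically. A minimal sketch on simulated data (all numbers and variable names are illustrative, not from the lecture's dataset):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)  # true alpha=1, beta=2

# Textbook LS formulas: b = Cov(x, y) / Var(x), a = ybar - b * xbar
b_ls = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
a_ls = y.mean() - b_ls * x.mean()
resid = y - a_ls - b_ls * x
s2_ml = resid @ resid / n  # ML estimate of sigma^2 divides by n, not n - 2

def negloglik(alpha, beta, sigma2):
    """Negative Gaussian log-likelihood of Y = alpha + beta*X + e."""
    r = y - alpha - beta * x
    return 0.5 * n * np.log(2 * np.pi * sigma2) + r @ r / (2 * sigma2)

# Perturbing the LS solution in any coefficient direction raises the
# negative log-likelihood, so the LS estimates are also the ML estimates.
base = negloglik(a_ls, b_ls, s2_ml)
worse = [negloglik(a_ls + da, b_ls + db, s2_ml)
         for da, db in [(0.01, 0.0), (-0.01, 0.0), (0.0, 0.01), (0.0, -0.01)]]
```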
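The first LPM limitation, fitted "probabilities" escaping the unit interval, is easy to reproduce. A hypothetical simulation (not the Logitprobit.dta data):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = 2.0 * rng.normal(size=n)                 # regressor with wide support
p = 1.0 / (1.0 + np.exp(-2.0 * x))           # true P(y=1|x), logistic shape
y = (rng.uniform(size=n) < p).astype(float)

# OLS on the binary outcome: the linear probability model
X = np.column_stack([np.ones(n), x])
bhat, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ bhat

# Count fitted "probabilities" that fall outside [0, 1]
n_outside = int(((fitted < 0.0) | (fitted > 1.0)).sum())
```

Because the fitted line keeps rising (or falling) without bound while a probability cannot, observations far from the mean of x are the ones pushed outside [0, 1].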
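The ML step for a binary choice model can be sketched in a few lines: the log-likelihood Σᵢ yᵢ ln F(xᵢ′β) + (1 − yᵢ) ln(1 − F(xᵢ′β)) is maximized with Newton-Raphson, here with a logistic F (data and names are illustrative, not the Stata example):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])         # intercept plus one regressor
beta_true = np.array([0.5, 1.5])

# Latent-variable DGP: y = 1 when x'beta + u > 0 with logistic u,
# equivalently P(y=1|x) = 1 / (1 + exp(-x'beta))
p = 1.0 / (1.0 + np.exp(-X @ beta_true))
y = (rng.uniform(size=n) < p).astype(float)

# Newton-Raphson iterations on the logit log-likelihood
beta = np.zeros(2)
for _ in range(25):
    mu = 1.0 / (1.0 + np.exp(-X @ beta))     # P(y=1|x) at current beta
    score = X.T @ (y - mu)                   # gradient of the log-likelihood
    hess = -(X * (mu * (1.0 - mu))[:, None]).T @ X
    beta = beta - np.linalg.solve(hess, score)

mu = 1.0 / (1.0 + np.exp(-X @ beta))
loglik = (y * np.log(mu) + (1.0 - y) * np.log(1.0 - mu)).sum()
```

Probit replaces the logistic cdf with the standard normal Φ; in practice one would call a packaged estimator (such as Stata's `logit`/`probit`) rather than hand-rolled Newton steps.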
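Marginal effects in these models follow from the chain rule: ∂P(y=1|x)/∂xⱼ = f(x′β)βⱼ, where f is the density of the assumed error distribution, so the effect varies with x rather than being a constant as in the LPM. A sketch of the average marginal effect for a logit index (the coefficients are made up for illustration, not estimates from Logitprobit.dta):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
b = np.array([0.5, 1.5])                  # illustrative logit coefficients

idx = X @ b
lam = np.exp(-idx) / (1.0 + np.exp(-idx)) ** 2   # logistic density at x'b

# Average marginal effect (AME): average lambda(x'b) * b_1 over the sample
ame = (lam * b[1]).mean()

# Marginal effect at the mean (MEM): evaluate at xbar instead of averaging
idx_bar = X.mean(axis=0) @ b
mem = np.exp(-idx_bar) / (1.0 + np.exp(-idx_bar)) ** 2 * b[1]
```

The effect is largest where P(y=1|x) is near 0.5 and shrinks in the tails, which is why the AME and the MEM generally differ.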