The 2007 HMDA Data

Report
The National Mortgage Database (NMDB)
Robert Avery, Ken Brevoort, Theresa DiVenti, Carla Inclan, Ian Keith,
Jessica Lee, Lexian Liu, Ismail Mohamed, Forrest Pafenberg, Jay Schultz,
Cynthia Waldron, Xun Wang, Claudia Wood, Peter Zorn
Federal Housing Finance Agency
Consumer Financial Protection Bureau
Freddie Mac
Urban Institute
June 11, 2013
The views expressed are those of the authors and do not necessarily represent those of the Consumer
Financial Protection Bureau, the Federal Housing Finance Agency, Freddie Mac or their staff.
What is the NMDB?



A new, nationally representative, loan-level mortgage database jointly funded and managed by the
FHFA and CFPB based on a prototype developed by Freddie Mac.
» 1st lien mortgages reported to the credit bureaus are used as both the sampling frame and the
source of performance data. No new data is collected—the NMDB will make better use of
data that already exists.
» The database is a 1/20 sample (not a registry of loans).
» Because the credit bureaus archive their data, the NMDB recovers data that would have been
available had the project been started years ago. The initial 1/20 sample is representative of
all mortgages open at any time from January 1998 to June 2012 and (with weights) any
borrower who had at least one mortgage during that period.
» Going forward, a 1/20 representative sample of newly originated mortgages will be added
each quarter, and terminated mortgages will exit the sample.
» 10.1 million mortgages are in the initial historic database. In the future the database will track
about 3.5 million active mortgages.
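The 1/20 sampling rate implies simple scaling rules, sketched below. The mortgage scaling follows directly from the stated rate. The borrower weight is only an illustration: it assumes each mortgage is sampled independently, which the slides do not state, and the function names are hypothetical.

```python
SAMPLE_ONE_IN = 20  # the 1-in-20 sampling rate stated above


def estimate_population_count(sample_count: int) -> int:
    """Scale a count of sampled mortgages up to a population estimate."""
    return sample_count * SAMPLE_ONE_IN


def borrower_weight(n_mortgages: int) -> float:
    """Inverse-probability weight for a borrower holding n_mortgages loans.

    Assumption (not from the source): each mortgage is drawn independently
    at the 1-in-20 rate, so a borrower is in the sample if any one of
    their mortgages is drawn.
    """
    if n_mortgages < 1:
        raise ValueError("a borrower needs at least one mortgage")
    rate = 1 / SAMPLE_ONE_IN
    p_selected = 1 - (1 - rate) ** n_mortgages
    return 1 / p_selected
```

Under these assumptions, a borrower with several mortgages is more likely to be sampled and therefore receives a smaller weight than a borrower with one.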
Credit bureau data are comprehensive. However, they are raw servicing data that require
significant cleaning to be useful, and data from other sources must be added.
» Major commitment of government staff to do this. Never done before.
» Working with active cooperation of credit bureau staff.
NMDB will also have a survey component. Each quarter a representative subset of borrowers
associated with loans newly added to the database will be sent a mail survey soliciting information
on their mortgage shopping and origination experience.
National Mortgage Database
2
Four Overlapping Databases


The basic unit of observation is the mortgage.
» The database will contain full credit information for all borrowers associated with the sampled
mortgages.
» Borrower data will be gathered from one year prior to sampled mortgage origination to one
year after termination and tracked quarterly.
» Performance on the sample mortgages will be collected monthly.
The NMDB will also make available a historical database containing full credit data (including
scores) from 1998 to 2012 for a representative sample of borrowers associated with an active
mortgage during the 1998 to 2012 period.
» The database will contain information on all mortgages taken out by these borrowers during
the 1998 to 2012 period.
» Data will also be gathered on all other credit obligations active during this period.
» Performance for each mortgage will be tracked from 2000 to 2012.

The NMDB will maintain a separate database of a representative 1-in-20 sample of individuals who
have ever had an active mortgage from 1998 onward.
» Quarterly information will be maintained on these individuals from one year prior to taking out
their first mortgage (or 1998) until they die.
» Persons will be added to the database when they take out their first mortgage.

The NMDB origination survey data will also be maintained as a separate database.
Why is the NMDB Needed?
• HMDA:
– Not fully representative: does not include HMDA non-reporters.
– Lacks detailed borrower, loan or performance data.
– Available only 9 to 21 months after mortgages are originated.
• LPS McDash and/or CoreLogic:
– Servicing files from 26 large servicers versus 2,000 servicers in credit bureaus.
– Not representative—poor coverage of portfolio loans.
– Same problems as the underlying NMDB data (duplication, hanging performance
and servicing sales), but not cleaned as the NMDB will be, so users cannot tell
where the problems occur.
– No information on other obligations, previous or subsequent mortgages, or
borrowers.
• Problems with NY Fed Equifax:
– Similar source as NMDB, but unit of observation is borrowers not loans.
– Same problems as the underlying NMDB data (duplication, hanging performance and
servicing sales), but not cleaned as the NMDB will be, so users cannot tell
where the problems occur.
– Little supplementation with other data and difficult to link files over time.
What is Missing in Bureau Data?


Key items missing are property value (LTV) and characteristics, borrower
characteristics (e.g. age, race, income, gender), and some mortgage
characteristics (e.g. ARM status, PMI, origination channel).
The database is being supplemented with information obtained from
matching to existing external sources (some still under negotiation):
» Home Mortgage Disclosure Act (HMDA) (70% match rate gives income/race).
» Property transaction (deed/title) data (55% match rate).
» MLS data (useful for purchase price in non-disclosure states).
» Property appraisal data.
» Household moving/address information on last three addresses.
» Third party servicing data (e.g. LP, LPS). Private label MBS data. Maybe
securities data as well (e.g. Ginnie Mae, GSEs).
» Administrative files (FHA, VA, RHS, GSEs, home loan banks and possibly large
banks). 47% of sample loans are government-backed. An additional 17% of borrowers
have a non-sampled government-backed loan.
» Data on age, gender and marital status from public records collected by the credit
bureau.
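Matching to an external source such as HMDA can be sketched as a keyed join that accepts only unambiguous matches. This is a minimal illustration, not the project's actual matching method: the match key (census tract, origination year, loan amount rounded to the nearest thousand) and all field names are assumptions made for the example.

```python
from collections import defaultdict


def match_key(record):
    """Hypothetical match key; the real matching variables are not given."""
    return (
        record["census_tract"],
        record["open_year"],
        round(record["amount"], -3),  # compare amounts to the nearest $1,000
    )


def match_to_external(loans, external_records):
    """Return {loan_id: external record} for loans with exactly one candidate."""
    index = defaultdict(list)
    for rec in external_records:
        index[match_key(rec)].append(rec)

    matched = {}
    for loan in loans:
        candidates = index.get(match_key(loan), [])
        if len(candidates) == 1:  # skip unmatched and ambiguous loans
            matched[loan["loan_id"]] = candidates[0]
    return matched


def match_rate(loans, matched):
    """Share of sample loans that found a unique external match."""
    return len(matched) / len(loans) if loans else 0.0
```

Loans left unmatched by a rule like this are the ones whose missing fields would have to be imputed.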
What Specific Fields will NMDB Have?

For each sample mortgage the database will have:
» Monthly—Performance (delinquent, current, foreclosure); balance; scheduled
payment; actual payment; escrow payment; amortizing contract rate.
» Fixed Characteristics—Date opened; term; amount borrowed; number of
borrowers; mortgage purpose (home purchase, refinance, new mortgage on free
and clear property); owner occupancy status; type of mortgage
(FHA/VA/RHS/home improvement/manufactured housing/other); GSE
(Fannie/Freddie/Ginnie/Private MBS); servicer type; balloon amount and date;
appraised property value, APR, CLTV, LTV and DTI used in underwriting; ARM
status; PMI; date closed, payoff amount, and termination form (if closed).
» Modification/foreclosure status—date entered modification/foreclosure; change in
terms; special program (HARP/HAMP); part of bankruptcy; charge-off amount.

For each sample mortgage co-signer the database will have:
» Age (date of birth); gender; marital status; deceased indicator; race/ethnicity
(from HMDA); income at the time of origination (from HMDA).
» Quarterly Vantage Credit Score, bankruptcy, and income estimator.
» Do they live in property associated with mortgage; first-time homebuyer; census
tract/zip code and timing of last three addresses.
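One way to picture the record layout described above is a mortgage with fixed characteristics, a monthly performance history, and a list of associated borrowers. The sketch below is illustrative only: the class and field names are hypothetical, and it covers a small subset of the listed fields.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class MonthlyPerformance:
    month: date
    status: str  # e.g. "current", "delinquent", "foreclosure"
    balance: float
    scheduled_payment: float
    actual_payment: float


@dataclass
class Borrower:
    date_of_birth: Optional[date] = None
    income_at_origination: Optional[float] = None  # from HMDA when matched
    quarterly_scores: List[int] = field(default_factory=list)


@dataclass
class SampleMortgage:
    loan_id: str  # hypothetical identifier
    date_opened: date
    term_months: int
    amount_borrowed: float
    purpose: str  # e.g. "home purchase", "refinance"
    date_closed: Optional[date] = None
    borrowers: List[Borrower] = field(default_factory=list)
    history: List[MonthlyPerformance] = field(default_factory=list)

    def is_open(self) -> bool:
        """A mortgage with no closing date is still active."""
        return self.date_closed is None

    def latest_status(self) -> Optional[str]:
        """Status in the most recent month on file, if any."""
        if not self.history:
            return None
        return max(self.history, key=lambda p: p.month).status
```

Keeping fixed characteristics on the mortgage and time-varying items in separate lists mirrors the monthly and quarterly tracking described for the database.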
Specific Fields (continued)

For the property associated with each sample mortgage the database will
have:
» Quarterly—LTV; CLTV; and value (from AVM model).
» Fixed Characteristics—Date purchased; purchase amount; location (census tract,
MSA and Zip Code); type of property (e.g. single family); age of structure; square
footage; assessed value; owner-occupied.

For all concurrent 2nd liens on the property associated with each sample
mortgage the database will have:
» Monthly—Performance (delinquent, current, foreclosure); balance; scheduled
payment; actual payment; escrow payment; amortizing contract rate; credit limit
(if a HELOC).
» Fixed Characteristics—Date opened (piggyback or not); term; open- or closed-end;
amount borrowed (or credit limit); number of borrowers; same servicer as
1st; date closed, payoff amount, and termination form (if closed).
Specific Fields (continued)

For all other mortgages, credit cards, installment loans, student loans, auto
loans, lines of credit, and other consumer loans associated with sample
mortgage co-borrowers the database will have:
» Monthly—Performance (delinquent, current, foreclosure); balance; scheduled
payment; actual payment; escrow payment; amortizing contract rate; credit limit
(if open-ended).
» Fixed Characteristics—Type of credit; date opened; term; open- or closed-end;
amount borrowed (or credit limit); number of borrowers; same servicer/property
as sample mortgage; date closed, payoff amount, and termination form (if
closed).


Information on inquiries and public records for all borrowers associated with
sample mortgages will also be gathered.
An origination survey will be mailed to a representative subset of new
mortgage borrowers in the database each quarter. The survey has been pre-tested
three times with response rates of 60 and 45 percent for the last two pilots.
The survey is designed to pick up information on issues like loan shopping
and suitability that are not available from any other source.
Timeline



Contract signed with Experian on September 27, 2012.
Initial data delivery took place in December 2012—1/20 sample of all loans
in existence between January 1998 and June 2012 (10.1 million loans and
14.7 million borrowers after preliminary cleaning).
An analytic group at FHFA, Freddie Mac and CFPB is processing and
cleaning the data and will match it to external sources, impute data for loans
that cannot be matched, and develop a series of regular reports and queries
to facilitate use of the NMDB.
» It will likely take until next spring to finish cleaning the data.
» 8 FTEs working on the project—major commitment of FHFA.
» Many challenges in following people and mortgages (e.g. servicing is sold;
people die or are added to mortgages).

An existing pilot prototype dataset, funded by Freddie Mac, has been in development
for 2½ years (1/500 sample of loans outstanding since 2003).
» Prototype will be maintained and updated until at least summer 2013.
» Already used in FHFA’s 2012 HERA-mandated report.
» Pilot testing of an additional Origination survey and a Delinquency Survey.
Access and the Future



NMDB is being set up as a public good. We believe that the contract signed
with Experian is a model for data access.
The challenge is to (1) protect borrower/lender personally identifiable
information and (2) provide useful data. Local geography is critical for
mortgage analysis.
Our solution:


Data is physically housed only on an FHFA/CFPB server.
Access, however, is allowed for any federal government/reserve bank/GSE
employee who completes the access process:
» Must sign an agreement not to reverse engineer identity of borrower or
lender. Severe penalties for violations of agreement.
» All work behind a firewall—data can’t be removed.
» NMDB software must support a variety of purposes, from simple queries
(number of new mortgages in California) to complex research projects.
» We are working to allow broader academic/research public access via
Census-style programs.
Examples of how NMDB can be used
• Example 1: Second liens
• Example 2: Loan performance transition matrix
• Example 3: Credit tightening
• Example 4: Market Comparisons
• All examples with 2010 data using the Prototype
Example 1: Second liens
NMDB coverage is more extensive than HMDA’s
[Figure: millions of mortgages (0.0 to 1.5) by open date, 2004 to 2010. Series: HMDA First Lien, HMDA Second Lien, NMDB First Lien, NMDB First with Second, NMDB Second by Second Open Date.]
Example 1: Second liens (continued)
Default rates are higher for firsts with seconds
[Figure, two panels: default rate (90 days or worse; 0 to 40%) by first lien open date. Panel 1 (quarterly, 2004 Q1 to 2009 Q2) series: First with Second, First without Second. Panel 2 (annual, 2004 to 2009) series: First with Concurrent non-HELOC, First with Concurrent HELOC, First with Subsequent non-HELOC, First with Subsequent HELOC, First without Second.]
Example 1 continued
Performance of firsts w/ different types of seconds
                                        ALL    Concurrent   Concurrent   Subsequent   Subsequent
                                               HELOC        non-HELOC    HELOC        non-HELOC
Seconds and firsts perform similarly    87%    89%          87%          88%          78%
Seconds perform better                   8%     7%           8%           7%          15%
Seconds perform worse                    5%     4%           5%           4%           7%


88% of firsts and their associated seconds perform similarly

GSE firsts and their associated seconds perform better than
non-GSE loans.

Firsts with piggyback non-HELOC (closed end) seconds have
the highest default rates.
When performance diverges, seconds tend to out-perform their
associated firsts.
Example 2: Loan performance transition matrix
60D+ loans tend to worsen in performance
Row percent. Rows: April 2010 performance. Columns: May 2010 performance.

April \ May   Current   30D      60D      90D      120+ D   FCL      No hist  Closed   Total
Current       95.36%    0.96%    0.06%    0.02%    0.04%    0.02%    1.83%    1.72%    100%
30D           26.36%    41.84%   25.33%   0.42%    0.20%    0.18%    3.92%    1.75%    100%
60D           7.23%     12.83%   35.13%   38.33%   0.27%    0.97%    3.11%    2.13%    100%
90D           6.20%     1.71%    5.40%    25.27%   46.50%   8.31%    5.18%    1.45%    100%
120+ D        2.91%     0.75%    0.48%    1.04%    75.56%   9.90%    6.83%    2.53%    100%
FCL           1.59%     0.06%    0.03%    0.00%    2.53%    85.27%   3.82%    6.70%    100%
No hist       10.43%    0.11%    0.23%    0.04%    0.79%    0.28%    86.51%   1.60%    100%
Total         82.12%    1.72%    0.86%    0.48%    2.24%    2.15%    8.59%    1.84%    100%

95% of current mortgages remain current the next month.

Slightly over 40% of 30-day delinquent loans remain 30-day delinquent the next
month (the mode), with roughly equal percentages transitioning into current and
60-day delinquent.

A disproportionate share of loans delinquent 60 or more days transitions into an
even worse performing state the next month.
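A row-percent transition matrix of this kind can be built from pairs of consecutive monthly statuses. The sketch below is a minimal illustration: the function name is hypothetical, and the state labels are whatever labels the input pairs carry.

```python
from collections import Counter, defaultdict


def transition_matrix(pairs):
    """Row-percent transition matrix from (state_this_month, state_next_month) pairs.

    Returns {from_state: {to_state: percent}}; each row sums to 100.
    """
    counts = defaultdict(Counter)
    for from_state, to_state in pairs:
        counts[from_state][to_state] += 1

    matrix = {}
    for from_state, row in counts.items():
        row_total = sum(row.values())
        matrix[from_state] = {
            to_state: 100.0 * n / row_total for to_state, n in row.items()
        }
    return matrix
```

Each row is normalized by its own total, which is why the rows of the table sum to 100% while the columns do not.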
Example 2: Loan performance transition matrix (continued)
Firsts without seconds cure more frequently
[Figure: panels of monthly transition rates (percent) among the Current, 30 D, 60 D, 90 D and 120+ D states, June 2006 to June 2009. Series: Firsts no Second, Firsts with Second.]
Example 3
Credit quality of originations
Vantage Score Distribution for Purchase and Refinance Originations
[Figure: box plots of Vantage Score (500 to 1000) by origination date, 2003 H1 to 2010 H2. Series: Purchase, Refinance.]
Note: The data are weighted values from the NMDB and include jumbo loans. Purpose is identified using credit bureau and HMDA data. The box represents
the middle 50% of the observations, the median is marked by the white line in the box and the dotted lines extend to the 5th and 95th percentiles. The widths
of the boxes are proportionate to the volume of loans.

Score distributions are the tightest (lowest risk) since 2003.
» The Vantage score cutoff, as measured by the 5th percentile of the score distribution, is currently
set higher for both purchase and refinance loans than at any point since 2003.

Refinance mortgages appear to face an especially high score cutoff.
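The score cutoff above is read off as the 5th percentile of a weighted score distribution. The sketch below computes weighted percentiles and the five statistics drawn in the box plots. The function names are hypothetical, and it uses a simple "smallest value reaching the cumulative weight share" rule with no interpolation; the percentile method actually used is not stated.

```python
def weighted_percentile(values, weights, pct):
    """Smallest value whose cumulative weight share reaches pct (0 to 100)."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal length")
    pairs = sorted(zip(values, weights))
    threshold = sum(w for _, w in pairs) * pct / 100.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]  # guards against floating-point shortfall


def box_summary(values, weights):
    """The statistics shown in each box: 5th, 25th, 50th, 75th, 95th percentiles."""
    return {p: weighted_percentile(values, weights, p) for p in (5, 25, 50, 75, 95)}
```

Comparing the 5th-percentile entry across origination periods gives the cutoff comparison made on this slide.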
Example 3 continued
Credit quality of originations—GSE comparison
Vantage Score Distribution for Conventional, Conforming GSE and non-GSE Purchase and Refinance Originations
[Figure, two panels: box plots of Vantage Score (500 to 1000) by origination date, 2003 H1 to 2010 H2. Series: GSE Purchase, non-GSE Purchase, GSE Refinance, non-GSE Refinance.]
Note: The market data are weighted values from the NMDB and exclude jumbo and FHA/VA loans. Purpose is identified using credit bureau and HMDA data. The box represents
the middle 50% of the observations, the median is marked by the white line in the box and the dotted lines extend to the 5th and 95th percentiles. The widths
of the boxes are proportionate to the volume of loans.

The credit quality of GSE loans significantly exceeds that of non-GSE loans,
especially for purchase money mortgages.
Example 4: Market comparisons
Comparison of FHA Originations
Vantage Score Distribution for Non-FHA Loans and FHA Loans
[Figure: box plots of Vantage Score (rescaled, 600 to 800) by origination date, 2000 H1 to 2011 H2. Series: Non-FHA/VA, FHA/VA.]
Note: The market data are weighted values from the NMDB and include jumbo loans. The box represents the middle 50% of the observations, the median is marked by the white line in the box
and the lines extend to the 5th and 95th percentiles. The widths of the boxes are proportionate to the volume of loans.
• The credit quality of FHA/VA market originations is consistently lower
than that of non-FHA/VA market originations.
• This difference in quality diminished somewhat during the height of
the boom (2004 through 2006), and has increased since 2007.
Example 4: Market comparisons (continued)
Monitoring and benchmarking FHA
• Monitoring—As of June 2010 (for loans originated since 2003):
– 13.4% of FHA loans were either in a state of delinquency or were closed with a
loss.
– 11.5% of all open FHA loans were in a state of delinquency.
– Comparable figures for VA were 8.5% and 6.9%, respectively.
– Comparable figures for RHS were 6.9% and 6.5%, respectively.
• Benchmarking—Controlling for loan size, geography (state) and
cohort:
– FHA is underperforming. The average delinquency rate of loans with FHA’s mix
of loan size, state and cohort is 7.9% and 6.2%, respectively.
– FHA’s worst performing book year is 2007, with an “excess delinquency rate” of
12.4% above average.
– Newly eligible FHA loans (above old limits) are performing worse than market by
about 4 percentage points. However, this may be a market effect: FHA loans in the
same markets but below old limits have about the same excess delinquency.
– VA and RHS are performing as predicted (+/- 0.5 percentage points).
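One common way to benchmark while controlling for loan size, state and cohort is to reweight market delinquency rates to the program's own mix of loans and compare the result with the program's actual rate. The sketch below illustrates that idea only. The slides do not describe the method actually used, and the cell definition and function names are assumptions.

```python
def benchmark_rate(program_counts, market_rates):
    """Market delinquency rate reweighted to the program's loan mix.

    program_counts: {cell: number of program loans in the cell}
    market_rates:   {cell: market delinquency rate in the cell, 0 to 1}
    A cell is a hypothetical (loan size band, state, cohort) tuple.
    """
    total_loans = sum(program_counts.values())
    if total_loans == 0:
        raise ValueError("program has no loans")
    weighted = sum(n * market_rates[cell] for cell, n in program_counts.items())
    return weighted / total_loans


def excess_delinquency(actual_rate, program_counts, market_rates):
    """Actual program rate minus the mix-adjusted benchmark rate."""
    return actual_rate - benchmark_rate(program_counts, market_rates)
```

A positive excess means the program's loans are performing worse than market loans with the same mix of size, state and cohort.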