MICS Data Processing Workshop

Multiple Indicator Cluster Surveys
Data Processing Workshop
Overview of Data Processing
Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires
• Main characteristics of the MICS CAPI system
• Creating analysis files
• Data archiving
MICS Data Processing System:
Actors and Roles
– Country data processing manager and country team:
• Customization of data entry programs, data entry, editing, and
production of datasets
• Customization of tabulation syntaxes and tabulation
– Regional Office MICS Coordinator
• Coordination and supervision, organization of the Data Processing
– Regional Office Data Processing Consultant
• Technical support and review of customized programs and close
work with country teams
– HQ Data processing unit
• Development of standard programs, templates and coordination of
Data Processing workshops
MICS Data Processing System
• The data processing system can be divided
into the following phases:
– Customization of the MICS data entry/collection
program and tabulation syntaxes
– Establishing the data entry system locally
– Primary data processing (data entry/data collection)
– Secondary data processing (creating analysis files)
– Tabulation
– Archiving
Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires
MICS Data Processing System
• Designed to deliver the first results of a survey within several
weeks after the end of fieldwork
• Such rapid turnaround is possible when completed
questionnaires are entered simultaneously with survey fieldwork
• Data for each cluster is stored in a separate data file and is
processed as soon as all the questionnaires from a cluster are
returned from the field
• This approach breaks data processing down into discrete
segments and allows it to progress while fieldwork is ongoing
• By the time the last questionnaires are finished and returned
to headquarters, most of the data have already been processed
Primary Data Processing Flow
Main Data Entry
Structure Check
Verification Data Entry
Difference Listing
Backup Raw Data
Secondary Editing
Backup Final Data
Primary Data Processing
• Main data entry
– First time data is entered
• Structure check
– Checks structure of data files
• Verification data entry
– Second time data is entered
• Difference listing
– Two data files are compared; differences resolved
Primary Data Processing
• Raw data backup
– Verified data are backed up to a directory
containing raw data files
• Secondary editing
– Complex inconsistencies are investigated and
resolved where possible
• Final data backup
– Edited data are backed up to a directory
containing edited data files
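The difference listing step can be illustrated with a minimal Python sketch. It assumes each keyed questionnaire is held as a dictionary of variable values, indexed by a (cluster, household) identifier. The function name, the identifiers and the variable names (HH11, HH12) are illustrative assumptions; the MICS system itself relies on CSPro programs, not on this code.

```python
def difference_listing(main, verification):
    """Compare two keyings of the same cluster and list every mismatch.

    Both arguments map a questionnaire ID to a dict of {variable: value}.
    Returns a sorted list of (questionnaire_id, variable, main_value,
    verification_value) tuples; None marks a value missing from one file.
    """
    differences = []
    for qid in sorted(set(main) | set(verification)):
        main_rec = main.get(qid, {})
        veri_rec = verification.get(qid, {})
        for var in sorted(set(main_rec) | set(veri_rec)):
            a, b = main_rec.get(var), veri_rec.get(var)
            if a != b:
                differences.append((qid, var, a, b))
    return differences


# Hypothetical data: cluster 1, households 1 and 2.
main_entry = {
    (1, 1): {"HH11": 5, "HH12": 1},
    (1, 2): {"HH11": 3, "HH12": 1},
}
verification_entry = {
    (1, 1): {"HH11": 5, "HH12": 1},
    (1, 2): {"HH11": 8, "HH12": 1},  # keying disagreement on HH11
}

for qid, var, a, b in difference_listing(main_entry, verification_entry):
    print(f"questionnaire {qid}: {var} main={a} verification={b}")
```

Each listed difference would then be checked against the paper questionnaire and corrected in whichever file is wrong.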
Data Processing Personnel
• Questionnaire administrators
• Data entry operators
• Secondary editors
• Data processing supervisor
Questionnaire Administrators
• Receive clusters from the field
• Check that all questionnaires are present
• Check that questionnaires are ready to enter
• Check that HH and individual questionnaires are in
the proper order, and rearrange them if not
• Keep track of location of all clusters
• Should follow interviewer training
Data Entry Operators
• Enter main data
• Enter verification data
• Resolve differences between files
• Must follow interviewer training
• Must be familiar with the questionnaires
Secondary Editors
• Investigate complex inconsistencies
• Advise the supervisor whether and how to resolve them
• Review editing guidelines
• Must be present during interviewer training
• Need excellent understanding of questionnaire
and goals of survey
Data Processing Supervisor
• Resolves data entry problems
• Maintains programs
• Oversees entire data processing system
• Must be present during interviewer training
• Must have excellent grasp of questionnaire
• Must have programming skills in SPSS and CSPro
Questionnaire Administrator Training
• Review list of checks in data processing
• Give QA several clusters and check work
• Establish questionnaire storage procedures
Data Entry Training
• Begin when you have one cluster for each data entry operator
• Allows you to
– Train data entry operators
– Debug programs
• Practice verification at the same time
• When you have finished
– Fix entry programs
– Delete data files
Secondary Editor Training
• Wait until you have 3-4 clusters double-entered
• Give secondary editors
– Copy of editing guidelines
– A cluster’s error listing
– The cluster’s questionnaires
• Review work with secondary editor
• Try to schedule a day when data entry operators
aren’t working
Data Processing Equipment
• Data entry machines
– Windows 2000, XP, Vista, 7 or 8
• Supervisor’s machine
– Windows 2000, XP, Vista, 7 or 8
– Available disk drive space for all software and
MICS data files
• Uninterrupted power supplies (UPS)
• Network
• Surge Protectors
Data Processing Equipment
• A printer
• Toner cartridges/printer ribbons
• Flash drives (if network is not established)
• Green pens
Data Processing Rooms
• Data Entry
– Desk space for the monitor, keyboard, and stack
of keyed and unkeyed questionnaires
– Area to store “in-progress” clusters
• Editing
– Quiet space for editors to work
• Questionnaires
– Must contain means (e.g., shelving) to organize
questionnaires by cluster
Server and Keyer’s Directory Structure
DATA contains data from main entry
ENTRY contains data entry programs
VERI contains data from verification entry
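As a rough illustration, the three directories above could be created with a short Python script. The directory names come from the slide; the base folder name "MICS" and the function name are assumptions made for this sketch only.

```python
from pathlib import Path
import tempfile

# Directory names taken from the slide; the base folder is an assumption.
KEYER_DIRS = ("DATA", "ENTRY", "VERI")


def create_keyer_directories(base):
    """Create the DATA, ENTRY and VERI folders under `base` if missing."""
    base = Path(base)
    created = []
    for name in KEYER_DIRS:
        folder = base / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Demonstration in a throwaway temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for folder in create_keyer_directories(Path(tmp) / "MICS"):
        print(folder.name, folder.is_dir())
```

Because existing folders are left untouched, the same script can be run safely on every data entry machine.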
Supervisor’s Directory Structure
backup of verified data
dictionaries used in all programs (not just entry)
contains data entry programs
programs to transfer data from CSPro to SPSS
a copy of all edited data
GPS data entry program (if applicable)
contains all unedited data (just after differences resolved)
all programs not related to data entry phase
contains all SPSS programs used for export and tabulation
contains weights spreadsheet and data files
Data Entry
• When a cluster arrives, cluster acceptance is done: all the
questionnaires are counted and recorded on the cluster tracking
form (CTF) according to their type and result code.
• The supervisor records the information from the CTF in the
system. The cluster is then ready for main data entry. The
supervisor gives it to the first keyer and records the keyer’s
information in the system and on the CTF.
• The first keyer enters all the questionnaires of the cluster.
• The supervisor runs the structure check. If all questionnaires
have been entered, the supervisor gives the cluster to the second
keyer for verification and records this information in the system
and on the CTF.
• The second keyer enters all the questionnaires of the cluster.
• The supervisor verifies the main and verification data from the
supervisor menu. If there are typing errors, the keyers correct
them at their own computers. If there are no problems, the raw
data are backed up. The system supplies the date automatically;
it should be written on the CTF.
• After the raw data are backed up, secondary editing is done from
the supervisor menu. Following the Editing Manual, the secondary
editor makes the necessary changes to the data with the “Modify
the data” option in the supervisor menu.
• The final data are backed up (“Back up the final data”).
• The supervisor follows the same procedures for every cluster.
• When all procedures are completed for all clusters, the data are
exported to SPSS.
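The per-cluster sequence in this flow can be sketched as a small tracker that only lets a cluster move to the next stage in order. This is a hypothetical illustration, not part of the MICS programs: the stage names follow the Primary Data Processing Flow slide, while the class and method names are invented for the example.

```python
from datetime import date

# Stage names follow the Primary Data Processing Flow slide.
STAGES = [
    "Main Data Entry",
    "Structure Check",
    "Verification Data Entry",
    "Difference Listing",
    "Backup Raw Data",
    "Secondary Editing",
    "Backup Final Data",
]


class ClusterTracker:
    """Hypothetical electronic counterpart of one cluster tracking form."""

    def __init__(self, cluster_number):
        self.cluster_number = cluster_number
        self.completed = []  # list of (stage, date completed)

    def next_stage(self):
        """Return the next stage to perform, or None when all are done."""
        if len(self.completed) < len(STAGES):
            return STAGES[len(self.completed)]
        return None

    def complete(self, stage, when=None):
        """Record a finished stage, refusing any out-of-order step."""
        expected = self.next_stage()
        if stage != expected:
            raise ValueError(
                f"cluster {self.cluster_number}: expected {expected!r}, got {stage!r}"
            )
        self.completed.append((stage, when or date.today()))

    def ready_for_export(self):
        """True once every stage, including the final backup, is recorded."""
        return self.next_stage() is None


tracker = ClusterTracker(cluster_number=1)
tracker.complete("Main Data Entry")
tracker.complete("Structure Check")
print(tracker.next_stage())
```

Export to SPSS would then be allowed only when `ready_for_export()` is true for every cluster.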
Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires
• Main characteristics of the MICS CAPI system
Mobile data collection
• In recent years we have seen the
development of many innovative data
collection tools using handheld Pocket PC
personal digital assistants (PDAs),
smartphones or tablet personal computers.
Tablet Personal Computer
• Tablet-sized computer with the key features
of a full-size personal computer
• Various operating systems
• Computer vs. cell
Personal Digital Assistants
• Handheld computers
• Various operating systems
– Windows mobile
– Palm
– Others (e.g. iPhone, Nokia)
• Cell phone capability
– Plus: good communications
– Minus: security
Tablet vs PDA
• Choosing a tablet over a PDA
– Tablets have much larger screens and better resolution, with
more space for on-screen typing, making things easier to see
and more useful for data entry
– Tablets and PDAs have similar battery life, though PDAs used
for making and receiving calls will need more frequent
charging
– Better security
Hardware requirements for CSPro
• Required configuration (tablets):
Full Microsoft Windows 7 or 8 (no Windows RT tablets)
• Required configuration (PDAs):
Windows Mobile versions 5 and 6
(note that Unicode is not supported)
• All applications to collect and administer data are
written in CSPro
• Three systems:
– Interviewers – data collection – Tablet/PDA
– Supervisors – data monitoring and control –
– Central Office – centralized data and fieldwork
monitoring – PC
• 2 weeks training (paper questionnaires)
including 3 days field exercise
+ 1 week extra training (PDAs/Tablets)
+ 4-5 days extra field exercise
• High Level Technical Support
Directories on Interviewer's Tablet
\My Documents\MICS5\
• Data: Data from the questionnaires as collected during the interview
• Dicts: Dictionaries describing the data structures
• Entry: Questionnaire and Interviewer’s menu programs
• Ref: Reference data files used by the programs
• Utility: Utility programs used for Bluetooth transfer, compressing files,
upgrading programs, etc.
• Work: Working files used by the menu and programs
\SD Card\
• Backup: Backup copy of the data files from Data created after the
Directories on Supervisor’s Tablet
\My Documents\MICS5\
• Receive: Directory for data received from the interviewers
• Temp: Temporary directory for working files transferred from
interviewers, prior to transferring to the Receive directory
• Work: Working data files
• Reports: Report files created by supervisor applications
Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires
• Main characteristics of the MICS CAPI system
• Creating analysis files
Secondary Data Processing
• Exporting data from CSPro
– Create SPSS data file and syntax file from CSPro
data file and dictionary
• Importing data into SPSS
– Executing syntax file created by CSPro
• Recoding variables
– Creating new variables and recoding old variables
Secondary Data Processing
• Adding sample weights
– Sample weights are added from the weights spreadsheet
• Adding wealth index
– Wealth index calculated then added to files
• Adding GPS data
– Geographic location data added to files
• Tabulation
– Tables are generated from the analysis files
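The "adding sample weights" step is essentially a lookup by cluster. The Python sketch below shows the idea under stated assumptions: one weight per cluster, records held as dictionaries, and illustrative variable names (HH1 for cluster, HH2 for household, hhweight for the weight). In the MICS system this step is carried out with SPSS programs, so the code is only an illustration of the logic.

```python
def add_sample_weights(records, weights_by_cluster, cluster_var="HH1",
                       weight_var="hhweight"):
    """Return copies of `records` with the cluster's weight attached.

    Raises KeyError if a record belongs to a cluster with no weight, so a
    gap in the weights table is noticed rather than silently ignored.
    """
    weighted = []
    for record in records:
        cluster = record[cluster_var]
        if cluster not in weights_by_cluster:
            raise KeyError(f"no sample weight for cluster {cluster}")
        weighted.append({**record, weight_var: weights_by_cluster[cluster]})
    return weighted


# Hypothetical household records and weights.
households = [
    {"HH1": 1, "HH2": 1},
    {"HH1": 1, "HH2": 2},
    {"HH1": 2, "HH2": 1},
]
weights = {1: 0.85, 2: 1.20}

for row in add_sample_weights(households, weights):
    print(row)
```

Failing loudly on a missing cluster is a deliberate choice here: an unweighted record would otherwise distort the tabulations without any warning.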
Content of the Presentation
• Overview of the MICS data processing system
• Data processing using paper questionnaires
• Main characteristics of the MICS CAPI system
• Creating analysis files
• Data archiving
Data archiving: rationale
• Collecting data is expensive. Data should be used
beyond producing a basic report.
• Survey microdata are valuable resources for
government departments and academic researchers.
• Survey data constitute valuable and irreplaceable
assets which should be managed in a way that
encourages their widest possible use and re-use.
• At the same time, data collectors’ main focus should
be protecting respondents while making microdata
