Data Quality
Data needs (Variables)
Planning (Proposal)
Forms
Data collection
Raw data
Data management
Clean (error-free) data
Data analysis (+ stat test)
Information
Data presentation
Results
Quality of data
Planning the data needs of a study
Data collection
Data management (data processing)
Data Analysis
Data Management
(from raw data in forms to clean data)
- coding, data entry, checking
- recoding missing values, creating new variables
- file merging/splitting, file exporting
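The recoding and new-variable steps above can be illustrated with a minimal sketch. The field names, the missing-value code 99 and the age cut-off are assumptions made for this example, not part of the original material.

```python
# Hypothetical records as they might look after data entry.
# Field names and the missing-value code 99 are assumptions for illustration.
MISSING_CODE = 99

raw_records = [
    {"id": 1, "age": 23, "sex": 1},
    {"id": 2, "age": 99, "sex": 2},  # 99 was entered for "age not recorded"
    {"id": 3, "age": 41, "sex": 2},
]


def recode_missing(record, field, missing_code=MISSING_CODE):
    """Return a copy of the record with the missing-value code replaced by None."""
    cleaned = dict(record)
    if cleaned[field] == missing_code:
        cleaned[field] = None
    return cleaned


def add_age_group(record):
    """Create a new variable, age_group, from age (a missing age stays missing)."""
    cleaned = dict(record)
    age = cleaned["age"]
    if age is None:
        cleaned["age_group"] = None
    elif age < 25:
        cleaned["age_group"] = "<25"
    else:
        cleaned["age_group"] = "25+"
    return cleaned


# Recode first, then derive the new variable from the recoded value.
clean_records = [add_age_group(recode_missing(r, "age")) for r in raw_records]
for r in clean_records:
    print(r)
```

Recoding before deriving new variables matters here: if the order were reversed, the code 99 would be treated as a real age and placed in the "25+" group.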
Main Types of Errors in Data
Activities
Filling error: handwriting, missing value
Copying error: 1 → 7, 5 → 6, 0 → o
Transposition: 39 → 93
Coding error: 1 → 2, 0 → 2
Routing error: answers in the wrong order
Range error: value 3 for a variable coded (1, 2); age 35 yr in a 15-24 yr sample
Consistency error: Male, Parity = 1
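Range and consistency errors are the kinds that a program can detect automatically. The following is a minimal sketch using the examples from the list above. The coding (sex 1 = male, 2 = female), the 15-24 yr age range and the field names are assumptions for illustration.

```python
# Assumed coding scheme: sex 1 = male, 2 = female; study population aged 15-24.
VALID_SEX = {1, 2}
AGE_MIN, AGE_MAX = 15, 24
MALE = 1


def find_errors(record):
    """Return a list of range and consistency problems found in one record."""
    errors = []
    if record["sex"] not in VALID_SEX:
        errors.append("range: sex")
    if not AGE_MIN <= record["age"] <= AGE_MAX:
        errors.append("range: age")
    if record["sex"] == MALE and record["parity"] > 0:
        errors.append("consistency: male with parity")
    return errors


records = [
    {"id": 1, "sex": 2, "age": 19, "parity": 1},  # no error
    {"id": 2, "sex": 3, "age": 20, "parity": 0},  # 3 is not in (1, 2)
    {"id": 3, "sex": 2, "age": 35, "parity": 2},  # 35 yr outside 15-24 yr
    {"id": 4, "sex": 1, "age": 22, "parity": 1},  # male with parity 1
]

for rec in records:
    print(rec["id"], find_errors(rec))
```

Checks of this kind cannot catch copying or transposition errors that still produce a valid value (for example 19 entered as 91 would be caught, but 16 entered as 19 would not), which is one reason double entry is also used.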
Checking of Data Quality
Manual checking (during data collection)
– by interviewer on completion of each interview
– by supervisor on each day of data collection
– by researcher periodically on samples
Interactive checking (during data entry)
Double entry and validation (after data entry)
Batch checking (after data entry)
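Double entry and validation means entering the same forms twice and comparing the two files field by field. A minimal sketch of the comparison step follows. The record layout (a dictionary keyed by form ID) and the function names are assumptions for this example and do not reproduce how any particular data entry package implements it.

```python
def compare_entries(first, second):
    """List (form_id, field, first_value, second_value) for every disagreement
    between two independently entered copies of the same forms."""
    mismatches = []
    for form_id in sorted(first.keys() & second.keys()):
        for field in sorted(first[form_id]):
            a = first[form_id][field]
            b = second[form_id].get(field)
            if a != b:
                mismatches.append((form_id, field, a, b))
    return mismatches


def unmatched_ids(first, second):
    """Form IDs that appear in only one of the two files."""
    return sorted(first.keys() ^ second.keys())


# Hypothetical first and second entry of the same forms.
entry_1 = {101: {"age": 39, "sex": 1}, 102: {"age": 17, "sex": 2}}
entry_2 = {101: {"age": 93, "sex": 1}, 102: {"age": 17, "sex": 2},
           103: {"age": 20, "sex": 1}}

print(compare_entries(entry_1, entry_2))
print(unmatched_ids(entry_1, entry_2))
```

Each reported mismatch is then resolved by going back to the original paper form, not by choosing one of the two entered values.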
Correction of Errors
Raw data → Checking → Clean data
Errors found in checking → Correction or Omitting
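The two ways of handling a detected error, correction or omission, can be sketched as follows. The data layout and the function name `apply_decisions` are hypothetical. The sketch assumes that a correction is a value verified against the original form and that an omitted value is set to missing.

```python
def apply_decisions(records, corrections, omissions):
    """Return cleaned copies of the records.

    corrections: {(form_id, field): verified_value}
    omissions:   set of (form_id, field) whose value cannot be verified
    """
    cleaned = {form_id: dict(rec) for form_id, rec in records.items()}
    for (form_id, field), value in corrections.items():
        cleaned[form_id][field] = value
    for form_id, field in omissions:
        cleaned[form_id][field] = None  # omitted: treated as missing
    return cleaned


raw = {101: {"age": 93, "sex": 1}, 102: {"age": 17, "sex": 3}}
corrections = {(101, "age"): 39}   # paper form shows 39
omissions = {(102, "sex")}         # form illegible, value cannot be recovered

clean = apply_decisions(raw, corrections, omissions)
print(clean)
```

Keeping the raw records unchanged and writing decisions to a separate cleaned copy leaves a trail of what was corrected and what was omitted.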
Prevention of Errors
Good questionnaire design
Pre-coding
Consistent coding system
Training of interviewers & supervisors
Training of data entry staff
Collection and transfer of data collection forms
Checking of error at all phases of study
– Checking during data collection
– Checking at data entry
– Checking after data entry
Plan for Data Management
Number and type of data collection forms
Pre-coding/coding after data collection
Training of interviewers, supervisors and data entry staff
Manual/computer data entry (EpiData, Epi Info, Access, Excel)
Collection and transfer of data collection forms
Checking for Data Quality
- Checking of errors during data collection
- Checking of errors at data entry
(check computer program & double entry)
- Checking of errors after data entry
(cross-tabulation for all categorical variables and summary statistics for all numerical variables)
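Checking after data entry with cross-tabulations and summary statistics can be sketched with the Python standard library alone. The variable names and values below are invented for illustration. In practice the same tables would usually come from a statistical package.

```python
from collections import Counter
import statistics

# Hypothetical entered data: sex and smoker are coded 1/2, age is in years.
records = [
    {"sex": 1, "smoker": 1, "age": 19},
    {"sex": 2, "smoker": 2, "age": 22},
    {"sex": 2, "smoker": 1, "age": 35},
    {"sex": 3, "smoker": 2, "age": 17},
]


def cross_tab(rows, row_var, col_var):
    """Count records for each (row value, column value) combination."""
    return Counter((r[row_var], r[col_var]) for r in rows)


def summarize(rows, var):
    """Basic summary statistics for a numerical variable, ignoring missing values."""
    values = [r[var] for r in rows if r[var] is not None]
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
    }


table = cross_tab(records, "sex", "smoker")
for key in sorted(table):
    print(key, table[key])

print(summarize(records, "age"))
```

An unexpected cell in the cross-tabulation (here sex = 3) or an implausible minimum or maximum (here age 35 if the study covers ages 15-24) points to records that need to be traced back to the forms.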
Summary
To produce Clean Data ready for Analysis
Checking of Errors at all steps of study