Process |
Expected time for completion |
Deadline |
Data specification and
request
- Addition of new variables / codes to data specifications
- Distribution of data specifications and request for data
to centres.
The request is for data for all patients ever seen for care
at a centre
|
6 weeks |
19/12/2008 |
Checking general integrity
of data from sites (not of individual records)
- Recode/manipulate centre-specific data so that it conforms
with data specifications
- Tabulate all variables to identify obvious errors
- Check new data against the previous data submission for
each centre
- Resolve any problems arising from above with local data
manager
- Clean ART data from each centre via SAS program: includes
removal of overlapping dates and short breaks (<14 days)
|
3.5 months |
20/03/2009 |
Import of data into database
- Prepare data files as tab delimited text files, with dates
in dd/mm/yyyy format, and include leading zeros
- Import data into database
|
3 weeks |
09/04/2009 |
Data cleaning of individual
records
- Print out within-centre data queries and consistency checks
using pre-existing queries and reports at MRC
- Liaise with data managers/assistants/research nurse or
visit centre to resolve errors
- Individual edits applied to the database (maintains audit
trail)
|
2 months |
05/06/2009 |
Notify centres:
- Send summary of centre-specific data and the results of
the cleaning process, i.e. how many inconsistencies were identified
and resolved
- Feed back results of data checks so that centres can update their
local database
|
1 week |
12/06/2009 |
De-duplication
- Run de-duplication program
- Manually resolve any ambiguous matches
- Merge demographic, AIDS, ART and other data files
|
6 weeks |
31/07/2009 |
Data export
- Export data as text files to study statistician
- Final SAS manipulation of files (by study statistician)
to resolve outstanding data quirks
- Export selected data to other statisticians as required
- Notify sites that they may have an export of their own (cleaned)
data if required
|
1 month |
31/08/2009 |
| Total time for dataset preparation |
9 months |
|
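
The ART cleaning step in the timetable (removal of overlapping dates and short breaks of under 14 days) is carried out by a SAS program that is not reproduced here. As an illustration only, the idea might be sketched in Python as follows. The function name `clean_episodes`, the (start, stop) episode layout, and the reading of a "short break" as a gap of fewer than 14 days between one stop date and the next start date are all assumptions made for this sketch, not details of the actual program.

```python
from datetime import date, timedelta

# Assumption: a "short break" is a gap of fewer than 14 days between
# the stop date of one episode and the start date of the next.
SHORT_BREAK = timedelta(days=14)


def clean_episodes(episodes, short_break=SHORT_BREAK):
    """Merge overlapping episodes and bridge gaps shorter than short_break.

    episodes: iterable of (start, stop) date pairs for one patient and
    one drug (hypothetical layout, chosen for illustration).
    Returns a sorted list of non-overlapping (start, stop) pairs.
    """
    cleaned = []
    for start, stop in sorted(episodes):
        if stop < start:
            raise ValueError(f"stop {stop} precedes start {start}")
        if cleaned and start - cleaned[-1][1] < short_break:
            # Overlap (zero or negative gap) or a short break:
            # extend the previous episode instead of starting a new one.
            prev_start, prev_stop = cleaned[-1]
            cleaned[-1] = (prev_start, max(prev_stop, stop))
        else:
            cleaned.append((start, stop))
    return cleaned


# Made-up example data: one overlap, one 9-day break, one 30-day break.
example = [
    (date(2008, 1, 1), date(2008, 3, 1)),
    (date(2008, 2, 15), date(2008, 4, 1)),
    (date(2008, 4, 10), date(2008, 6, 1)),
    (date(2008, 7, 1), date(2008, 8, 1)),
]
result = clean_episodes(example)
# The first three episodes collapse into one; the last stays separate.
```

Sorting by start date first means each episode only has to be compared with the most recently kept one, so the overlap and short-break rules are applied in a single pass.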