The QC-App add-on “Generate Data Dictionary” will generate a dbGaP-formatted data dictionary template that you can edit.
- Make sure the Data File that you chose earlier is the correct one. You can either use that file or choose a new one.
- Enter a comma-delimited list of the columns that do not contain custom encoded values, that is, values you would otherwise need to describe manually. Such columns require less manual editing because they have a simple data type. Example: Age is a non-encoded numeric column, but Gender (male, female, other) is a column with encoded values. In this version of the tool, values drawn from a standard terminology such as ICD10 are not counted as “encoded” because they do not require manual editing of the data dictionary. For the example data file found here, you would enter “SUBJECT_ID, AGE, ICD9_DIAGNOSIS”.
- Add a name for the data dictionary. Choose a name without spaces.
- Click “Create Data Dictionary”.
- Inspect the rows, which correspond to the columns in the data file. If you need to add more non-encoded column names, do it now and click “Create Data Dictionary” again. Repeat this step and the previous one until only the columns that actually hold encoded values are shown as encoded.
- You will see a table with the data dictionary information. Wherever a cell shows a “?” symbol on an orange background, edit the encoded values or the column description.
- Add Units for columns whose values would otherwise be ambiguous, such as “age”, which can be in years or, for infants, in weeks.
- Once you have edited every flagged cell, the table will have no orange-colored cells.
- Click the Save button to save the data dictionary file.
- Download the data dictionary as a CSV file and then move it to the same location as the other files.
- Click the back button and upload the file to the QC-App.
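
To make the distinction between encoded and non-encoded columns more concrete, the following Python sketch mimics the steps above. It is not the QC-App's actual code. The function names, the `?` placeholder handling, the `code=?` layout of the VALUES cell, and the four-column subset of dictionary headers (VARNAME, VARDESC, UNITS, VALUES) are all assumptions made for illustration, and the sample rows are made up.

```python
import csv
import io

PLACEHOLDER = "?"  # assumed marker for cells that still need manual editing


def parse_column_list(text):
    """Split a comma-delimited list such as 'SUBJECT_ID, AGE' into clean names."""
    return [name.strip() for name in text.split(",") if name.strip()]


def build_template(rows, non_encoded):
    """Return one dictionary row per data-file column (hypothetical layout)."""
    columns = list(rows[0].keys()) if rows else []
    skip = set(non_encoded)
    template = []
    for col in columns:
        if col in skip:
            # Non-encoded column: no value list to describe.
            values = ""
        else:
            # Encoded column: list each distinct code with a placeholder meaning.
            distinct = sorted({r[col] for r in rows if r[col] != ""})
            values = "; ".join(f"{v}={PLACEHOLDER}" for v in distinct)
        template.append(
            {"VARNAME": col, "VARDESC": PLACEHOLDER, "UNITS": "", "VALUES": values}
        )
    return template


def cells_needing_edits(template):
    """List (column name, field) pairs that still contain the placeholder."""
    return [
        (row["VARNAME"], field)
        for row in template
        for field, value in row.items()
        if PLACEHOLDER in value
    ]


# Made-up sample data, only for illustration.
sample = (
    "SUBJECT_ID,AGE,GENDER,ICD9_DIAGNOSIS\n"
    "S1,34,male,250.00\n"
    "S2,51,female,401.9\n"
)
rows = list(csv.DictReader(io.StringIO(sample)))
non_encoded = parse_column_list("SUBJECT_ID, AGE, ICD9_DIAGNOSIS")
template = build_template(rows, non_encoded)

for row in template:
    print(row)
print(cells_needing_edits(template))
```

In this sketch only GENDER receives a value list to fill in, while every column still needs a description. An empty result from `cells_needing_edits` plays the role of a table with no orange cells left.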