The QC-App add-on “Generate Data Dictionary” will generate a dbGaP-formatted data dictionary template that you can edit.

  1. Make sure the Data File that you chose earlier is the correct one. You can either use that file or choose a new one.
  2. Enter a comma-delimited list of the column names that do not contain custom encoded values, that is, values you would otherwise need to describe manually. These columns require less manual editing because they have a simple data type. Example: Age is a non-encoded numeric column, but Gender (male, female, other) is a column with encoded values. In this version of the tool, values coded with a standard terminology such as ICD10 are not counted as “encoded”, because they do not require manual editing of the data dictionary. For the example data file found here, you would enter “SUBJECT_ID, AGE, ICD9_DIAGNOSIS”.
  3. Add a name for the data dictionary. Choose a name without spaces.
  4. Click “Create Data Dictionary”.
  5. Inspect the rows, each of which corresponds to a column in the data file. If you need to add more non-encoded column names, do it now and click “Create Data Dictionary” again. Repeat steps 4 and 5 until only the columns that really contain encoded values are listed as encoded.
  6. You will see a table with the data dictionary information. Wherever a cell shows a “?” symbol on an orange background, edit the encoded values or the column description.
  7. Add units for columns whose values are otherwise ambiguous, such as “age”, which may be recorded in years or, for infants, in weeks.
  8. Once you have edited every flagged cell, the table will no longer contain any orange cells.
  9. Click the Save button to save the data dictionary file.
  10. Download the data dictionary as a CSV file and then move it to the same location as the other files.
  11. Click the back button and upload the file to the QC-App.
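
If the data file has many columns, a small script can help draft the comma-delimited list of non-encoded column names before you enter it in the add-on. The following is a minimal sketch and not part of the QC-App: the helper `suggest_non_encoded_columns`, its `max_distinct` threshold, and the sample rows are all hypothetical. It uses a rough heuristic (a column is suggested when all of its values are numeric or when it has many distinct values), so review the result by hand. A numeric column that holds codes such as 0/1, for example, would be suggested incorrectly.

```python
import csv
import io


def _is_number(text):
    """Return True if the text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def suggest_non_encoded_columns(csv_text, max_distinct=10):
    """Suggest columns that look non-encoded (hypothetical heuristic helper).

    A column is suggested when every non-empty value is numeric, or when it
    has more than `max_distinct` distinct values (IDs, terminology codes).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    distinct = {name: set() for name in columns}
    for row in reader:
        for name in columns:
            value = (row.get(name) or "").strip()
            if value:
                distinct[name].add(value)

    suggestions = []
    for name in columns:
        values = distinct[name]
        all_numeric = bool(values) and all(_is_number(v) for v in values)
        if all_numeric or len(values) > max_distinct:
            suggestions.append(name)
    return suggestions


# Made-up sample rows for illustration only; not the example data file.
sample = """SUBJECT_ID,AGE,GENDER,ICD9_DIAGNOSIS
S001,34,male,250.00
S002,51,female,401.9
S003,47,other,V58.69
S004,29,female,428.0
"""

non_encoded = suggest_non_encoded_columns(sample, max_distinct=3)
print(", ".join(non_encoded))  # prints: SUBJECT_ID, AGE, ICD9_DIAGNOSIS
```

The printed string has the same form as the list entered in step 2. The threshold is set to 3 here only because the sample is tiny. On a real file a larger value is likely more appropriate, and any column the heuristic gets wrong can still be corrected in the add-on by repeating steps 4 and 5.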