The QC Tool add-on “Data Dictionary Designer” will generate a dbGaP-formatted data dictionary template that you can edit.

FAIRlyz Data Dictionary Designer

  • Focus: Generate or edit a data dictionary to comply with the dbGaP and dbGaPCheckup formats
  • Adds and supports these fields: VARNAME, VARDESC, UNITS, TYPE, VALUES
  • Encoded values: When TYPE is “encoded value”, the VALUES column expects a list of codes, each followed by an equal sign and the description of the code. Every entry follows the VALUE=MEANING format (e.g., 0=Yes, 1=No).
  • Non-encoded values: The VALUES field is left empty. In R, NA indicates a missing value, whereas an empty string (“”) might still be processed as text.
  • Missing information: Cells with missing information are marked with a question mark “?” and an orange background.
  • File format when saved: Saves fields with a question mark “?” as empty strings.
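
The save behavior described above can be illustrated with a minimal Python sketch. This is not the tool's own code: the function names `blank_placeholders` and `dictionary_to_csv` and the sample row are hypothetical, and the column order is assumed to match the field list above.

```python
import csv
import io

# Column order assumed from the field list in this guide.
FIELDS = ["VARNAME", "VARDESC", "UNITS", "TYPE", "VALUES"]


def blank_placeholders(row, placeholder="?"):
    """Return a copy of the row with '?' cells replaced by empty strings."""
    return {
        key: ("" if str(value).strip() == placeholder else value)
        for key, value in row.items()
    }


def dictionary_to_csv(rows):
    """Serialize data dictionary rows to CSV text, blanking any '?' cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(blank_placeholders(row))
    return buffer.getvalue()


# Hypothetical row: the description was never filled in, so it is still "?".
rows = [
    {"VARNAME": "AGE", "VARDESC": "?", "UNITS": "years", "TYPE": "integer", "VALUES": ""},
]
csv_text = dictionary_to_csv(rows)
print(csv_text)
```

In the printed CSV the VARDESC cell for AGE is empty rather than “?”, which is why it is worth filling in every orange cell before saving.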

See handling of encoded values in this guide.
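
As a rough illustration of the VALUE=MEANING format, the following Python sketch checks a list of entries and turns them into a code-to-meaning mapping. It is an assumption-laden example, not part of the tool: the function name `parse_encoded_values` is hypothetical, and each entry is assumed to arrive as a separate string.

```python
def parse_encoded_values(entries):
    """Convert entries such as '0=Yes' into a {code: meaning} dictionary.

    Raises ValueError for any entry that is not in VALUE=MEANING format.
    """
    mapping = {}
    for entry in entries:
        # Split on the first '=' only, so a meaning may itself contain '='.
        code, separator, meaning = entry.partition("=")
        if not separator or not code.strip() or not meaning.strip():
            raise ValueError(f"Not in VALUE=MEANING format: {entry!r}")
        mapping[code.strip()] = meaning.strip()
    return mapping


codes = parse_encoded_values(["0=Yes", "1=No"])
print(codes)
```

An entry without an equal sign, such as "Yes" on its own, raises a `ValueError`, which makes a cell that was left incomplete easy to spot.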

Steps

  1. Make sure the Data File that you chose earlier is the correct one. You can either use that file or choose a new one.
  2. Enter a comma-delimited list of the column names that do not contain custom encoded values. These columns require less manual editing because they have a simple data type; all other columns are treated as encoded, and you will need to describe their values manually. Example: Age is a non-encoded numeric column, but Gender (male, female, other) is a column with encoded values. In this version of the tool, columns coded with a standard terminology such as ICD10 are not counted as “encoded” because they do not require manual editing of the data dictionary. For the example data file found here, you would enter “SUBJECT_ID, AGE, ICD9_DIAGNOSIS”.
  3. Add a name for the data dictionary. Choose a name without spaces.
  4. Click “Create Data Dictionary”.
  5. Inspect the rows, which correspond to the columns in the data file. If you need to add more non-encoded column names, do so now and click “Create Data Dictionary” again. Repeat steps 4 and 5 until only the columns that really contain encoded values are shown as encoded.
  6. The table shows the data dictionary information. Edit the encoded values and the column descriptions wherever a cell shows a “?” symbol on an orange background.
  7. Add units for columns whose values would otherwise be ambiguous, such as “age”, which can be in years or, for infants, in weeks.
  8. After you have edited all the flagged data dictionary cells, the table will have no orange cells.
  9. Click the Save button to save the data dictionary file.
  10. Download the data dictionary as a CSV file and then move it to the same location as the other files.
  11. Click the back button and upload the file to the QC-App.
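
To make the idea behind steps 2 through 6 concrete, here is a minimal Python sketch of how a template could be derived from a data file. It only illustrates the concept and is not how the Data Dictionary Designer is implemented. The function name `build_template`, the sample data, the case-insensitive matching of column names, and the choice to leave TYPE as “?” for non-encoded columns are all assumptions.

```python
import csv
import io


def build_template(csv_text, non_encoded_names):
    """Build one data dictionary row per column of the data file.

    non_encoded_names is a comma-delimited string, as entered in step 2.
    Every other column is treated as encoded, and each distinct value
    becomes a 'CODE=?' entry whose meaning still has to be filled in.
    """
    non_encoded = {
        name.strip().upper() for name in non_encoded_names.split(",") if name.strip()
    }
    reader = csv.DictReader(io.StringIO(csv_text))
    records = list(reader)

    template = []
    for column in reader.fieldnames or []:
        row = {"VARNAME": column, "VARDESC": "?", "UNITS": "", "TYPE": "?", "VALUES": []}
        if column.strip().upper() not in non_encoded:
            # Collect the distinct non-empty codes found in this column.
            codes = sorted({(r.get(column) or "").strip() for r in records} - {""})
            row["TYPE"] = "encoded value"
            row["VALUES"] = [f"{code}=?" for code in codes]
        template.append(row)
    return template


# Hypothetical data file with two non-encoded columns and one encoded column.
sample = "SUBJECT_ID,AGE,GENDER\n1,34,male\n2,51,female\n3,47,male\n"
template = build_template(sample, "SUBJECT_ID, AGE")
for row in template:
    print(row)
```

In this sketch only GENDER receives encoded entries (female=?, male=?). If SUBJECT_ID or AGE were left out of the non-encoded list, every distinct ID or age would be listed as a code, which is the situation that step 5 asks you to correct.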