The QC process is run with the FAIRLYZ QC App, which is installed locally in the compute environment where the data resides and is ready to be analyzed. The current version of FAIRLYZ QC requires filenames that contain no spaces.
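Because filenames with spaces are rejected, it can help to check the data directory before starting. The following is a minimal sketch and not part of the QC App: the function names `names_with_spaces`, `suggest_name`, and `check_directory` are hypothetical, and replacing spaces with underscores is only one possible renaming convention.

```python
from pathlib import Path


def names_with_spaces(names):
    """Return the filenames that contain at least one space."""
    return [name for name in names if " " in name]


def suggest_name(name):
    """Suggest a replacement name: runs of whitespace become one underscore."""
    return "_".join(name.split())


def check_directory(directory="."):
    """List regular files in `directory` whose names contain spaces."""
    names = sorted(p.name for p in Path(directory).iterdir() if p.is_file())
    return names_with_spaces(names)


if __name__ == "__main__":
    # Assumption: the data files live in the current working directory.
    for name in check_directory("."):
        print(f"{name!r} contains spaces; consider renaming to {suggest_name(name)!r}")
```

The script only reports offending names and does not rename anything, so existing files stay untouched until you decide how to rename them.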

Review the Previous Step

Install the FAIRLYZ QC App. Ask an IT person to help you install the Docker application.

Quality Control (QC) Validation

Use this sample data for testing.

The screenshot below shows a QC App page after the files were loaded and validated.

  1. Review that the read-only checkbox matches your type of data. If not, return to FAIRLYZ and open the correct Data Card.
  2. Required files must have filenames that contain no spaces. You will upload these files:
    • Data File: An Excel- or CSV-formatted data file. Only CSV works with the data dictionary generator.
    • Data Dictionary File: A dbGaP-formatted data dictionary in CSV format.
    • Mapping File: Required for demographic data, optional otherwise. A mapping file maps data to BioPortal ontology terms/classes.
  3. Select a Data File, which must be located in the directory where the QC App installation command was executed.
  4. A Data Dictionary File and a Data Mapping File can either be newly generated or, if they already exist, selected from the directory where the QC App installation command was executed.
    • To test, use the example files here, which include a filled-in mapping file.
  5. Generate a Data Dictionary if needed by clicking the Pencil icon next to the Data Dictionary selection box.
  6. If you prefer stricter rules than dbGaP requires, check the “strict QC” checkbox.
  7. Select date information. This is necessary for age and gender calculations, and for tracking EHR or clinical events:
    • Select “Same Observation Date for All” when all ages were collected on the same day.
    • Otherwise, select the column that contains the date associated with the age. This column should contain a normalized date, which is the number of days since “Start of study”. See the dbGaP Submission Guidelines.
  8. Run “Dictionary QC” first to verify that the data dictionary is correct and can be used to generate a mapping.
  9. Review the report and the QC score.
  10. If necessary, correct any errors, then re-run QC to improve your score. 
  11. Generate a Mapping File either with OpenAI, by clicking the “OpenAI Mapping” button, or manually, by editing a template. Find the template here. Move the file to the data folder and upload it to the QC App.
  12. Run “Mapping QC”, which parses the mapping file for completeness. For example, for demographic information, it checks that gender, age, race, and ethnicity are provided.
  13. Once you are satisfied, sync the information with the FAIRLYZ registry. The data itself is not uploaded or shared. This is the information that is synced:
    • QC score, summary statistics, and QC checks.
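
The normalized date mentioned in step 7 (the number of days since “Start of study”) can be computed from calendar dates before the file is loaded. This is a minimal sketch under stated assumptions: the function name `days_since_start` is hypothetical, the study start date is assumed to be known, and the start date itself is assumed to count as day 0. Check the dbGaP Submission Guidelines for the exact convention your study must follow.

```python
from datetime import date


def days_since_start(observation, study_start):
    """Return the number of days between study_start and observation.

    Assumption: the study start date itself is day 0. An observation
    before the start date yields a negative number.
    """
    return (observation - study_start).days


# Example: a study that started on 1 January 2021.
study_start = date(2021, 1, 1)
print(days_since_start(date(2021, 3, 1), study_start))  # prints 59
```

Storing the offset instead of the calendar date keeps the time intervals between events without exposing the real dates.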

Warning

Select only files that are used for the current study, so that the QC score is accurate and not polluted with outlier data. To run QC with other files, replace the files and run QC again.