The QC process is run with the FAIRlyz QC toolkit, which is installed locally in the compute environment where the data resides and is ready to be analyzed. The current version of FAIRlyz QC requires that filenames contain no dashes.
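Because of the no-dash requirement, it can help to scan the data folder before opening the QC Tool. The following is a minimal sketch and not part of the FAIRlyz QC toolkit: the function names are hypothetical, "dash" is assumed to mean the ASCII hyphen-minus character, and replacing dashes with underscores is only one possible renaming convention.

```python
from pathlib import Path


def files_with_dashes(directory: str) -> list[str]:
    """Return the names of regular files in `directory` that contain a dash.

    Assumption: "dash" means the ASCII hyphen-minus character "-".
    """
    return sorted(
        p.name for p in Path(directory).iterdir()
        if p.is_file() and "-" in p.name
    )


def dash_free_name(filename: str) -> str:
    """Suggest a dash-free name by swapping dashes for underscores.

    Assumption: underscores are acceptable; this is a convention chosen
    for the sketch, not a rule stated by the toolkit.
    """
    return filename.replace("-", "_")


if __name__ == "__main__":
    # List offending files in the current directory with a suggested new name.
    for name in files_with_dashes("."):
        print(f"{name} -> {dash_free_name(name)}")
```

The sketch only reports suggestions and renames nothing, so you can review the list before changing any files.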

Review the Previous Step

Install the FAIRlyz QC Tool. Ask an IT person to help you install the Docker application.

Quality Control (QC) Validation

Use this sample data for testing.

Below is a screenshot of a QC Tool page after the files were loaded and validated.

Steps

  1. Check that the Data Type shown at the top of the page matches your type of data. If it does not, return to FAIRlyz and open the correct Data Card.
  2. Required files must have filenames with no dashes in them. You will upload these files:
    • Data File: An Excel or CSV-formatted data file. Only CSV works with the data dictionary generator.
    • Data Dictionary File: A dbGaP-formatted data dictionary, in CSV format. It will be evaluated with the dbGaPCheckup tool.
    • Mapping File: Required for Demographic Data, optional otherwise. A mapping file maps data to BioPortal ontology terms/classes.
  3. Select a Data File, which must be located in the same directory where the QC App installation command was executed.
  4. Upload or Create Data Dictionary: A Data Dictionary File can either be newly generated, or if it exists, it can be selected from the directory where the QC App installation command was executed.
    • Test with example files that include a mapping file with filled information here.
  5. Generate a Data Dictionary if needed by clicking the “+” icon next to the Data Dictionary selection box.
  6. If you prefer stricter rules than dbGaP requires, check the “strict QC” checkbox. The stricter rules require Min, Max, and Type columns.
  7. Select date information. This is necessary for age and gender calculations, and for tracking of EHR or clinical events:
    • Select “Same Observation Date for All” when age was collected on the same day for all records.
    • Otherwise, select the column that contains the date associated with the age. This column should contain a normalized date, which is the number of days since “Start of study”. You must select a data file before you can select normalized dates. See the dbGaP Submission Guidelines.
  8. Run “Dictionary QC” first to verify that the data dictionary is correct. This step is required to generate a mapping file.
  9. Review the report and the QC score.
  10. If necessary, correct any errors, then re-run QC to improve your score. 
  11. Upload or Create a Data Mapping File. The Mapping File can either be newly generated, or if it exists, it can be selected from the directory where the QC App installation command was executed.
  12. Generate a Mapping File either with OpenAI, by clicking the “OpenAI Mapping” button, or manually, by editing a template. Find the template here. Move the file to the data folder and upload it to the QC App.
  13. Run “Mapping QC”, which parses the mapping file to identify proper annotations. For example, for demographic information, it checks that gender, age, race, and ethnicity are provided.
  14. Once you are satisfied, sync the information with the FAIRlyz registry. The data itself is not uploaded or shared. This is the information that is synced:
    • QC score, summary statistics, and QC checks.
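The normalized date described in step 7 is the number of days since the start of the study. As an illustration only, the sketch below derives that value from calendar dates using Python's standard library. The function name, the study start date, and the visit dates are all hypothetical, and the sketch does not assume how the QC Tool treats dates that fall before the study start.

```python
from datetime import date


def days_since_start(event_date: date, study_start: date) -> int:
    """Return the number of whole days from study_start to event_date."""
    return (event_date - study_start).days


# Hypothetical study start and visit dates, for illustration only.
study_start = date(2020, 1, 15)
visits = [date(2020, 1, 15), date(2020, 2, 14), date(2021, 1, 15)]

# One normalized value per visit; the start date itself maps to 0.
normalized = [days_since_start(v, study_start) for v in visits]
print(normalized)  # prints [0, 30, 366] (2020 is a leap year)
```

Storing day offsets instead of calendar dates keeps the timing of each observation relative to the study without recording the actual dates in the data file.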

Warning

Please only select files that are used for the current study to ensure that the QC score is accurate and not polluted with outlier data. To rerun the command with other files, return to the Data Overview page in FAIRlyz.com, reopen the QC Tool page, replace the files, and run QC again.

Report

You should see a report with the QC score at the bottom. An example is shown below.