The FAIRLYZ Data QC score is a combined percentage reflecting the Findability, Accessibility, Interoperability, and Analyzability of data. Researchers’ feedback on data reuse is incorporated into a separate Reusability score.
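As a rough illustration of how the four 25% criteria described below could add up to the combined percentage, here is a minimal sketch. It assumes each criterion is expressed as a fraction between 0 and 1 and contributes up to 25 percentage points; the function name `data_qc_score` and the dictionary keys are hypothetical and are not taken from FAIRLYZ itself.

```python
# Hypothetical sketch: combine four criterion fractions into a 0-100 percentage.
# Assumption: each criterion is worth up to 25 percentage points.
CRITERION_WEIGHT = 25.0
CRITERIA = ("findable", "accessible", "interoperable", "analyzable")


def data_qc_score(fractions: dict) -> float:
    """Return the combined Data QC percentage from per-criterion fractions (0.0-1.0)."""
    total = 0.0
    for name in CRITERIA:
        fraction = fractions.get(name, 0.0)  # a missing criterion counts as 0
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {fraction}")
        total += CRITERION_WEIGHT * fraction
    return total


# Example: fully findable and accessible, half of the interoperability
# checks passed, no analyzability information provided yet.
print(data_qc_score({
    "findable": 1.0,
    "accessible": 1.0,
    "interoperable": 0.5,
    "analyzable": 0.0,
}))  # 62.5
```

The Reusability rating is deliberately left out of this sum, since it is reported separately as a star rating.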
25% for F for Findable
Every Researcher gets at least 25% for the data they register in FAIRLYZ.
FAIRLYZ makes information about the data findable by using unique or persistent IDs, as per F1 and F2 (supported by FAIR-Checker).
Metadata clearly and explicitly include the identifier of the data they describe, as in F3, and (meta)data are registered or indexed in a searchable resource, as in F4.
Searching will be implemented using terms from shared ontologies, in line with the Interoperable principle.
25% for A for Accessible
The Researcher provides URLs for the location of their data, together with the appropriate data access requirements. This complies with A1.1 and A1.2 (supported by FAIR-Checker).
It also complies with A2 (metadata remain accessible even when the data are no longer available), but only after a study is published in a journal and the URL has been provided and verified.
25% for I for Interoperable
The FAIRLYZ-QC process complies with the Interoperable FAIR principle by checking that the data use a machine-readable format, as in I1 (supported by FAIR-Checker), and are annotated with terms from shared ontologies, as in I2 (supported by FAIR-Checker) and I3.
FAIRLYZ extends interoperability from being machine-readable to being integration-ready. It evaluates whether the data can be integrated with other data by checking for clean formatting, minimal missing values, and rich annotations:
- Genomic Data: Genomic Data QC verifies the presence of the essential omics data files: subject, phenotype, sample-subject, and genomics files.
- General Data and Demographic Data: The QC process ensures general data quality by verifying the presence of a descriptive data dictionary and by mapping terms to standard ontologies.
The FAIRLYZ QC Interoperability score is a weighted average percentage: each data type’s score is the number of passed checks divided by the total number of checks, and the weighted average is scaled to the 25% allotted to this criterion.
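The calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the weights per data type are not specified here, so the sketch defaults to equal weights, and the function name `interoperability_score`, the data type labels, and the check results are all hypothetical.

```python
from typing import Dict, List, Optional

INTEROPERABILITY_WEIGHT = 25.0  # percentage points allotted to this criterion


def interoperability_score(
    checks_by_type: Dict[str, List[bool]],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Return the Interoperability contribution in percentage points (0-25).

    checks_by_type maps a data type to the pass/fail results of its QC checks.
    weights maps a data type to its relative weight (assumed equal if omitted).
    """
    if weights is None:
        weights = {data_type: 1.0 for data_type in checks_by_type}

    weighted_sum = 0.0
    weight_total = 0.0
    for data_type, checks in checks_by_type.items():
        if not checks:
            continue  # a data type with no checks does not affect the average
        pass_rate = sum(checks) / len(checks)  # passed checks / total checks
        weighted_sum += weights[data_type] * pass_rate
        weight_total += weights[data_type]

    if weight_total == 0:
        return 0.0
    return INTEROPERABILITY_WEIGHT * weighted_sum / weight_total


# Example: 3 of 4 genomic checks and 1 of 2 general checks pass.
score = interoperability_score({
    "genomic": [True, True, True, False],
    "general": [True, False],
})
print(score)  # 15.625
```

With equal weights, the pass rates of 0.75 and 0.5 average to 0.625, which corresponds to 15.625 of the 25 available percentage points.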
5-Star Rating for R for Reusable
FAIRLYZ evaluates reusability differently from the approach set out in the R1 principles and supported by FAIR-Checker. While the FAIRLYZ Data QC score can reach 100% once the data contributor has completed data annotation and data quality control, it excludes reusability considerations.
Instead, in FAIRLYZ, users of the data provide a reusability rating through a 5-star rating system.
At least 3 users who have reused the data must provide a rating before the star rating is displayed.
Users are asked to rate the data on 4 FAIRLYZ criteria, with the 4th criterion combining the reusability and analyzability criteria. FAIRLYZ stores the individual ratings, which the researcher who owns the data can inspect, and combines them into the average rating shown on the homepage.
The user is presented with these 4 rating criteria:
- Rate how easy it was for you to find the data
- Rate how easy it was for you to access the data
- Rate the data based on interoperability: how well-annotated and complete it is
- Rate the data on reusability: how easy was it to reanalyze or analyze for a new study
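A minimal sketch of how such ratings might be aggregated is shown below. It assumes whole-star ratings from 1 to 5, equal weight for the 4 criteria, a plain mean across users, and rounding to one decimal place; none of these details are confirmed here, and the names `displayed_star_rating`, `MIN_RATERS`, and the criterion keys are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional

MIN_RATERS = 3  # the star rating is hidden until at least 3 users have rated
CRITERIA = ("findable", "accessible", "interoperable", "reusable")


def displayed_star_rating(ratings: List[Dict[str, int]]) -> Optional[float]:
    """Return the average star rating, or None if too few users have rated.

    Each entry in ratings is one user's stars (1-5) for the 4 criteria.
    """
    if len(ratings) < MIN_RATERS:
        return None

    per_user = []
    for rating in ratings:
        stars = [rating[criterion] for criterion in CRITERIA]
        if any(not 1 <= value <= 5 for value in stars):
            raise ValueError("each rating must be between 1 and 5 stars")
        per_user.append(float(mean(stars)))  # assumed: criteria weigh equally

    return round(float(mean(per_user)), 1)


ratings = [
    {"findable": 5, "accessible": 4, "interoperable": 4, "reusable": 3},
    {"findable": 4, "accessible": 3, "interoperable": 3, "reusable": 2},
    {"findable": 5, "accessible": 5, "interoperable": 5, "reusable": 5},
]
print(displayed_star_rating(ratings[:2]))  # None, fewer than 3 raters
print(displayed_star_rating(ratings))      # 4.0
```

Keeping the individual ratings in a list, rather than only a running average, matches the description that the data owner can inspect each rating separately.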
25% for LYZ in AnaLYZable
This criterion is unique to FAIRLYZ and is not covered by the FAIR principles.
Researchers provide the URLs of the software used on their data, as well as the settings and configurations used for that software. Researchers also provide QC reports from omics pipelines, such as MultiQC reports. With this information, other researchers should be able to reuse the data and redo the analysis.