FAIRification and semantic modelling for Duchenne and Becker Muscular Dystrophy rare diseases

Pablo Perdomo-Quinteiro, Sergiu Siminiuc, Paraskevi Sakellariou, Marco Roos, Pietro Spitali, Núria Queralt-Rosinach 

Lay Summary

The poster presented at the SWAT4HCLS International Conference 2023 [1] describes ongoing FAIRification efforts within the BIND project to improve the characterization of brain involvement in Duchenne and Becker Muscular Dystrophies (DMD and BMD). The BIND project brings together 19 organizations across Europe and Japan to collect phenotypic and molecular data related to DMD and BMD, including transcriptomic, proteomic, clinical, and behavioral data as well as brain MRI images. The primary aim is to improve the accessibility and (re-)usability of these data by making them Findable, Accessible, Interoperable, and Reusable (FAIR) [2], so that they can be effectively exchanged, integrated, and analyzed both within the consortium and together with external FAIR data sources.

The FAIRification process involves describing and annotating the data with standard metadata models and ontologies recognized by the DMD and BMD research community. It relies on Semantic Web technologies such as the Resource Description Framework (RDF), the Web Ontology Language (OWL), and Shape Expressions (ShEx), together with community standards such as the semantic models provided by the European Joint Programme on Rare Diseases (EJP RD). This allows the BIND data to be made available in machine-readable form and stored in a persistent, accessible repository.
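To give a sense of what "machine-readable" means here, the following is a minimal sketch in Python and not the BIND implementation. It writes one patient record as RDF triples in N-Triples syntax and runs a toy check in the spirit of a ShEx shape. The `https://example.org/bind/` namespace, the `Patient`, `Diagnosis`, and `hasDiagnosis` terms, and the identifier `p001` are all hypothetical placeholders. A real model would use terms from the EJP RD semantic models and a dedicated RDF library and ShEx validator.

```python
# Hypothetical namespace and terms, for illustration only.
EX = "https://example.org/bind/"
# Standard W3C IRIs for rdf:type and rdfs:label.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def describe_patient(patient_id, diagnosis_label):
    """Build the triples for one patient and their diagnosis."""
    patient = iri(EX + "patient/" + patient_id)
    diagnosis = iri(EX + "diagnosis/" + patient_id)
    return [
        (patient, iri(RDF_TYPE), iri(EX + "Patient")),
        (patient, iri(EX + "hasDiagnosis"), diagnosis),
        (diagnosis, iri(RDF_TYPE), iri(EX + "Diagnosis")),
        (diagnosis, iri(RDFS_LABEL), literal(diagnosis_label)),
    ]


def conforms_to_patient_shape(triples):
    """Toy shape check: every Patient has exactly one hasDiagnosis link."""
    patients = {
        s for s, p, o in triples
        if p == iri(RDF_TYPE) and o == iri(EX + "Patient")
    }
    return all(
        sum(1 for s, p, _ in triples
            if s == patient and p == iri(EX + "hasDiagnosis")) == 1
        for patient in patients
    )


triples = describe_patient("p001", "Duchenne muscular dystrophy")
print(to_ntriples(triples))
print(conforms_to_patient_shape(triples))
```

Because every statement is an explicit subject, predicate, and object built from shared identifiers, records described this way by different partners can be merged and queried together, and a shape check can flag records that are missing required information.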

To facilitate the FAIRification process, Bring Your Own Data (BYOD) workshops [3] have been organized, where data owners collaborate with FAIR experts to assess the FAIR status of the data and understand the FAIRification methodology. Challenges encountered during this process include ensuring mutual understanding between data owners and FAIR experts, as well as reaching a consensus on data representation. The semantic models developed for the project adhere to the semantic core data model provided by the EJP RD for rare disease patient registries [4]. 

The ongoing work involves making multimodal data FAIR and developing semantic models to represent various types of data, including patient phenotypic and behavioral information and single-cell RNA-seq data obtained from preclinical mouse models. This initiative aims to improve data accessibility, interoperability, and reusability for the rare disease community, ultimately advancing research on DMD and BMD and facilitating the development of new therapeutic strategies.

[1] https://www.swat4ls.org/workshops/basel2023/ 

[2] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18 

[3] Bernabé, C.H. et al. Building expertise on FAIR through evolving Bring Your Own Data (BYOD) workshops: describing the data, software, and management-focused approaches and their evolution. Data Intelligence (2023). https://doi.org/10.1162/dint_a_00236

[4] Kaliyaperumal, R., Wilkinson, M.D., Moreno, P.A. et al. Semantic modelling of common data elements for rare disease registries, and a prototype workflow for their deployment over registry data. J Biomed Semant 13, 9 (2022). https://doi.org/10.1186/s13326-022-00264-6