Development of reproducible workflows to optimize data-intensive bioinformatics

RALLO, VINCENZO
2022-03-16

Abstract

This study aims to test and validate analysis strategies tailored to the features of the phenotypes under investigation. A further objective is to optimize datasets, tools, and bioinformatic and statistical approaches so that genetic data can be analyzed as effectively as possible to identify causal and/or predisposing variants of Mendelian diseases and complex traits. The research was based on implementing standardized and experimental procedures for the structured genomic analysis of rare and complex diseases and related quantitative traits, using a cohort of individuals from the Sardinian population who have been extensively characterized at the genetic and phenotypic level (the SardiNIA cohort). The genetic homogeneity of the Sardinian population facilitated the analysis and made it possible to highlight founder effects and/or distinctive genomic traits that cause rare and complex diseases. The accuracy of the clinical data in the SardiNIA cohort, together with the availability of genome-wide genetic data for each individual, made it possible to screen for several rare and complex phenotypes. With regard to rare diseases, screening procedures for phenotypes involved in age-related diseases identified several patients affected by different rare diseases on the basis of specific clinical features. Using the whole-genome approach, we were able to describe the first causal molecular variant of Usher syndrome, its incidence and distribution in Sardinia, and its specific molecular features. Collaboration with research institutes and hospitals helped us reach a molecular diagnosis for our patients. Our results indicate that this approach is an effective and generalizable method for finding causal variants in rare and/or Mendelian diseases and for setting up large-scale screening programs and/or pre-symptomatic diagnosis in patients and risk groups.
To investigate in depth the molecular mechanisms of complex traits and their regulatory pathways, we systematically integrated association data from public genome-wide association studies (GWAS) with whole-genome sequence data from the SardiNIA cohort using colocalization analysis. Standardizing colocalization methodologies so that shared association profiles between traits can be examined efficiently remains a challenge. The experimental approach compared three software programs, "coloc", "gwas-pw", and "eCAVIAR", to evaluate the impact of each on the results. Adopting an agnostic approach without prior biological assumptions, we assessed the biological affinity between quantitative traits and diseases to identify novel genetic variants potentially useful in the study of several diseases. Our investigation replicated a high percentage of findings from the literature, clarified uncertain association signals, and identified novel coincident associations between quantitative traits and diseases. Concordance among the three programs exceeded 90%, with minimal discrepancies. This high concordance supports the reliability of the algorithms used and indicates that the choice of software has little effect on the results. This study underlines the value of colocalization analysis as a valid methodology for studying phenotype-genotype relationships and for identifying new susceptibility loci in complex diseases. Our research adopted and validated several approaches tailored to specific phenotypes, starting from a robust genetic database. Biological characterization and functional annotation will make genome analyses more reliable and increase our understanding of the biological mechanisms of disease susceptibility.
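The concordance figure reported above can be illustrated with a minimal sketch. It is not the thesis pipeline: it assumes that each program's per-locus evidence has already been reduced to a yes/no colocalization call, and the function name, locus labels, and calls below are hypothetical.

```python
from itertools import combinations


def concordance(calls):
    """Compute agreement between tools on per-locus colocalization calls.

    calls: dict mapping tool name -> dict mapping locus id -> bool
    (True = tool reports colocalization; the thresholding that produces
    these booleans is an assumption made outside this sketch).
    Returns (overall, pairwise): the fraction of shared loci on which all
    tools agree, and the same fraction for each pair of tools.
    """
    tools = sorted(calls)
    # Restrict the comparison to loci evaluated by every tool.
    shared = set.intersection(*(set(calls[t]) for t in tools))
    if not shared:
        raise ValueError("no loci shared by all tools")
    n = len(shared)
    # A locus is concordant when every tool gives the same call.
    overall = sum(len({calls[t][locus] for t in tools}) == 1 for locus in shared) / n
    pairwise = {
        (a, b): sum(calls[a][locus] == calls[b][locus] for locus in shared) / n
        for a, b in combinations(tools, 2)
    }
    return overall, pairwise


# Hypothetical calls for four made-up loci (illustration only, not thesis data).
example_calls = {
    "coloc":   {"locus1": True, "locus2": False, "locus3": True, "locus4": True},
    "gwas-pw": {"locus1": True, "locus2": False, "locus3": True, "locus4": True},
    "eCAVIAR": {"locus1": True, "locus2": False, "locus3": True, "locus4": False},
}

overall, pairwise = concordance(example_calls)
print(f"all-tool concordance: {overall:.2f}")
for pair, value in pairwise.items():
    print(pair, f"{value:.2f}")
```

Restricting the comparison to loci evaluated by all tools keeps the denominators identical across pairs; a different convention, such as counting missing calls as disagreements, would give different figures.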
Our research adopted and validated several bioinformatic approaches and pipelines tailored to specific phenotypes and genomic features, starting from samples of the SardiNIA cohort. The objective was the most effective analysis of genetic data to identify causal and/or predisposing variants of Mendelian diseases and complex traits. During screenings for phenotypes involved in age-related diseases, several patients affected by rare diseases were identified. Using a whole-genome approach, we were able to find the first causal variant of Usher syndrome, its incidence and distribution in Sardinia, and its specific molecular features. This approach is an effective and generalizable method for finding causal variants in rare and/or Mendelian diseases and for setting up large-scale screening programs and/or pre-symptomatic diagnosis in patients and risk groups. To investigate in depth the molecular mechanisms of complex traits and their regulatory pathways, we systematically integrated association data from public genome-wide association studies (GWAS) with data from the SardiNIA cohort using colocalization analysis. Our investigation compared three software programs to evaluate the impact of each on the results. Adopting an agnostic approach without prior biological assumptions, we assessed the biological affinity between quantitative traits and diseases to identify novel genetic variants potentially useful in the study of several diseases. We replicated a high percentage of findings from the literature, clarified uncertain association signals, and identified novel coincident associations between quantitative traits and diseases. We underline the contribution of colocalization analysis to the study of phenotype-genotype relationships and to the identification of new susceptibility loci in complex diseases. Biological characterization and functional annotation will make genome analyses more reliable and increase our understanding of the biological mechanisms of disease susceptibility.
genomics; bioinformatics; rare disease; complex traits; colocalization
Development of reproducible workflows to optimize data-intensive bioinformatics / Rallo, Vincenzo. - (2022 Mar 16).
Files in this item:
Tesi Vincenzo Rallo.pdf (Adobe PDF, 7.87 MB); open access from 08/09/2023
Description: Development of reproducible workflows to optimize data-intensive bioinformatics
Type: Doctoral thesis

Use this identifier to cite or link to this document: https://hdl.handle.net/11388/279535