Clinical_report_based_on_VCF_data_and_ACMG_guidelines

This project generates a PDF report that combines genetic variant data from VCF (Variant Call Format) files with clinical interpretations from InterVar, a software tool that classifies variant pathogenicity according to the ACMG (American College of Medical Genetics and Genomics) guidelines. The workflow has three main steps: cloning the InterVar repository from GitHub, processing the variant data through InterVar with ANNOVAR annotations, and generating a PDF report that contains both the variant information and patient-specific details.

Process Overview

Cloning InterVar and Preparing Environment: The process starts with cloning the InterVar tool from its GitHub repository, setting up the environment to include necessary libraries like PyVCF for parsing VCF files, and configuring InterVar to access ANNOVAR for genomic annotations.
Data Processing: Variant data from a VCF file are filtered based on quality and specific criteria. The filtered variants are then annotated using ANNOVAR through InterVar to classify each variant according to ACMG standards.
PDF Report Generation: The report is generated using the reportlab library in Python, including key patient details (sample ID, gender, etc.) derived from the VCF file and the classification results from InterVar. For long sequences, a wrapping function ensures the report's readability by adjusting allele information to fit the PDF format.
Report Customization: Additional patient information such as date of birth, ethnicity, and family history is manually added to personalize the report, providing a comprehensive overview of the genetic analysis.
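
The filtering and allele-wrapping steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the function names `filter_vcf_lines` and `wrap_allele`, the QUAL threshold of 30, the PASS-only rule, and the 20-character width are all assumptions chosen for the example, and the standard library stands in for PyVCF so the sketch stays self-contained.

```python
def filter_vcf_lines(lines, min_qual=30.0):
    """Yield (chrom, pos, ref, alt, qual) for records that pass the filter.

    Assumed criteria (not taken from the project): QUAL >= min_qual and
    a FILTER column of "PASS" or ".".
    """
    for line in lines:
        # Skip meta-information lines, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # malformed record: fewer than the fixed VCF columns
        chrom, pos, _id, ref, alt, qual, filt = fields[:7]
        if qual == "." or float(qual) < min_qual:
            continue
        if filt not in ("PASS", "."):
            continue
        yield chrom, int(pos), ref, alt, float(qual)


def wrap_allele(allele, width=20):
    """Break a long allele string into fixed-width chunks joined by newlines."""
    if width <= 0:
        raise ValueError("width must be positive")
    return "\n".join(allele[i:i + width] for i in range(0, len(allele), width))
```

Each wrapped allele can then be placed in a table cell of the report, so that a long insertion or deletion does not run past the page margin.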

Libraries and Tools Used

InterVar for interpreting variant pathogenicity
ANNOVAR for genomic annotations
PyVCF and pysam for parsing VCF files
ReportLab for generating PDF reports
Pandas for data manipulation and merging variant information
Python standard libraries (re, sys) for regular expressions and system operations
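
As one example of where `re` fits in, the classification label can be pulled out of an InterVar result string with a regular expression. The sketch below assumes the string has the form `InterVar: <class> PVS1=...`; that layout, the helper name `parse_classification`, and the sample input are assumptions for illustration and should be checked against the output of the InterVar version in use.

```python
import re

# Assumed layout: "InterVar: <classification> PVS1=<n> PS=[...] ..."
_CLASSIFICATION_RE = re.compile(r"InterVar:\s*(.*?)\s+PVS1=")


def parse_classification(field):
    """Return the classification label, or None if the pattern is absent."""
    match = _CLASSIFICATION_RE.search(field)
    return match.group(1) if match else None


# Hypothetical input, written to match the assumed layout.
example = "InterVar: Likely benign PVS1=0 PS=[0, 0, 0, 0, 0]"
label = parse_classification(example)
```

Returning `None` for an unexpected format lets the caller decide whether to skip the variant or flag it for manual review.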

Future Directions

Optimizing Large PDF Generation: For reports encompassing extensive variant data, the PDF generation process can become time-consuming and memory-intensive. Future improvements could involve optimizing data handling and exploring more efficient ways to render large PDFs, perhaps by paginating data or selectively including only variants of high clinical relevance.
Automating Patient Information Retrieval: Currently, some patient details are manually specified. Automating this process by extracting more information directly from VCF files or integrating with electronic health records could streamline the report generation process.
Interactive Reports: Transitioning from static PDFs to interactive web-based reports could enhance the user experience, allowing clinicians to explore the data more dynamically, filter on-demand, and access additional resources or databases for further information.
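
The first idea above, restricting the report to clinically relevant variants and paginating the rest, could look roughly like the following. This is a sketch under assumptions: variants are represented as dictionaries with a `classification` key, the reportable set is limited to the "Pathogenic" and "Likely pathogenic" ACMG tiers, and the page size of 25 rows is arbitrary. None of these names or values come from the project itself.

```python
from itertools import islice

# Assumed choice: report only the two most severe ACMG tiers.
REPORTABLE = {"pathogenic", "likely pathogenic"}


def select_reportable(variants, keep=REPORTABLE):
    """Keep variants whose classification is in the reportable set."""
    return [
        v for v in variants
        if v["classification"].strip().lower() in keep
    ]


def paginate(rows, page_size=25):
    """Yield successive lists of at most page_size rows."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    iterator = iter(rows)
    while True:
        page = list(islice(iterator, page_size))
        if not page:
            return
        yield page
```

Because `paginate` is a generator, each page can be rendered and released before the next one is built, which keeps memory use roughly proportional to the page size rather than to the whole variant list.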

This project demonstrates a practical application of bioinformatics tools and libraries to bridge genomic data analysis with clinical utility, ultimately facilitating informed genetic counseling and personalized healthcare.

