This repository contains the files and a Jupyter notebook required to filter ClinVar entries by gene attributes. The output is a dataframe and plots that show the number of missense variants of uncertain significance (VUS) in genes that are essential to HAP1 cells, T cells, and B cells, and in genes that encode secreted, cytoplasmic, or nuclear proteins. The output also reports where the encoded proteins localize, whether pathogenic variants are associated with mislocalization, whether the genes are toxic when exogenously expressed, and whether the proteins are known to exist in complex with other proteins.

To download the ClinVar file, go to the FTP site (https://ftp.ncbi.nlm.nih.gov/pub/clinvar/) and navigate to tab_delimited/archive/variant_summary_date.txt.gz, where "date" stands for the date of the archived release.
All other required files are available for download here.
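
As a rough illustration of the filtering step, the sketch below loads a ClinVar variant summary file with pandas and counts missense VUS per gene set. It is not the notebook's actual code. The column names (`Name`, `GeneSymbol`, `ClinicalSignificance`, `Assembly`) are assumed to match the ClinVar tab-delimited header, the regular expression on `Name` is a heuristic for spotting missense changes, and the function names and gene sets are hypothetical.

```python
import pandas as pd

# Heuristic (an assumption, not ClinVar's own annotation): a missense change
# appears in the Name column as a protein change such as "(p.Arg175His)".
MISSENSE_PATTERN = r"\(p\.([A-Z][a-z]{2})\d+([A-Z][a-z]{2})\)"


def load_clinvar(path: str) -> pd.DataFrame:
    """Read a gzipped, tab-delimited ClinVar variant summary file.

    The file lists each variant once per genome assembly, so only GRCh38
    rows are kept here to avoid counting the same variant twice.
    """
    variants = pd.read_csv(path, sep="\t", compression="gzip", low_memory=False)
    return variants[variants["Assembly"] == "GRCh38"]


def count_missense_vus(variants: pd.DataFrame, gene_sets: dict) -> pd.Series:
    """Count missense VUS falling in each named set of gene symbols."""
    is_vus = variants["ClinicalSignificance"].eq("Uncertain significance")

    # Extract the reference and alternate amino acids from the Name column.
    amino_acids = variants["Name"].str.extract(MISSENSE_PATTERN)
    is_missense = (
        amino_acids[0].notna()
        & amino_acids[1].ne("Ter")            # exclude nonsense changes
        & amino_acids[0].ne(amino_acids[1])   # exclude unchanged residues
    )

    vus = variants[is_vus & is_missense]
    return pd.Series(
        {
            label: int(vus["GeneSymbol"].isin(genes).sum())
            for label, genes in gene_sets.items()
        }
    )
```

With a downloaded file, usage would look like `count_missense_vus(load_clinvar("variant_summary_date.txt.gz"), {"nuclear": nuclear_genes})`, where `nuclear_genes` is a hypothetical set of gene symbols taken from one of the attribute files.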