FatineHic/Molecular-Modeling

🧬 GREB1L Protein Structure Analysis

Comparative structural analysis of the GREB1-like protein (Q9C091) in Homo sapiens using AlphaFold2, Phyre2, and ESMFold.

🔬 Overview

This project focuses on the comparative structural analysis of the GREB1-like protein (Q9C091) in Homo sapiens using three state-of-the-art protein structure prediction methods:

| Method | Approach |
|---|---|
| AlphaFold2 | Deep learning-based full structure prediction |
| Phyre2 | Homology modeling using known templates |
| ESMFold | Protein language model-based prediction |

The goal is to evaluate, compare, and validate predicted 3D protein structures using multiple bioinformatics and structural biology tools.


🧠 Scientific Context

Understanding the biological function of RNA-related proteins and transcription-associated complexes often depends on accurate structural models.

The GREB1-like protein:

  • Contains 1923 amino acids
  • Has no complete experimental structure available in the PDB
  • Therefore requires computational modeling approaches

👉 This project aims to compare different prediction models, evaluate their reliability, and analyze structural consistency.


⚙️ Methods & Workflow

Protein Sequence (GREB1L — Q9C091)
              │
              ▼
┌──────────────────────────┐
│  1. Sequence Analysis    │──→ BLASTp homolog identification
└────────────┬─────────────┘
             │
     ┌───────┼───────┐
     ▼       ▼       ▼
┌────────┐┌────────┐┌────────┐
│Alpha-  ││Phyre2  ││ESM-    │
│Fold2   ││        ││Fold    │
└───┬────┘└───┬────┘└───┬────┘
    │         │         │
    └─────────┼─────────┘
              ▼
┌──────────────────────────┐
│  3. Structural Comparison│──→ RMSD calculation (PyMOL)
└────────────┬─────────────┘
             │
             ▼
┌──────────────────────────┐
│  4. Quality Assessment   │──→ ProQ2, Ramachandran, SAVES
└──────────────────────────┘
             │
             ▼
      ✅ Validated Models

1. Structure Prediction Models

🔹 AlphaFold2

  • Deep learning-based model
  • Predicts full protein structure
  • Provides confidence scores (pLDDT, PAE)

🔹 Phyre2

  • Homology modeling approach
  • Uses known protein templates
  • Confidence based on template similarity

🔹 ESMFold

  • Language model-based prediction
  • Faster but may predict partial structures

2. Sequence Analysis

BLASTp was used to identify homologous sequences. High similarity was found with GREB1-like isoforms (≈100% identity), which confirms that the query sequence is correctly identified as GREB1L.
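As a rough illustration of what the identity figure means, the sketch below computes percent identity for two already-aligned sequences, using the BLAST-style convention of identical positions divided by alignment length. The function name `percent_identity` is hypothetical, and the values reported in this project came from BLASTp, not from this code.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two aligned sequences ('-' marks a gap).

    Identical, non-gap columns are divided by the full alignment length,
    which is the convention BLAST uses when it reports "Identities".
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)
```

A single gap or substitution in a six-column alignment would lower the identity to about 83%, so a value near 100% over the full 1923 residues indicates essentially the same sequence.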


3. Structural Comparison

Structures were compared using RMSD (Root Mean Square Deviation), which measures the average distance between equivalent atoms after superposition. Because the methods differ in coverage, each RMSD was computed on a partial alignment restricted to the region shared by both models.
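The RMSD calculation can be sketched outside PyMOL as follows. This is a minimal NumPy illustration of the Kabsch superposition restricted to residues present in both models, not the procedure used to produce the values in this project: PyMOL's `align` also performs a sequence alignment and rejects outlier atoms during refinement, so its numbers may differ. The function names and the assumption that both models share the same residue numbering are hypothetical.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix (Kabsch).
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = Pc @ R.T - Qc
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


def rmsd_on_shared_residues(model_a, model_b):
    """RMSD over C-alpha atoms of residues present in both models.

    Each model is a dict mapping residue number -> (x, y, z) of the
    C-alpha atom. Assumes both models use the same residue numbering.
    Returns (rmsd, number_of_shared_residues).
    """
    shared = sorted(set(model_a) & set(model_b))
    if len(shared) < 3:
        raise ValueError("need at least 3 shared residues")
    P = np.array([model_a[i] for i in shared])
    Q = np.array([model_b[i] for i in shared])
    return kabsch_rmsd(P, Q), len(shared)
```

Reporting the number of shared residues alongside each RMSD makes it easier to judge comparisons between models with very different coverage.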


4. Quality Assessment

Multiple tools were used for validation:

| Tool | Purpose |
|---|---|
| ProQ2 | Predicts global and local model quality |
| Ramachandran Plot | Evaluates backbone conformation (α-helices, β-sheets, outliers) |
| SAVES | Detects steric clashes, B-factor anomalies, side-chain issues |
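A Ramachandran plot is built from the backbone dihedral angles φ and ψ of each residue. The sketch below shows how those angles could be computed from backbone atom coordinates using only the standard library. The data layout (a list of dicts with `"N"`, `"CA"` and `"C"` keys) and the function names are assumptions made for illustration, and the code assumes the residues are consecutive with no chain breaks.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(residues):
    """Return a list of (phi, psi) per residue; termini get None.

    phi_i = dihedral(C[i-1], N[i], CA[i], C[i])
    psi_i = dihedral(N[i], CA[i], C[i], N[i+1])
    """
    angles = []
    for i, res in enumerate(residues):
        phi = psi = None
        if i > 0:
            phi = dihedral(residues[i - 1]["C"], res["N"], res["CA"], res["C"])
        if i < len(residues) - 1:
            psi = dihedral(res["N"], res["CA"], res["C"], residues[i + 1]["N"])
        angles.append((phi, psi))
    return angles
```

Plotting the resulting (φ, ψ) pairs against the allowed regions is what tools such as SAVES do when they flag backbone outliers.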

📊 Results Summary

| Model | Domains Predicted | Coverage | Strength |
|---|---|---|---|
| AlphaFold2 | 4 domains | Full | Best global accuracy |
| Phyre2 | 1 domain (~228 aa) | Partial | Template-based, moderate reliability |
| ESMFold | 1 small domain (~50–60 aa) | Very low | Best local precision |
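To put the coverage labels in numbers, the modelled residue counts can be expressed as a percentage of the 1923-residue sequence. The helper below is a hypothetical illustration that uses the approximate domain sizes from the table.

```python
SEQUENCE_LENGTH = 1923  # GREB1L length stated in this README


def coverage_percent(modelled_residues: int, total: int = SEQUENCE_LENGTH) -> float:
    """Percentage of the full sequence covered by a model."""
    if not 0 <= modelled_residues <= total:
        raise ValueError("modelled_residues must be between 0 and total")
    return 100.0 * modelled_residues / total
```

With these figures, the ~228 residues from Phyre2 correspond to about 11.9% of the sequence, and the ~50–60 residues from ESMFold to roughly 2.6–3.1%.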

Key Insights

  • AlphaFold2 → best global model with high confidence regions (pLDDT 60–90)
  • ESMFold → best local precision but limited global coverage
  • Phyre2 → template-dependent, less reliable, several structural inconsistencies

📐 RMSD Analysis

| Comparison | RMSD (Å) | Interpretation |
|---|---|---|
| Phyre2 vs AlphaFold2 | ≈ 4.16 | Moderate deviation, less accurate structurally |
| ESMFold vs AlphaFold2 | ≈ 2.81 | Closer locally, higher structural agreement |

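👉 Lower RMSD = higher structural similarity. Over its aligned region, ESMFold agrees more closely with AlphaFold2 than Phyre2 does. Because the two comparisons cover different numbers of residues, the values are indicative rather than strictly comparable.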


🔍 Quality Control Findings

Common issues identified across models:

  • Steric clashes (van der Waals overlaps)
  • Abnormal backbone conformations
  • Side-chain optimization issues
  • B-factor inconsistencies

🔵 AlphaFold2 Advanced Metrics

pLDDT (per-residue confidence):

  • High values → reliable regions
  • Low values (<50) → uncertain regions
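AlphaFold2 writes the per-residue pLDDT into the B-factor column of its PDB output, so the scores can be read with a few lines of standard-library Python. The sketch below is illustrative only: the function names are hypothetical, it assumes a single-chain model, and the four bands follow the thresholds commonly used for AlphaFold models (90, 70 and 50).

```python
from collections import Counter

BANDS = ("very high", "confident", "low", "very low")


def ca_plddt(pdb_text: str) -> dict:
    """Map residue number -> pLDDT, read from C-alpha ATOM records.

    Uses the fixed PDB columns: atom name 13-16, residue number 23-26,
    B-factor 61-66 (where AlphaFold2 stores pLDDT).
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def band(score: float) -> str:
    """Assign a pLDDT value to a confidence band."""
    if score >= 90:
        return "very high"
    if score >= 70:
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


def band_fractions(scores: dict) -> dict:
    """Fraction of residues falling in each confidence band."""
    if not scores:
        raise ValueError("no C-alpha atoms found")
    counts = Counter(band(s) for s in scores.values())
    return {name: counts[name] / len(scores) for name in BANDS}
```

Residues in the "very low" band (pLDDT < 50) are the uncertain regions referred to above and are usually excluded before structural comparison.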

PAE (Predicted Alignment Error):

  • Shows domain positioning reliability
  • Confirms stable domain regions

📂 Project Structure

greb1l-structure-analysis/
│
├── report/                 # PDF report and figures
│   └── final_report.pdf
│
├── figures/                # Structural visualizations
│   ├── alphafold2/         # AlphaFold2 PyMOL renders
│   ├── phyre2/             # Phyre2 model visualizations
│   ├── esmfold/            # ESMFold predictions
│   ├── ramachandran/       # Ramachandran plots
│   └── rmsd_comparison/    # Structural overlay figures
│
└── README.md               # Project documentation

🚀 How to Reproduce

1. Retrieve protein sequence:

  • Download GREB1L (Q9C091) from UniProt

2. Run prediction tools:

  • AlphaFold2
  • Phyre2
  • ESMFold

3. Perform analysis:

  • BLAST analysis for homolog identification
  • Structural alignment using PyMOL

4. Evaluate:

  • RMSD calculation between models
  • Ramachandran plot analysis
  • ProQ2 quality scores
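Step 1 above could be scripted as follows. This is a minimal sketch using only the standard library and the public UniProt REST endpoint for FASTA downloads; the function names are hypothetical, and the network call is kept in its own function so that the parser can be used offline.

```python
from urllib.request import urlopen

UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def fasta_url(accession: str) -> str:
    """Build the UniProt FASTA download URL for an accession."""
    return UNIPROT_FASTA_URL.format(accession=accession)


def parse_fasta(text: str):
    """Parse a single-record FASTA string into (header, sequence)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    return lines[0][1:], "".join(lines[1:])


def fetch_sequence(accession: str = "Q9C091", timeout: float = 30.0):
    """Download and parse the FASTA record (requires network access)."""
    with urlopen(fasta_url(accession), timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))


# Example (not run here because it needs network access):
#   header, sequence = fetch_sequence("Q9C091")
#   print(len(sequence))
```

Checking that the downloaded sequence has the 1923 residues stated above is a quick sanity test before submitting it to the prediction tools.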

🧪 Tools & Technologies

| Category | Tools |
|---|---|
| Structure Prediction | AlphaFold2, Phyre2, ESMFold |
| Sequence Analysis | BLAST+ |
| Visualization | PyMOL |
| Quality Assessment | ProQ2, SAVES |
| Data Analysis | Python |

👥 Contributors

  • Fatine Hichami
  • Tugce Koytaviloglu

🎯 Key Learnings

  • Structural prediction models vary significantly in coverage and accuracy
  • Combining multiple methods improves reliability of results
  • Local vs global accuracy must be distinguished when evaluating models
  • Quality control is essential in computational structural biology

📌 Conclusion

This project shows that, for GREB1L, no single prediction method is sufficient on its own.

A combined approach using:

  • AlphaFold2 for global structure
  • ESMFold for local precision

provides a more robust understanding of protein conformation.
