MitoEdit is a novel Python workflow designed for targeted base editing of the mitochondrial DNA (mtDNA). This tool streamlines the selection of optimal targeting windows for base editing, helping researchers efficiently target mtDNA mutations that are linked to various diseases. By enabling targeted base editing of the mitochondrial genome, MitoEdit will speed up the study of mtDNA-related diseases, help with preclinical drug testing, and enable therapeutic approaches to correct pathogenic mutations.
MitoEdit lets users input DNA sequences in text format, specify the target base position, and indicate the desired modification. The tool processes this information to identify candidate target windows, list the number and position of potential bystander edits, and find optimal flanking TALE sequences, where applicable. All the results are provided in a structured format including detailed logs to track progress.
The current MitoEdit workflow provides four pipelines, each based on the editing patterns observed in one of the following base editor systems:
A detailed description for each pipeline can be found in the README file in the pipelines folder.
Note: To use the evolved DddA6 variant from the Mok 2022 paper, use the Mok2020_G1397 pipeline.
The workflow uses the TALE-NT tool to identify optimal flanking TALE arrays around each candidate target window.
If you have all the necessary tools installed, you can use these commands to access MitoEdit:
python mitocraft.py <position> <reference_base> <mutant_base>
Note: The position is based on the human mitochondrial genome sequence (NC_012920.1).
python mitocraft.py --input_file <input_DNA_file> <position> <reference_base> <mutant_base>
- Python 3.x
- Required Python packages:
pandas
openpyxl
- Download and install Conda if not already installed. Follow the prompts to complete installation.
git clone https://github.com/Kundu-Lab/mitoedit.git
cd mitoedit
- Use the provided environment.yml file to create the Conda environment:
conda env create -f environment.yml
- Alternatively, you can create the conda environment manually:
conda create -n run_talen_env python=2.7.18 biopython=1.70
Note: The Conda environment must be named run_talen_env for MitoEdit to use the TALE-NT tool correctly. You do not need to activate it; the pipeline automatically uses run_talen_env if it is installed in Conda. This environment is used only by the TALE-NT tool.
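Because the pipeline depends on an environment with this exact name, it can help to confirm that Conda knows about it before a run. The following is a minimal sketch, not part of MitoEdit: the helper names `env_names` and `has_talen_env` are hypothetical, and the check relies only on the JSON printed by `conda env list --json`.

```python
import json
import subprocess

REQUIRED_ENV = "run_talen_env"  # environment name stated in this README


def env_names(conda_json):
    """Return the environment names found in `conda env list --json` output."""
    paths = json.loads(conda_json).get("envs", [])
    # Each entry is a filesystem path; its last component is the env name.
    return {p.replace("\\", "/").rstrip("/").split("/")[-1] for p in paths}


def has_talen_env():
    """Return True if Conda reports run_talen_env; False if Conda is missing or fails."""
    try:
        result = subprocess.run(
            ["conda", "env", "list", "--json"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return False
    return REQUIRED_ENV in env_names(result.stdout)
```

Calling `has_talen_env()` before launching a run gives an early, explicit failure instead of a missing TALE-NT result later on.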
MitoEdit requires the following parameters:
- The path to a file (.txt / .fasta) containing the DNA sequence.
- If not provided, MitoEdit will use the human mtDNA sequence from NCBI by default.
MitoEdit generates the following outputs:
MitoEdit organizes the output files in the following directories:
- fasta: Contains the FASTA file of the 60bp sequence adjacent to the target base.
- pipeline_windows: Stores target windows generated from each individual pipeline.
- all_windows: Contains a combined list of target windows from all pipelines.
- talen: Stores the output from the TALE-NT tool.
- matching_output: Includes TALE sequences for applicable target windows.
Note: Check the final_output directory to see the final results. The other directories are stored under the running directory.
- {pipeline}_{position}.xlsx: Lists the target windows generated from each pipeline.
- all_windows_{position}.xlsx: Contains all target windows combined from all pipelines.
- matching_tales_{position}.xlsx: Summary of optimal flanking TALE sequences for each applicable target window.
- adjacent_bases_{position}.fasta: Contains the sequence adjacent to the target base, extending 30bp on each side.
- TALENT_{position}.txt: Contains the output from the TALE-NT tool describing the optimal flanking TALE sequences possible.
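To make the adjacent-bases file concrete, here is a minimal sketch of extracting 30 bases on each side of a target position and formatting the result as FASTA. It is an illustration under stated assumptions, not MitoEdit's own code: the function names are hypothetical, positions are taken as 1-based, the target base is included in the output, and the sequence is treated as linear (clamped at the ends, with no wraparound for the circular genome).

```python
def adjacent_bases(sequence, position, flank=30):
    """Return the target base plus up to `flank` bases on each side.

    `position` is 1-based (assumption). The slice is clamped at the sequence
    ends rather than wrapped around, which is also an assumption.
    """
    seq = sequence.upper()
    if not 1 <= position <= len(seq):
        raise ValueError("position %d is outside the sequence" % position)
    idx = position - 1
    start = max(0, idx - flank)
    end = min(len(seq), idx + flank + 1)
    return seq[start:end]


def to_fasta(header, sequence, width=60):
    """Format one record as FASTA text, wrapping the sequence at `width` columns."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"
```

For example, `to_fasta("adjacent_bases_11696", adjacent_bases(mtdna, 11696))` would produce one record, where `mtdna` stands for a reference sequence string you have already loaded.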
- For a full list of parameters, use the --help flag from the command line.
python mitocraft.py --help
python mitocraft.py -h
- To run the tool, use the following commands:
python mitocraft.py <position> <reference_base> <mutant_base>
python mitocraft.py --input_file <DNA.txt> <position> <reference_base> <mutant_base>
python mitocraft.py 11696 G A
Expected Output:
When you run this command, MitoEdit generates an Excel file named final_11696.xlsx in the final_output directory. This file includes two spreadsheets: All_Windows and Bystander_Effect, with the first five rows from each shown below.
Note: [ ] marks the target base and { } marks each bystander edit.
1. All_Windows Sheet
Pipeline | Position | Reference Base | Mutant Base | Window Size | Window Sequence | Target Location | Number of bystanders | Position of Bystanders | Optimal Flanking TALEs | Flag CheckBystanderEffect |
---|---|---|---|---|---|---|---|---|---|---|
Mok2022_G1397_DddA11 | 11696 | G | A | 14bp | GCA[G]TCATT{C}TCAT | Position 4 from the 5' end | 1 | [11702] | FALSE | - |
Mok2022_G1397_DddA11 | 11696 | G | A | 14bp | CGCA[G]TCATT{C}TCA | Position 5 from the 5' end | 1 | [11702] | FALSE | - |
Mok2022_G1397_DddA11 | 11696 | G | A | 14bp | GCGCA[G]T{C}ATTCTC | Position 6 from the 5' end | 1 | [11698] | FALSE | - |
Mok2022_G1397_DddA11 | 11696 | G | A | 14bp | GGCGCA[G]T{C}ATTCT | Position 7 from the 5' end | 1 | [11698] | FALSE | - |
Note: If the Flag CheckBystanderEffect column is TRUE, manually check the results for potential amino acid changes caused by neighbouring bystanders in the same codon.
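The bracket notation in the Window Sequence column can be decoded back into genome coordinates. The sketch below is an illustration rather than MitoEdit code: `parse_window` is a hypothetical helper, and it assumes the window is written 5' to 3' with consecutive, increasing positions. Characters other than A, C, G, T and the brackets are ignored.

```python
import re

# One token per base: [X] is the target, {X} is a bystander, X is a plain base.
TOKEN = re.compile(r"\[([ACGT])\]|\{([ACGT])\}|([ACGT])")


def parse_window(annotated, target_position):
    """Decode e.g. 'GCA[G]TCATT{C}TCAT' into sequence, target offset and bystander positions."""
    plain = []
    target_offset = None
    bystander_offsets = []
    for match in TOKEN.finditer(annotated.upper()):
        target, bystander, base = match.groups()
        if target:
            if target_offset is not None:
                raise ValueError("more than one [target] base in window")
            target_offset = len(plain)
        elif bystander:
            bystander_offsets.append(len(plain))
        plain.append(target or bystander or base)
    if target_offset is None:
        raise ValueError("no [target] base in window")
    start = target_position - target_offset  # genome position of the first base
    return {
        "sequence": "".join(plain),
        "target_from_5prime": target_offset + 1,
        "bystanders": [start + off for off in bystander_offsets],
    }
```

Applied to the first row above, `parse_window("GCA[G]TCATT{C}TCAT", 11696)` gives a target at position 4 from the 5' end and a single bystander at 11702, which agrees with the Target Location and Position of Bystanders columns.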
2. Bystanders_Effects Sheet
Bystander Position | Reference Base | Mutant Base | Location On Genome | Predicted Mutation Impact | SNV Type | AA Variant | Functional Impact | MutationAssessor Score |
---|---|---|---|---|---|---|---|---|
11698 | C | T | Complex 1 | Predicted Benign | synonymous SNV | V313V | ||
11702 | C | T | Complex 1 | Predicted Pathogenic | nonsynonymous SNV | V315F | medium | 3.44 |
11704 | C | T | Complex 1 | Predicted Benign | synonymous SNV | V315L |
python mitocraft.py --input_file test.txt 33 G A
Expected Output: When you use an input file, the generated Excel file contains only one spreadsheet, similar to the following (example taken from the file provided in the test file folder):
1. All_Windows Sheet
Pipeline | Position | Reference_Base | Mutant Base | Window Size | Window Sequence | Target Location | Number of bystanders | Position of Bystanders | Optimal Flanking TALEs | Flag_CheckBystanderEffect | LeftTALE1 | RightTALE1 | LeftTALE2 | RightTALE2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Mok2020_G1397 | 33 | G | A | 14bp | TG{G}[G]A{G}AACT{C}TCT | Position 4 from the 5' end | 3 | [32, 35, 40] | FALSE | TRUE | ||||
Mok2020_G1397 | 33 | G | A | 14bp | CTG{G}[G]A{G}AACTCTC | Position 5 from the 5' end | 2 | [32, 35] | FALSE | TRUE | ||||
Mok2020_G1397 | 33 | G | A | 14bp | ACTG{G}[G]AGAACTCT | Position 6 from the 5' end | 1 | [32] | FALSE | TRUE | ||||
Mok2020_G1397 | 33 | G | A | 14bp | TACTG{G}[G]AGAACTC | Position 7 from the 5' end | 1 | [32] | TRUE | TRUE | T TACCCCCCACTATTAACC | TCTGTGCTAGTAACC A | T ACCCCCCACTATTAACC | TCTGTGCTAGTAACC A |
Note: When optimal flanking TALE sequences are found, they are added to the LeftTALE and RightTALE columns, respectively. The impact of bystander edits is not provided when using an input DNA file.
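The rows above show the same target base placed at different offsets inside a 14bp window. That sliding pattern can be sketched as follows. This is only an illustration: the function name is hypothetical, and the window size of 14 and offsets 4 to 7 are taken from the example rows, not from the pipelines' actual rules (see the README in the pipelines folder for those). The sequence is treated as linear, and windows that would run past either end are skipped.

```python
def enumerate_windows(sequence, position, window_size=14, target_offsets=(4, 5, 6, 7)):
    """List (offset, window) pairs that place the target at each 1-based offset from the 5' end.

    `position` is the 1-based index of the target within `sequence`.
    """
    seq = sequence.upper()
    idx = position - 1
    windows = []
    for offset in target_offsets:
        start = idx - (offset - 1)
        end = start + window_size
        if start < 0 or end > len(seq):
            continue  # window does not fit inside the sequence
        windows.append((offset, seq[start:end]))
    return windows


# 17-base fragment reconstructed from the four example windows above;
# the target G (position 33 in the input file) is base 7 of this fragment.
fragment = "TACTGGGAGAACTCTCT"
for offset, window in enumerate_windows(fragment, 7):
    print(offset, window)
```

The four printed windows match the Window Sequence column above once the bracket annotations are removed.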
- Input File Formatting: Ensure that your input file is correctly formatted, with the reference base matching the base at the specified position.
- Supported File Formats: MitoEdit can read .fasta and .txt formats for DNA sequences.
- Minimum Upload Sequence Length: The input file must contain at least 35 bases, covering the target base on either side, for accurate processing.
- Output File Name Conflicts: Check for existing output files with the same name before running MitoEdit to prevent overwriting and errors.
- Logging: MitoEdit logs its progress and any issues encountered during execution in logging_main.log.
- Species Support: While the tool is designed primarily for human mtDNA, other DNA sequences can also be uploaded and used.
- Modifying TALE-NT Workflow: If no matching flanking TALE sequences are identified, consider modifying the TALE-NT parameter by setting FILTER = 2. This will identify all TALE pairs targeting any base in the target window, not just those for the target base. For further information, refer to the TALE-NT FAQs.
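The input-file notes above can be checked before a run. The following is a minimal sketch with hypothetical helper names, not MitoEdit's own validation. It assumes 1-based positions and reads the 35-base minimum as a total sequence length, which is an interpretation of the note above.

```python
def read_sequence(text):
    """Return the bases from .txt or .fasta content, dropping '>' header lines and whitespace."""
    lines = [ln.strip() for ln in text.splitlines() if not ln.lstrip().startswith(">")]
    return "".join(lines).upper()


def check_input(sequence, position, reference_base, min_length=35):
    """Return a list of problems found; an empty list means the input looks usable."""
    if not 1 <= position <= len(sequence):
        return ["position %d is outside the sequence (length %d)" % (position, len(sequence))]
    problems = []
    found = sequence[position - 1]
    if found != reference_base.upper():
        problems.append(
            "reference base %s does not match %s at position %d"
            % (reference_base.upper(), found, position)
        )
    if len(sequence) < min_length:  # assumption: the minimum applies to total length
        problems.append("sequence has %d bases, fewer than %d" % (len(sequence), min_length))
    return problems
```

A typical use is `check_input(read_sequence(open("test.txt").read()), 33, "G")`, then running MitoEdit only if the returned list is empty.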
If you use MitoEdit in your research, please cite it as follows:
- {paper link!}
If you have any questions, please do not hesitate to contact me at:
- Devansh Shah: [email protected], [email protected]
This project is licensed under the MIT License. See the LICENSE
file for details. You are free to use and modify it, but please give credit to the original authors.
We welcome contributions! If you want to help improve MitoEdit, please fork the repository and submit a pull request.
Copyright (c) 2011-2015, Nick Booher [email protected] and Erin Doyle [email protected].
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE