1 Features

💎 Accurate ANI calculations

Vclust uses a Lempel-Ziv-based pairwise sequence aligner (LZ-ANI) for ANI calculation. LZ-ANI achieves high sensitivity in detecting matched and mismatched nucleotides, ensuring accurate ANI determination. Its efficiency comes from a simplified indel handling model, which makes LZ-ANI orders of magnitude faster than alignment-based tools (e.g., BLASTn, MegaBLAST) while maintaining accuracy comparable to the most sensitive BLASTn searches.

📐 Multiple similarity measures

Vclust offers multiple similarity measures between two genome sequences:

ANI: The number of identical nucleotides across local alignments divided by the total length of the alignments.
Global ANI (gANI): The number of identical nucleotides across local alignments divided by the length of the query/reference genome.
Total ANI (tANI): The number of identical nucleotides in the query-reference and reference-query alignments divided by the sum of the lengths of both genomes. tANI is equivalent to VIRIDIC's intergenomic similarity.
Coverage (alignment fraction): The proportion of the query/reference sequence aligned with the reference/query sequence.
Number of local alignments: The number of local alignments between the two genome sequences.
Ratio between genome lengths: The length of the shorter genome divided by the longer one.
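
The definitions above can be written out as a few lines of arithmetic. The following is a minimal sketch, not Vclust's or LZ-ANI's actual code: the `DirectionalAlignment` class, its field names, and the example numbers are hypothetical, and the aligned length is assumed to be measured on the query side of each direction.

```python
from dataclasses import dataclass


@dataclass
class DirectionalAlignment:
    """Summary of local alignments in one direction (hypothetical structure)."""
    identical: int  # identical nucleotides across all local alignments
    aligned: int    # total length of the local alignments (assumed query-side)
    seq_len: int    # length of the genome acting as the query in this direction


def ani(a: DirectionalAlignment) -> float:
    # Identical nucleotides divided by the total alignment length.
    return a.identical / a.aligned if a.aligned else 0.0


def gani(a: DirectionalAlignment) -> float:
    # Identical nucleotides divided by the full genome length.
    return a.identical / a.seq_len


def coverage(a: DirectionalAlignment) -> float:
    # Fraction of the genome covered by local alignments.
    return a.aligned / a.seq_len


def tani(qr: DirectionalAlignment, rq: DirectionalAlignment) -> float:
    # Identical nucleotides from both directions over the summed genome lengths.
    return (qr.identical + rq.identical) / (qr.seq_len + rq.seq_len)


def length_ratio(len_a: int, len_b: int) -> float:
    # Shorter genome length divided by the longer one.
    return min(len_a, len_b) / max(len_a, len_b)


# Made-up example: a 1000 nt query against an 800 nt reference.
query_to_ref = DirectionalAlignment(identical=450, aligned=500, seq_len=1000)
ref_to_query = DirectionalAlignment(identical=440, aligned=480, seq_len=800)

print(f"ANI      = {ani(query_to_ref):.3f}")
print(f"gANI     = {gani(query_to_ref):.3f}")
print(f"coverage = {coverage(query_to_ref):.3f}")
print(f"tANI     = {tani(query_to_ref, ref_to_query):.3f}")
print(f"ratio    = {length_ratio(1000, 800):.3f}")
```

ANI, gANI, and coverage are directional, so each has a query and a reference variant, while tANI and the length ratio are symmetric. In this sketch gANI equals ANI multiplied by coverage, which is why a high ANI with low coverage still gives a low gANI.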

🌟 Multiple clustering algorithms

Vclust provides six clustering algorithms tailored to various scenarios, including taxonomic classification and dereplication of viral genomes.

Single-linkage
Complete-linkage
UCLUST
CD-HIT (Greedy incremental)
Greedy set cover (adopted from MMseqs2)
Leiden algorithm [optional]
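
To illustrate the simplest of these algorithms, here is a minimal sketch of single-linkage clustering over pairwise similarity values. It is a generic union-find implementation written for illustration, not Clusty's code; the function name `single_linkage`, the input layout, and the example values are assumptions.

```python
def single_linkage(ids, pairs, threshold):
    """Group genomes so that any pair with similarity >= threshold shares a cluster.

    ids:       list of genome identifiers
    pairs:     iterable of (id_a, id_b, similarity) tuples
    threshold: minimum similarity (e.g. ANI) needed to link two genomes
    """
    parent = {g: g for g in ids}

    def find(x):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, similarity in pairs:
        if similarity >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for g in ids:
        clusters.setdefault(find(g), []).append(g)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


# Made-up similarities between four genomes.
genomes = ["A", "B", "C", "D"]
similarities = [("A", "B", 0.97), ("B", "C", 0.96), ("A", "C", 0.80), ("C", "D", 0.70)]

print(single_linkage(genomes, similarities, threshold=0.95))
# [['A', 'B', 'C'], ['D']]
```

A and C end up together through B even though their direct similarity is only 0.80. Complete-linkage would not allow that merge, because it requires every pair within a cluster to meet the threshold.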

🔥 Speed and efficiency

Vclust uses three efficient C++ tools - Kmer-db, LZ-ANI, Clusty - for prefiltering, aligning, calculating ANI, and clustering viral genomes. This combination enables the processing of millions of virus genomes within a few hours on a mid-range workstation.

🌎 Web service

For datasets containing up to 1000 viral genomes, Vclust is available at http://www.vclust.org.
