Multiple fasta sequences, but of different lengths #93

@SheepwormJM

Hi,

I think I may have an idea of the right answer to my question, but I'm hoping to check that I'm not about to do something stupid. I'm wondering whether I need to create a multiple sequence alignment file rather than using a multifasta. My ultimate aim is to produce a haplotype network.

I have a locus which I've amplified, but which has lots of small indels within an intronic part of the sequence.

I have used the following code to load a multifasta file into R, but I cannot convert it to a matrix because the sequences are of uneven lengths. I can't see any way to load a matrix with 'NA' for gaps, and I'm not sure whether pegas would subsequently re-align the sequences, or assume they were already aligned, if the NAs were padded onto the end of each sequence rather than placed internally.

library("apex")
library("adegenet")
library("pegas")
library("mmod")
library("poppr")

# To get a SINGLE fasta file in:
myseq <- read.FASTA("ASV_multifasta.fa")
myseq # Provides the summary information of the file
#> 2172 DNA sequences in binary format stored in a list.
#>
#> Mean sequence length: 329.453
#>    Shortest sequence: 294
#>     Longest sequence: 350
#>
#> Labels:
#> ASV3 BO_04_M
#> ASV3 BO_04_M
#> ASV3 BO_04_M
#> ASV3 BO_04_M
#> ASV3 BO_04_M
#> ASV3 BO_04_M
#> ...
#>
#> Base composition:
#>     a     c     g     t
#> 0.282 0.204 0.190 0.324
#> (Total: 715.57 kb)

# We need to make it as a matrix:
myseqmatrix <- as.matrix(myseq)

Then I get the error telling me it won't work because the sequences are different lengths.

Error in as.matrix.DNAbin(myseq) : 
  DNA sequences in list not of the same length.

If I made a multifasta file containing each sequence as it appears in a multiple sequence alignment (gaps included) instead of the raw sequence, would that work for pegas and a haplotype network? Or would it change the output? Would the conversion to a matrix even work?
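To make the question concrete, here is a minimal sketch of the aligned route I have in mind. The sequences are made-up toy data (not my ASVs), and the alignment is done by hand with "-" for the indel rather than by a real aligner; `s1` to `s4`, `raw_seqs`, and `aligned_seqs` are hypothetical names. It only shows that equal-length, gapped sequences convert to a matrix and can be passed to pegas. How pegas treats the gap positions when it computes distances is exactly the part I am unsure about.

```r
library(ape)    # DNAbin objects and as.matrix()
library(pegas)  # haplotype() and haploNet()

# Toy sequences of unequal length, standing in for the raw multifasta
raw_seqs <- as.DNAbin(list(
  s1 = c("a", "c", "g", "t", "a", "c"),
  s2 = c("a", "c", "g", "t", "a", "c"),
  s3 = c("a", "c", "c", "t", "t", "a", "a", "c"),
  s4 = c("a", "t", "g", "t", "a", "c")
))

# Unequal lengths: as.matrix() fails, as in the error above
unaligned_try <- try(as.matrix(raw_seqs), silent = TRUE)

# The same sequences after a hand-made alignment, with "-" marking the indel
aligned_seqs <- as.DNAbin(list(
  s1 = c("a", "c", "g", "t", "-", "-", "a", "c"),
  s2 = c("a", "c", "g", "t", "-", "-", "a", "c"),
  s3 = c("a", "c", "c", "t", "t", "a", "a", "c"),
  s4 = c("a", "t", "g", "t", "-", "-", "a", "c")
))

# Equal lengths: the conversion now succeeds
aligned <- as.matrix(aligned_seqs)
dim(aligned)

# Collapse identical sequences into haplotypes, then build the network
h <- haplotype(aligned)
net <- haploNet(h)
```

With real data the alignment would come from an external aligner and be read in as an aligned FASTA; the toy data here deliberately has a substitution between every pair of distinct sequences, so it says nothing about sequences that differ only by an indel.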

What do people do with uneven sequence lengths?

Many thanks!
