markschl edited this page Aug 14, 2018 · 5 revisions

Trims sequences to a given range.

Usage:
  st trim [options] [-a <attr>...] [-l <list>...] <range> [<input>...]
  st trim (-h | --help)
  st trim --help-vars

Options:
    <range>             Range in the form 'start..end', 'start..' or '..end'.
                        Variables may supply one range bound or the whole
                        range.
    -e, --exclude       Exclusive trim range: excludes start and end positions
                        from the output sequence.
    -0                  Interpret range as 0-based, with the end not included.

See this page for the options common to all commands.

Description

Trim ranges are 1-based. Negative numbers are interpreted relative to the sequence end (see the explanation of ranges with basic examples).
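
The range semantics can be illustrated without seqtool. The following is a minimal sketch in plain shell, not seqtool's implementation: the helper name trim_range is hypothetical, and clamping out-of-bounds values to the sequence ends is an assumption made for this example.

```shell
# Illustrative only: emulates 1-based, inclusive ranges with negative bounds.
# trim_range is a hypothetical helper, not part of seqtool.
# Usage: trim_range <sequence> [<start>] [<end>]
trim_range() (
  seq=$1
  start=${2:-1}     # missing start: begin at the first base
  end=${3:--1}      # missing end: stop at the last base
  len=${#seq}
  # Negative bounds count from the sequence end (-1 is the last base).
  [ "$start" -lt 0 ] && start=$(( len + start + 1 ))
  [ "$end" -lt 0 ] && end=$(( len + end + 1 ))
  # Assumption for this sketch: clamp bounds to the sequence.
  [ "$start" -lt 1 ] && start=1
  [ "$end" -gt "$len" ] && end=$len
  # Print the bases from start to end, both included.
  if [ "$start" -le "$end" ]; then
    printf '%s\n' "$seq" | cut -c "${start}-${end}"
  fi
)

trim_range ACGTACGT 2 -2    # like '2..-2', prints CGTACG
trim_range ACGTACGT 3       # like '3..',   prints GTACGT
trim_range ACGTACGT "" 4    # like '..4',   prints ACGT
```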

Example bash commands for removing primers from the ends:

f_primer=GATGAAGAACGYAGYRAA
r_primer=TCCTCCGCTTATTGATATGC

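st trim -e -- "${#f_primer} .. -${#r_primer}" input.fa > output.fa

Note: The range bounds are the last base of the forward primer and the first base of the reverse primer. Since these bases should not be included in the output, we use the -e/--exclude option.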
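
To make the effect of the exclusive range concrete, the sketch below computes both forms of the range for the primers above. It assumes that -e/--exclude simply moves each bound inward by one position, as the option description suggests; the variable names exclusive and inclusive are only for illustration.

```shell
f_primer=GATGAAGAACGYAGYRAA
r_primer=TCCTCCGCTTATTGATATGC

# Range bounds as passed together with -e/--exclude: the primer lengths.
exclusive="${#f_primer}..-${#r_primer}"

# Assumed equivalent inclusive range (without -e): one position further inward.
inclusive="$(( ${#f_primer} + 1 ))..-$(( ${#r_primer} + 1 ))"

echo "with -e:    $exclusive"    # with -e:    18..-20
echo "without -e: $inclusive"    # without -e: 19..-21
```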

Using variables

The command becomes very useful with variables. The following is equivalent to bedtools getfasta (note that the BED format is 0-based, thus the -0 option):

st trim -l coordinates.bed -0 {l:2}..{l:3} input.fa > output.fa
# instead of -0 we could also use a math expression:
st trim -l coordinates.bed "{{l:2 + 1}}..{l:3}" input.fa > output.fa
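
The conversion behind the math expression can be sketched in plain shell. BED coordinates are 0-based with the end not included, so only the start has to be shifted to get a 1-based, inclusive range. The helper bed_to_range and the sample coordinates are hypothetical.

```shell
# Illustrative only: converts BED-style coordinates (0-based, end excluded)
# to a 1-based, inclusive 'start..end' range.
# bed_to_range is a hypothetical helper, not part of seqtool.
bed_to_range() {
  # $1 = BED start, $2 = BED end
  echo "$(( $1 + 1 ))..$2"
}

bed_to_range 0 10     # first ten bases, prints 1..10
bed_to_range 99 150   # prints 100..150
```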

It is also possible to directly use the output of the find command, e.g. if looking for primers:

input.fa (N is a placeholder):

>id
NGATGAAGAACGYAGYRAANNNNNNNNNNNNNNNNNNNTCCTCCGCTTATTGATATGCN

Looking for primers, storing the result in attributes using the f:end and f:neg_start variables:

f_primer=GATGAAGAACGYAGYRAA
r_primer=TCCTCCGCTTATTGATATGC

st find -d4  --rng ..23 -p f_end={f:end} -ad4 $f_primer input.fa \
  | st find  --rng " -23.." -p r_start={f:neg_start} -d4 $r_primer \
  > primer_search.fa

primer_search.fa:

>id f_end=19 r_start=-21
NGATGAAGAACGYAGYRAANNNNNNNNNNNNNNNNNNNTCCTCCGCTTATTGATATGCN

Now we can use this range for trimming:

st trim -e "{a:f_end}..{a:r_start}" primer_search.fa > no_primers.fa

no_primers.fa:

>id f_end=19 r_start=-21
NNNNNNNNNNNNNNNNNNN
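
The result can be checked by hand with standard shell tools. This sketch does not call seqtool; it redoes the exclusive trim on the example sequence with cut, taking the f_end and r_start values from the header above. The variable names first, last and trimmed are only for illustration.

```shell
seq=NGATGAAGAACGYAGYRAANNNNNNNNNNNNNNNNNNNTCCTCCGCTTATTGATATGCN
f_end=19      # last base of the forward primer
r_start=-21   # first base of the reverse primer, counted from the end

len=${#seq}
# Exclusive range: start one base after f_end, stop one base before r_start.
first=$(( f_end + 1 ))
# r_start corresponds to position len + r_start + 1; one base before that:
last=$(( len + r_start ))

trimmed=$(printf '%s\n' "$seq" | cut -c "${first}-${last}")
echo "$trimmed"   # only the N spacer between the primers remains
```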
