[EMBOSS] Is vectorstrip gapless by design or is it a bug ?
Jon Ison
jison at ebi.ac.uk
Mon Feb 26 12:10:41 UTC 2007
Hi Charles,
I wasn't sure whether you'd already got a reply to this, so here goes. From a
very quick look at your email ... does it all make sense considering that needle
does an optimal alignment with gaps whereas vectorstrip uses a "word-match"
type (ungapped) alignment ... ?
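To make that concrete, here is a minimal Python sketch of an ungapped, position-by-position comparison of the 138-158 window against LINKERA. It is only an illustration, not vectorstrip's actual code; the function name ungapped_mismatches is made up for this example, and treating -mismatch as a percentage of the linker length is an assumption based on the threshold of 29 mentioned in the quoted message.

```python
def ungapped_mismatches(window: str, linker: str) -> int:
    """Count position-by-position differences, ignoring case (no gaps allowed)."""
    if len(window) != len(linker):
        raise ValueError("ungapped comparison needs equal-length strings")
    return sum(a != b for a, b in zip(window.upper(), linker.upper()))


window = "aaTGAggTAACCgGTTcccAG"  # SEQ[138:158] from the quoted message
linker = "AATGAGGTAACGGTTCCCAGC"  # LINKERA

n = ungapped_mismatches(window, linker)
percent = 100 * n / len(linker)   # assumption: -mismatch is a percentage
print(n, round(percent, 2))       # 6 28.57
```

Six mismatches out of 21 bases is about 28.6%, which would fit the observation that a -mismatch value lower than 29 misses the primer.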
It seems OTT to make vectorstrip do an optimal alignment but I guess it's
possible in principle ...
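As a rough idea of what a gapped comparison would give, here is a small Python sketch using plain unit-cost edit distance (substitutions, insertions and deletions each cost 1). This is a stand-in chosen for brevity, not needle's algorithm, which scores with a substitution matrix and gap penalties, and not anything vectorstrip currently does. The function name edit_distance is just for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, ignoring case: each edit costs 1."""
    a, b = a.upper(), b.upper()
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # match or substitute
            ))
        prev = curr
    return prev[-1]


window = "aaTGAggTAACCgGTTcccAG"  # SEQ[138:158] from the quoted message
linker = "AATGAGGTAACGGTTCCCAGC"  # LINKERA
print(edit_distance(window, linker))  # 2
```

With gaps allowed the window and the linker differ by only two edits (one extra C in the read, one missing C at the end), which matches the two gap columns in the needle output.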
If you've still got a prob. feel free to get back.
Cheers
Jon
> Dear list,
>
> I am using vectorstrip to find PCR primers in cloned PCR products. Strangely,
> in some cases it misses a primer, because it overestimates the number of
> mismatches.
>
> In the following example, vectorstrip identifies the first primer with six
> mismatches, although it has only two. It means that if I run vectorstrip with
> a -mismatch value lower than 29, I do miss the primer.
>
> The following is a mixture of shell commands and extracts of outputs. The
> sequence consists of two reads assembled by using trimseq on .ab1 files, and
> then merger on the resulting fasta files.
>
>
> export
> SEQ="ttttcccccccccnntttttttnnnnncccccnnnnnnnnnaaaaAAccCcTcNCTaTagggCGAGTTggGccCtTCTAGTNtGCATGCtTCGAGcGGcccGccAGTgTTGATGGaTaTCTTGCaGaaTTcGcccTTaaTGAggTAACCgGTTcccAGCaGNttttttttttttttttttttttttttttttttttttttttttttttttttttttttttAaaaaGaaTTGtttattTACTGAACCNgggCAtAtTaGaTACACAACCCATTTTaaaTTTAcATcttttAAtTCaaTtTTGAAgTGttTTTAcAcAcCCNCNCAAaAaaaaaaaaaTTTGGCATGcAACAgCTgGGAACCGTtACCtCATTAAgggCGAAtTCcAGcAcAcTGGCgGCCGTTACtAAGGGATCCGAGCTcGGNACCAAGnnnngnnnnnnnnnnnnnnnnnnttntttnntnnnnaaaaa"
>
> export LINKERA="AATGAGGTAACGGTTCCCAGC"
>
> export LINKERB="GCTGGGAACCGTTACCTCATT"
>
> vectorstrip asis:$SEQ \
> -linkera=$LINKERA \
> -linkerb=$LINKERB \
> -outfile stdout \
> -outseq /dev/null \
> -novectorfile \
> -nobesthits \
> -mismatch 30
>
>
> Sequence: asis Vector: no_name
> 5' sequence matches:
> From 138 to 158 with 6 mismatches
> 3' sequence matches:
> From 351 to 371 with 0 mismatches
> Sequences output to file:
> from 159 to 350
> CaGNtttttttttttttttttttttttttttttttttttttttttttttt
> ttttttttttttAaaaaGaaTTGtttattTACTGAACCNgggCAtAtTaG
> aTACACAACCCATTTTaaaTTTAcATcttttAAtTCaaTtTTGAAgTGtt
> TTTAcAcAcCCNCNCAAaAaaaaaaaaaTTTGGCATGcAACA
> sequence trimmed from 5' end:
> ttttcccccccccnntttttttnnnnncccccnnnnnnnnnaaaaAAccC
> cTcNCTaTagggCGAGTTggGccCtTCTAGTNtGCATGCtTCGAGcGGcc
> cGccAGTgTTGATGGaTaTCTTGCaGaaTTcGcccTTaaTGAggTAACCg
> GTTcccAG
> sequence trimmed from 3' end:
> gCTgGGAACCGTtACCtCATTAAgggCGAAtTCcAGcAcAcTGGCgGCCG
> TTACtAAGGGATCCGAGCTcGGNACCAAGnnnngnnnnnnnnnnnnnnnn
> nnttntttnntnnnnaaaaa
>
> needle asis:$SEQ[138:158] asis:$LINKERA stdout -auto
>
> asis 138 aaTGAggTAACCgGTTcccAG- 158
>          |||||||||| ||||||||||
> asis   1 AATGAGGTAA-CGGTTCCCAGC  21
>
>
> Interestingly, in the following alignment, the number of mismatches is
> 6. But I did not find anything saying that gaps were disallowed in
> vectorstrip ?
>
> aaTGAggTAACCgGTTcccAG
> ||||||||||| | | ||
> AATGAGGTAACGGTTCCCAGC
>
>
> I am using emboss through fink (emboss package 4.0.0-2).
>
> Have a nice day,
>
> --
> Charles Plessy
> http://charles.plessy.org
> Wako, Saitama, Japan