[EMBOSS] Is vectorstrip gapless by design or is it a bug ?

Jon Ison jison at ebi.ac.uk
Mon Feb 26 12:10:41 UTC 2007


Hi Charles,

I wasn't sure whether you'd already had a reply to this, so here goes.  From a
very quick look at your email ... does it all make sense considering that needle
does an optimal alignment with gaps, whereas vectorstrip uses a "word-match"
type (ungapped) alignment ... ?
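
A minimal sketch of the ungapped count, assuming vectorstrip compares the
linker to a same-length window position by position and ignores case (the
function name count_mismatches is hypothetical, and this is not vectorstrip's
actual code):

```python
def count_mismatches(window: str, pattern: str) -> int:
    """Count position-by-position differences, ignoring case (no gaps)."""
    if len(window) != len(pattern):
        raise ValueError("window and pattern must have the same length")
    return sum(a != b for a, b in zip(window.upper(), pattern.upper()))


# Positions 138-158 of the sequence and LINKERA, both taken from the quoted message.
window = "aaTGAggTAACCgGTTcccAG"
linker_a = "AATGAGGTAACGGTTCCCAGC"

mismatches = count_mismatches(window, linker_a)
percent = 100.0 * mismatches / len(linker_a)
print(mismatches, round(percent, 1))  # 6 28.6
```

Six mismatches over 21 bases is about 28.6%, which is consistent with the hit
disappearing for -mismatch values below 29, as reported.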

It seems OTT to make vectorstrip do an optimal alignment, but I guess it's
possible in principle ...
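
To illustrate what allowing gaps would change, here is a rough sketch using a
plain Levenshtein edit distance. This is a simpler stand-in, not what needle
does (needle uses a scoring matrix and gap penalties), and the function name
edit_distance is hypothetical:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions), ignoring case."""
    a, b = a.upper(), b.upper()
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Same window (positions 138-158) and LINKERA as in the quoted message.
print(edit_distance("aaTGAggTAACCgGTTcccAG", "AATGAGGTAACGGTTCCCAGC"))  # 2
```

With gaps allowed, the same pair differs by only two edits (one extra base in
the read and one missing base at the end), which matches the two differences
visible in the needle output.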

If you've still got a problem, feel free to get back to me.

Cheers

Jon


> Dear list,
>
> I am using vectorstrip to find PCR primers in cloned PCR products. Strangely,
> in some cases it misses a primer, because it overestimates the number of
> mismatches.
>
> In the following example, vectorstrip identifies the first primer with six
> mismatches, although it has only two. This means that if I run vectorstrip with
> a -mismatch value lower than 29, I miss the primer.
>
> The following is a mixture of shell commands and extracts of outputs. The
> sequence consists of two reads assembled by using trimseq on .ab1 files, and
> then merger on the resulting fasta files.
>
>
> export \
> SEQ="ttttcccccccccnntttttttnnnnncccccnnnnnnnnnaaaaAAccCcTcNCTaTagggCGAGTTggGccCtTCTAGTNtGCATGCtTCGAGcGGcccGccAGTgTTGATGGaTaTCTTGCaGaaTTcGcccTTaaTGAggTAACCgGTTcccAGCaGNttttttttttttttttttttttttttttttttttttttttttttttttttttttttttAaaaaGaaTTGtttattTACTGAACCNgggCAtAtTaGaTACACAACCCATTTTaaaTTTAcATcttttAAtTCaaTtTTGAAgTGttTTTAcAcAcCCNCNCAAaAaaaaaaaaaTTTGGCATGcAACAgCTgGGAACCGTtACCtCATTAAgggCGAAtTCcAGcAcAcTGGCgGCCGTTACtAAGGGATCCGAGCTcGGNACCAAGnnnngnnnnnnnnnnnnnnnnnnttntttnntnnnnaaaaa"
>
> export LINKERA="AATGAGGTAACGGTTCCCAGC"
>
> export LINKERB="GCTGGGAACCGTTACCTCATT"
>
> vectorstrip 	asis:$SEQ \
> 		-linkera=$LINKERA \
> 		-linkerb=$LINKERB \
> 		-outfile stdout \
> 		-outseq /dev/null \
> 		-novectorfile \
> 		-nobesthits \
> 		-mismatch 30
>
>
> Sequence: asis   Vector: no_name
> 5' sequence matches:
>         From 138 to 158 with 6 mismatches
> 3' sequence matches:
>         From 351 to 371 with 0 mismatches
> Sequences output to file:
>         from 159 to 350
>                 CaGNtttttttttttttttttttttttttttttttttttttttttttttt
>                 ttttttttttttAaaaaGaaTTGtttattTACTGAACCNgggCAtAtTaG
>                 aTACACAACCCATTTTaaaTTTAcATcttttAAtTCaaTtTTGAAgTGtt
>                 TTTAcAcAcCCNCNCAAaAaaaaaaaaaTTTGGCATGcAACA
>         sequence trimmed from 5' end:
>                 ttttcccccccccnntttttttnnnnncccccnnnnnnnnnaaaaAAccC
>                 cTcNCTaTagggCGAGTTggGccCtTCTAGTNtGCATGCtTCGAGcGGcc
>                 cGccAGTgTTGATGGaTaTCTTGCaGaaTTcGcccTTaaTGAggTAACCg
>                 GTTcccAG
>         sequence trimmed from 3' end:
>                 gCTgGGAACCGTtACCtCATTAAgggCGAAtTCcAGcAcAcTGGCgGCCG
>                 TTACtAAGGGATCCGAGCTcGGNACCAAGnnnngnnnnnnnnnnnnnnnn
>                 nnttntttnntnnnnaaaaa
>
> needle asis:$SEQ[138:158] asis:$LINKERA stdout -auto
>
> asis             138 aaTGAggTAACCgGTTcccAG-    158
>                      |||||||||| ||||||||||
> asis               1 AATGAGGTAA-CGGTTCCCAGC     21
>
>
> Interestingly, in the following ungapped alignment, the number of mismatches
> is 6. But I did not find anything saying that gaps are disallowed in
> vectorstrip.
>
> aaTGAggTAACCgGTTcccAG
> ||||||||||| | | ||
> AATGAGGTAACGGTTCCCAGC
>
>
> I am using emboss through fink (emboss package 4.0.0-2).
>
> Have a nice day,
>
> --
> Charles Plessy
> http://charles.plessy.org
> Wako, Saitama, Japan




