Bioperl and matcher

Peter Rice peter.rice at uk.lionbioscience.com
Tue Nov 26 16:12:46 UTC 2002


Vilanova,David,LAUSANNE,NRC/BS wrote:
>  
> Hello,
> I have problems retrieving the alignments from an emboss output.
> The program below reads 2 files and runs a matcher of all against all.
> Matcher gives me an msf output and then I try to parse this alignment with
> Bio::AlignIO.
> However I get an exception...
>  
> Processing sequence 1..vs..3...done
>  
> ------------- EXCEPTION  -------------
> MSG: 1 exists as an alignment line but not in the header. Not confident of
> what is going on!

BioPerl seems to be having trouble with the EMBOSS MSF format output. It 
could be something about the naming of the sequences?

EMBOSS is making up names for your sequences. I assume you are using 
asis::CGGCG to pass them to matcher. You can put -sid after each sequence 
to give them names, for example:

matcher -out x.x -af msf asis::ccggc -sid cg asis::cgggc -sid gg

(-sid, like -aformat, is an associated qualifier. It is positional, so it 
must follow the asis:: sequence it names. Putting it first on the command 
line, for example, would apply it to all sequences: fine for -sformat, but 
not a good idea for -sid :-)
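
If you are generating these command lines from a script, something like the 
rough sketch below keeps each -sid right after the sequence it names. The 
helper name build_matcher_cmd and its argument order are made up for 
illustration; it only prints the command line and does not run matcher, and 
it assumes the asis:: form for both sequences.

```shell
#!/bin/sh
# Hypothetical helper (not part of EMBOSS): print a matcher command line in
# which each asis:: sequence is immediately followed by its own -sid.
# Arguments: output file, name 1, sequence 1, name 2, sequence 2.
build_matcher_cmd() {
    out=$1
    name1=$2
    seq1=$3
    name2=$4
    seq2=$5
    # The order matters: -sid applies to the sequence that precedes it.
    printf 'matcher -out %s -af msf asis::%s -sid %s asis::%s -sid %s\n' \
        "$out" "$seq1" "$name1" "$seq2" "$name2"
}

# Print the command for the two example sequences used above.
build_matcher_cmd x.x cg ccggc gg cgggc
```

In an all-against-all loop you would call this once per pair, passing 
whatever names you want to see in the MSF header.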

Hope this helps

Peter

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723



