[Bioperl-l] RE: Bioperl and matcher

Vilanova,David,LAUSANNE,NRC/BS david.vilanova@rdls.nestle.com
Tue, 26 Nov 2002 17:33:56 +0100


I tried that but it still doesn't fix the problem...


-----Original Message-----
From: Peter Rice [mailto:peter.rice@uk.lionbioscience.com]
Sent: Tuesday, 26 November 2002 17:13
To: Vilanova,David,LAUSANNE,NRC/BS
Cc: 'bioperl-l@bioperl.org'; 'emboss@embnet.org'
Subject: Re: Bioperl and matcher


Vilanova,David,LAUSANNE,NRC/BS wrote:
>  
> Hello,
> I have problems retrieving the alignments from an emboss output.
> The program below reads 2 files and runs matcher, all against all.
> Matcher gives me an msf output and then I try to parse this alignment with
> Bio::AlignIO.
> However I get an exception...
>  
> Processing sequence 1..vs..3...done
>  
> ------------- EXCEPTION  -------------
> MSG: 1 exists as an alignment line but not in the header. Not confident of
> what is going on!

BioPerl seems to be having trouble with the EMBOSS MSF format output. It 
could be something about the naming of the sequences?

EMBOSS is making up names for your sequences. I assume you are using 
asis::CGGCG to pass them to matcher. You can put -sid after each sequence 
to give them names, for example:

matcher -out x.x -af msf asis::ccggc -sid cg asis::cgggc -sid gg

(-sid, like -aformat, is an associated qualifier. It must follow the asis:: 
sequence it names because it is positional. Putting it first on the command 
line, for example, would apply it to all sequences: fine for -sformat but 
not a good idea for -sid :-)
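
If the matcher calls are generated from a script, one way to keep each -sid 
next to its own sequence is to build the command line in a single place. 
The following is only a rough sketch: build_matcher_cmd is a hypothetical 
helper name (not part of EMBOSS), the ids and sequences are the ones from 
the example above, and it prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: build a matcher command line in which every asis:: sequence is
# immediately followed by its own -sid qualifier.
# build_matcher_cmd is a hypothetical helper, not an EMBOSS program.
# Arguments: output file, id A, sequence A, id B, sequence B.
build_matcher_cmd() {
    out=$1 id_a=$2 seq_a=$3 id_b=$4 seq_b=$5
    # Each -sid comes directly after the sequence it names.
    printf 'matcher -out %s -af msf asis::%s -sid %s asis::%s -sid %s\n' \
        "$out" "$seq_a" "$id_a" "$seq_b" "$id_b"
}

# Dry run: print the command instead of executing it.
build_matcher_cmd x.x cg ccggc gg cgggc
```

Once the printed line looks right, it can be passed to the shell (or to 
system() from Perl) to actually run matcher.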

Hope this helps

Peter

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice@uk.lionbioscience.com +44 1223 224723