[Bioperl-l] needle parser in bioperl?

Fairley, Derek Derek.Fairley at bll.n-i.nhs.uk
Fri Dec 15 09:57:35 UTC 2006


Neeti,

In lieu of a response from a BioPerl guru... why not use Needle to generate your pairwise alignment in fasta format, rather than msf format? The sequence you want should correspond to a single HSP which you can get directly from the fasta alignment with Bio::SearchIO: http://www.bioperl.org/wiki/Module:Bio::SearchIO. You may not need to use Bio::AlignIO at all. 

Derek.


-----Original Message-----
From: neeti somaiya [mailto:neetisomaiya at gmail.com] 
Sent: 15 December 2006 05:22
To: Fairley, Derek; bioperl-l
Subject: Re: [Bioperl-l] needle parser in bioperl?

Hi,

Thanks a lot for your response.
I ran needle like this 
 /usr/local/bin/./needle SEQ_1.REF seq_of_contig1 -aformat msf 1.out
It gave me the output in format msf.
But now my problem is: if I use the Bio::AlignIO module of BioPerl, how can I get the alignment start and stop coordinates on the sequence? I mean something like hsp->query->start, which gives the alignment start position on the query sequence in BLAST output when using Bio::SearchIO.
Please help.
Like I explained with an example in my previous mail, I want the coordinate where the alignment starts on the sequence.
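
A minimal sketch of one way to derive such a coordinate, under stated assumptions. needle produces a global alignment, so the start/end values a parser reports for each sequence may simply span the whole sequence; the position where the contig begins on the reference can instead be worked out from the gap characters. The code below uses core Perl only and takes the two gapped rows as plain strings; with BioPerl these could come from calling seq() on each object returned by each_seq() on an alignment read via Bio::AlignIO with -format => 'msf'. The function name aligned_span_on_ref, the gap characters ('-', '.', '~') and the toy rows are illustrative assumptions, not taken from the attached alignment.

```perl
use strict;
use warnings;

# Given two gapped rows of equal length from a pairwise alignment, return the
# 1-based residue positions on the reference at which the other sequence's
# first and last aligned residues fall.
# Assumption: gaps are written as '-', '.' or '~'.
sub aligned_span_on_ref {
    my ($ref_row, $qry_row) = @_;
    die "rows differ in length\n" unless length($ref_row) == length($qry_row);

    my ($ref_pos, $start, $end) = (0);
    for my $i (0 .. length($ref_row) - 1) {
        my $r_gap = substr($ref_row, $i, 1) =~ /[-.~]/;
        my $q_gap = substr($qry_row, $i, 1) =~ /[-.~]/;

        $ref_pos++ unless $r_gap;      # count reference residues, skipping gaps
        next if $r_gap || $q_gap;      # keep only columns where both rows have a residue

        $start = $ref_pos unless defined $start;
        $end   = $ref_pos;
    }
    return ($start, $end);             # both undef if no column is aligned
}

# Toy rows (hypothetical data).
my $ref    = 'ACGTACGTAC--GT';
my $contig = '....ACGTACTTGT';
my ($start, $end) = aligned_span_on_ref($ref, $contig);
print "contig aligns to reference positions $start..$end\n";
```

Swapping the two arguments would give the corresponding span on the contig instead of the reference.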

~Neeti.
On 12/14/06, Fairley, Derek <Derek.Fairley at bll.n-i.nhs.uk> wrote:
Neeti,
 
From http://emboss.sourceforge.net/apps/cvs/needle.html :
 
"The results can be output in one of several styles by using the command-line qualifier -aformat xxx, where 'xxx' is replaced by the name of the required format. Some of the alignment formats can cope with an unlimited number of sequences, while others are only for pairs of sequences. 
 
The available multiple alignment format names are: unknown, multiple, simple, fasta, msf, trace, srs 
 
The available pairwise alignment format names are: pair, markx0, markx1, markx2, markx3, markx10, srspair, score 
 
See: http://emboss.sf.net/docs/themes/AlignFormats.html for further information on alignment formats."
 
Not sure based on this whether you can get a pairwise alignment in .msf format, but I can't think of a good reason why not. The BioPerl Bio::AlignIO module will allow you to parse alignments in .msf format.
 
HTH,
 
Derek.
 
-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of neeti somaiya
Sent: 14 December 2006 08:03
To: Chris Fields; bioperl-l
Subject: Re: [Bioperl-l] needle parser in bioperl?
 
How do I run needle specifying that I want the MSF format, on a linux box?
The help doesn't show me any format option. Is there anything available to
parse MSF format?
Please find an example alignment file attached. Here the seq_of_contig
aligns with the reference sequence (i.e. SEQ_1.REF) starting at position
(coordinate) 8918 of SEQ_1.REF. I basically want this coordinate from the
output alignment, how can I parse the result to get this?
 
On 12/12/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
> On Dec 12, 2006, at 6:14 AM, neeti somaiya wrote:
>
> > Hi,
> >
> > Does anyone know of a bioperl parser for needle output? Basically I
> > want to know where the target sequence aligns on the template (i.e.
> > the coordinate on the template where the target aligns).
> >
> > --
> > -Neeti
> > Even my blood says, B positive
>
> I answered this a number of months back:
>
> http://tinyurl.com/yzlbx5 
>
> Basically, newer versions of EMBOSS have changed the output for the
> AlignIO::emboss parser (which parses needle).  I don't believe the
> parser has been fixed to deal with that, but Jason has pointed out
> you can use MSF output when running needle, then parse using AlignIO
> with the format set to 'msf'.
>
> chris
>
 
 
 
-- 
-Neeti
Even my blood says, B positive






