[Bioperl-l] problem with alignments and sequence locations

Chris Fields cjfields at illinois.edu
Tue Nov 10 13:58:52 UTC 2009


On Nov 10, 2009, at 6:55 AM, Steffen Heyne wrote:

> Hi,
>
> I'm using Bioperl for my research and it is very useful! Thank you!
>
> Currently I have a problem with location tags of sequences. I read
> in seed alignments of Rfam (in Stockholm format, but I think it is
> similar to other formats).
>
> If the location is like:
>
> AB194432.1/908-846
>
> the start/end values are changed to
>
> $seq->start = 846
> $seq->end = 908
>
> and therefore the new location (e.g. $seq->get_nse) is:
>
> AB194432.1/846-908
>
> The $seq->strand tag is correctly set to -1 in this case, but if the
> alignment is written out again (clustal, stockholm, ...) this strand
> info is lost and the sequences have this "wrong" location. But this
> information is important with respect to the sequence accession number.
>
> Is there a way to set the location back to the original one, or is
> this behavior desired? Any manual setting with $seq->start($val)
> failed due to automatic checking.
>
> I'm using bioperl 1.6.1
>
> Thanks!
>
> steffen

This is a definite bug. We recently discussed amending the NSE format
because of this (the subject came up over the last few months or so),
but it fell through the cracks. Fortunately it is very easy to fix (the
relevant method is in LocatableSeq).

Does anyone have a problem with me adding this in? It will change
output only for those instances where the strand is -1, so

AB194432.1/908-846

would be start = 846, end = 908, strand = -1

AB194432.1/846-908

would be start = 846, end = 908, strand = 1
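
As a rough sketch of that round trip (plain Python, purely for
illustration; this is not the LocatableSeq code, and parse_nse and
format_nse are hypothetical names), assuming the strand is inferred
only from the order of the two coordinates:

```python
import re

# Hypothetical helpers mirroring the convention proposed above;
# this is not the actual Bio::LocatableSeq implementation.
NSE_PATTERN = re.compile(r"^(?P<name>.+)/(?P<start>\d+)-(?P<end>\d+)$")


def parse_nse(nse):
    """Split 'name/start-end' into (name, start, end, strand), start <= end."""
    match = NSE_PATTERN.match(nse)
    if match is None:
        raise ValueError(f"not a name/start-end string: {nse!r}")
    first, second = int(match["start"]), int(match["end"])
    # Descending coordinates are taken to mean the reverse strand.
    strand = -1 if first > second else 1
    return match["name"], min(first, second), max(first, second), strand


def format_nse(name, start, end, strand):
    """Write the location back out, swapping coordinates when strand is -1."""
    if strand == -1:
        start, end = end, start
    return f"{name}/{start}-{end}"


# Both example locations should survive a parse/format round trip.
for nse in ("AB194432.1/908-846", "AB194432.1/846-908"):
    name, start, end, strand = parse_nse(nse)
    assert start <= end
    assert format_nse(name, start, end, strand) == nse
```

Note that a single-position range (start equal to end) cannot carry
strand information under this convention; the sketch defaults it to 1.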

chris


