[Bioperl-l] load_seqdatabase error with a specific locus from genbank

Chris Fields cjfields at illinois.edu
Tue Apr 7 03:59:14 UTC 2009


On Apr 6, 2009, at 8:05 PM, Torsten Seemann wrote:

>> The full record is here: http://www.ncbi.nlm.nih.gov/nuccore/544772
>
>  gene            order(S67862.1:72..75,join(S67863.1:1..788,1..19))
>
>> Does anyone see why the location parser should have a problem with  
>> the first
>> gene feature? It's nested, and has remote location components, but  
>> at first
>> sight nothing jumps out at me as extraordinary. Has someone  
>> recently changed
>> the location parsing code? If no-one has an immediate idea what  
>> could be at
>> work here, this needs investigating.

The location parsing code was refactored about 3-4 years ago w/o  
problems.  This'll be the first one to crop up.  I'll try taking a  
look at it.

> I'm not sure if Bioperl handles the order() operator?
>
> For those unfamiliar with the order() operator:
>
> http://www.ncbi.nlm.nih.gov/collab/FT/#3.5.2
>
> order(location,location, ... location)
> The elements can be found in the specified order (5' to 3' direction),
> but nothing is implied about the reasonableness about joining them.
>
>
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
> University, AUSTRALIA

It's interesting that the version from eutils differs significantly in  
the feature table depending on whether you retrieve 'gb' or  
'gbwithparts'; the latter resolves the location (see below).  
Regardless, we'll need to make sure this is parseable.

....

FEATURES             Location/Qualifiers
      source          1..77
                      /organism="Ovine respiratory syncytial virus"
                      /mol_type="genomic RNA"
                      /db_xref="taxon:28869"
      gene            order(S67862.1:72..75,join(S67863.1:1..788,1..19))
                      /gene="G"
      gene            55..>77
                      /gene="fusion glycoprotein F"
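
To make the nesting in that first gene location concrete, here is a minimal
sketch in Python (not BioPerl code, and not how Bio::Factory::FTLocationFactory
works internally) of a recursive split of an order()/join() location string.
The function names and the returned structure are hypothetical, chosen only
for illustration, and the sketch assumes well-formed input with balanced
parentheses.

```python
import re

# Hypothetical helper names; this is NOT BioPerl's location parser.
OPERATORS = ("join", "order", "complement")

def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts

def parse_location(text):
    """Return (operator, [children]) for a compound location, or a leaf dict."""
    text = text.strip()
    m = re.match(r"^(\w+)\((.*)\)$", text, re.S)
    if m and m.group(1) in OPERATORS:
        children = [parse_location(p) for p in split_top_level(m.group(2))]
        return (m.group(1), children)
    # Leaf: an optional remote accession prefix, e.g. "S67862.1:72..75".
    seq_id, _, span = text.rpartition(":")
    return {"seq_id": seq_id or None, "span": span}

if __name__ == "__main__":
    loc = "order(S67862.1:72..75,join(S67863.1:1..788,1..19))"
    print(parse_location(loc))
```

The point is only that the string decomposes cleanly into an order() holding
one remote leaf plus a join() of a remote and a local leaf, so the structure
itself is unambiguous.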



chris


