[Bioperl-l] problem while parsing UniProt(ltaxon.pm)
Chris Fields
cjfields at uiuc.edu
Thu Mar 29 12:12:43 UTC 2007
>> Here you are with the error message
>>
>> Q0QAY1_9DIPT
>> Q0QAY7_9DIPT
>> Q0QB51_9DIPT
>> Q0QB52_9DIPT
>> Q0QB62_9DIPT
>> Q0QB63_9DIPT
>>
>> ------------- EXCEPTION -------------
>> MSG: The lineage 'Eukaryota, Metazoa, Mollusca, Bivalvia, Heteroconchia,
>> Veneroida, Veneroidea, Veneridae, Venerupis, Ruditapes, Venerupis' had two
>> non-consecutive nodes with the same name. Can't cope!
>> STACK Bio::DB::Taxonomy::list::add_lineage
>> /usr/local/ActivePerl/site/lib/Bio/DB/Taxonomy/list.pm:157
>
> Please send me the actual record that causes the exception and I'll see
> what I can do about fixing the problem.
Sendu,
Here's one accession that reproduces this: Q7Y720. I also see an
additional warning just before the exception:
Use of uninitialized value in pattern match (m//) at /Users/cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm line 1060, <GEN0> line 13.
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: The lineage 'Eukaryota, Metazoa, Mollusca, Bivalvia,
Heteroconchia, Veneroida, Veneroidea, Veneridae, Venerupis,
Ruditapes, Venerupis' had two non-consecutive nodes with the same
name. Can't cope!
STACK: Error::throw
STACK: Bio::Root::Root::throw /Users/cjfields/src/bioperl-live/Bio/Root/Root.pm:359
STACK: Bio::DB::Taxonomy::list::add_lineage /Users/cjfields/src/bioperl-live/Bio/DB/Taxonomy/list.pm:157
STACK: Bio::DB::Taxonomy::list::new /Users/cjfields/src/bioperl-live/Bio/DB/Taxonomy/list.pm:94
STACK: Bio::DB::Taxonomy::new /Users/cjfields/src/bioperl-live/Bio/DB/Taxonomy.pm:103
STACK: Bio::Species::classification /Users/cjfields/src/bioperl-live/Bio/Species.pm:180
STACK: Bio::SeqIO::swiss::_read_swissprot_Species /Users/cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm:1073
STACK: Bio::SeqIO::swiss::next_seq /Users/cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm:247
STACK: tax.pl:11
-----------------------------------------------------------
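For anyone following along, the condition named in the exception can be
illustrated with a small standalone sketch. This is Python, not the actual
code in Bio::DB::Taxonomy::list; the function name
find_nonconsecutive_duplicates is hypothetical, and the only assumption is
that the check looks for a name that recurs with other names in between.

```python
def find_nonconsecutive_duplicates(lineage):
    """Return names that recur with at least one other name in between."""
    last_seen = {}   # name -> index of its most recent occurrence
    offenders = []
    for index, name in enumerate(lineage):
        previous = last_seen.get(name)
        # A gap larger than 1 means the repeat is not adjacent.
        if previous is not None and index - previous > 1:
            if name not in offenders:
                offenders.append(name)
        last_seen[name] = index
    return offenders


# The lineage quoted in the exception message above.
lineage = [
    "Eukaryota", "Metazoa", "Mollusca", "Bivalvia", "Heteroconchia",
    "Veneroida", "Veneroidea", "Veneridae", "Venerupis", "Ruditapes",
    "Venerupis",
]

print(find_nonconsecutive_duplicates(lineage))  # ['Venerupis']
```

'Venerupis' sits at positions 9 and 11 of the list with 'Ruditapes'
between them, which is exactly the pattern the message complains about.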
The problem appears to be with the OS (source organism) line in swiss
files, which looks like it is being parsed incorrectly for these
records. Here is the relevant section:
OS Venerupis (Ruditapes) philippinarum.
OG Mitochondrion.
A UniProt query restricted by taxonomy to 'Venerupis' turns up
several more such records. This only affects swissprot; embl and genbank
files with similar source lines do not have the same problem.
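To make the shape of the OS line quoted above concrete, here is a rough
Python sketch of one way such a line could be split, assuming the
parenthesised name between genus and species is a subgenus. It does not
reflect what swiss.pm actually does; OS_PATTERN and split_os_line are
hypothetical names, and the pattern only covers the simple
"Genus (Subgenus) species." form, not OS lines carrying common names or
strain information.

```python
import re

# Genus, optional "(Subgenus)", then the species epithet and a trailing dot.
OS_PATTERN = re.compile(
    r"^OS\s+(?P<genus>[A-Z][a-z]+)"
    r"(?:\s+\((?P<subgenus>[A-Z][a-z]+)\))?"
    r"\s+(?P<species>[a-z][a-z-]*)\.?\s*$"
)


def split_os_line(line):
    """Return genus/subgenus/species as a dict, or None if the line does not fit."""
    match = OS_PATTERN.match(line)
    if match is None:
        return None
    return match.groupdict()


print(split_os_line("OS Venerupis (Ruditapes) philippinarum."))
# {'genus': 'Venerupis', 'subgenus': 'Ruditapes', 'species': 'philippinarum'}
```

Keeping the subgenus in its own field, rather than letting it fall into
the species or genus slot, is the point of the sketch; a line that does
not fit the pattern returns None instead of a guess.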
chris