[Bioperl-l] problem while parsing UniProt(ltaxon.pm)
Chris Fields
cjfields at uiuc.edu
Thu Mar 29 14:18:42 UTC 2007
On Mar 29, 2007, at 8:41 AM, Sendu Bala wrote:
> Chris Fields wrote:
>> Here's one accession which reproduces this: Q7Y720. There is an
>> additional component to the error that I find:
>> Use of uninitialized value in pattern match (m//) at /Users/
>> cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm line 1060, <GEN0>
>> line 13.
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: The lineage 'Eukaryota, Metazoa, Mollusca, Bivalvia,
>> Heteroconchia, Veneroida, Veneroidea, Veneridae, Venerupis,
>> Ruditapes, Venerupis' had two non-consecutive nodes with the same
>> name. Can't cope!
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/bioperl-live/Bio/
>> Root/Root.pm:359
>> STACK: Bio::DB::Taxonomy::list::add_lineage /Users/cjfields/src/
>> bioperl-live/Bio/DB/Taxonomy/list.pm:157
>> STACK: Bio::DB::Taxonomy::list::new /Users/cjfields/src/bioperl-
>> live/Bio/DB/Taxonomy/list.pm:94
>> STACK: Bio::DB::Taxonomy::new /Users/cjfields/src/bioperl-live/Bio/
>> DB/Taxonomy.pm:103
>> STACK: Bio::Species::classification /Users/cjfields/src/bioperl-
>> live/Bio/Species.pm:180
>> STACK: Bio::SeqIO::swiss::_read_swissprot_Species /Users/cjfields/
>> src/bioperl-live/Bio/SeqIO/swiss.pm:1073
>> STACK: Bio::SeqIO::swiss::next_seq /Users/cjfields/src/bioperl-
>> live/Bio/SeqIO/swiss.pm:247
>> STACK: tax.pl:11
>> -----------------------------------------------------------
>> The problem appears to be with the OS source organism line in
>> swiss files, which looks like it is being parsed incorrectly for
>> these entries. Here is the relevant section:
>> OS Venerupis (Ruditapes) philippinarum.
>> OG Mitochondrion.
>> A UniProt query limited to taxonomy using 'Venerupis' produces
>> several more. This only affects swissprot; embl and genbank files
>> with similar source lines do not have the same problem.
>
> Thanks. I've made a tentative fix to swiss.pm. The only problem
> might be common names/ descriptions don't get caught on some
> strange OS lines. I don't have enough experience of OS lines to
> know what they might look like.
>
> Still, at least there won't be thrown exceptions, which some users
> may prefer ;)
>
> I'll add tests later if and when Ambrose/ yourself confirm all is
> well.
It parses for me now, but a trailing '.' is left on the
scientific_name():
Venerupis (Ruditapes) philippinarum.
which appears in the classification:
Venerupis (Ruditapes) philippinarum.; Ruditapes; Venerupis;
Veneridae; Veneroidea; Veneroida; Heteroconchia; Bivalvia; Mollusca;
Metazoa; Eukaryota;
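
For illustration only, here is a rough sketch of the handling I would expect for such an OS line. It is written in Python rather than Perl and is not the code in swiss.pm; the function name parse_os_line is made up. It assumes the OS field ends with a single terminating period, and that only parenthesised groups at the very end of the field are common names, so a group in the middle such as '(Ruditapes)' stays in the scientific name.

```python
import re


def parse_os_line(line):
    """Return (scientific_name, common_names) for one Swiss-Prot OS line.

    Hypothetical helper for illustration; not part of BioPerl.
    """
    # Drop the 'OS' line code and the whitespace that follows it.
    text = re.sub(r"^OS\s+", "", line.strip())

    # Assumption: the field ends with one terminating period. Remove only
    # that one so it does not leak into the scientific name.
    if text.endswith("."):
        text = text[:-1].rstrip()

    # Assumption: only parenthesised groups at the very end are common
    # names. A group in the middle (e.g. a subgenus) is left in place.
    common_names = []
    while True:
        match = re.search(r"\s*\(([^()]*)\)$", text)
        if not match:
            break
        common_names.insert(0, match.group(1))
        text = text[:match.start()]

    return text, common_names


if __name__ == "__main__":
    print(parse_os_line("OS   Venerupis (Ruditapes) philippinarum."))
    print(parse_os_line("OS   Homo sapiens (Human)."))
```

With this, the Q7Y720 line should give 'Venerupis (Ruditapes) philippinarum' with no trailing period and no common name, while a line with a trailing parenthesised name still yields that name separately.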
chris