[Bioperl-l] $seq->primary_id for swissprot entries
Hilmar Lapp
hlapp@gnf.org
Fri, 4 Oct 2002 15:56:16 -0700
Until now, the parser split the ID at the underscore into primary_id
and division. E.g., 143E_HUMAN resulted in $seq->primary_id("143E") (and
$seq->display_id("143E_HUMAN")).
This behaviour violated the documentation of primary_id: its value
needs to be _very_ unique, not just somewhat. The first part of the
two-part swissprot ID is not even unique within one swissprot release.
(I noticed this when I got a unique key failure in biosql.)
So I changed the parser to leave $seq->primary_id() untouched if
the ID matches the XXX_XXX pattern. If it doesn't, it sets
primary_id() to the same value as the ID (and, hence, display_id()),
which is not particularly helpful, but at least unchanged from before.
Suggestions for how to do this better are welcome.
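For illustration, the rule described above can be sketched roughly as
follows. This is a standalone sketch, not the actual parser code: the
helper name primary_id_for() is hypothetical, and the regular expression
is only an assumed reading of the "XXX_XXX" pattern (two alphanumeric
parts joined by one underscore).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bioperl: decide which primary_id a
# sequence should end up with, given the ID from the swissprot ID line
# and whatever primary_id the sequence object currently has.
sub primary_id_for {
    my ($id, $current_primary_id) = @_;

    # Assumed form of the XXX_XXX pattern, e.g. 143E_HUMAN.
    if ($id =~ /^[A-Za-z0-9]+_[A-Za-z0-9]+$/) {
        # Two-part ID: leave primary_id untouched.
        return $current_primary_id;
    }

    # Anything else: fall back to the full ID, as before the change.
    return $id;
}

# "untouched" stands in for whatever value primary_id already has.
print primary_id_for("143E_HUMAN", "untouched"), "\n";
print primary_id_for("FOO",        "untouched"), "\n";
```

The first call returns the existing value unchanged; the second returns
the ID itself.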
Tests pass. Obviously, I had to disable (skip) some tests, because
some do test primary_id for swissprot input.
-hilmar
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------