[Bioperl-l] $seq->primary_id for swissprot entries

Hilmar Lapp hlapp@gnf.org
Fri, 4 Oct 2002 15:56:16 -0700


Until now, the swissprot parser split the ID at the underscore into 
primary_id and division. E.g., 143E_HUMAN became 
$seq->primary_id("143E") (and $seq->display_id("143E_HUMAN")).

This behaviour violated the documentation of primary_id: its value 
needs to be _very_ unique, not just somewhat. The first part of the 
two-part swissprot ID is not even unique within a single swissprot 
release. (I noticed this when I got a unique key (UK) failure in 
biosql.)

So I changed the parser to leave $seq->primary_id() untouched if 
the ID matches the XXX_XXX pattern. If it doesn't, the parser sets 
primary_id() to the same value as the ID (and, hence, display_id()), 
which is not particularly helpful, but at least is unchanged from 
before. 
Suggestions for how to do this better are welcome.
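
For illustration, the new logic can be sketched in plain Perl as 
follows. This is not the actual parser code: the helper name 
ids_for_swissprot_entry() is hypothetical, the regex is only an 
assumed stand-in for the "XXX_XXX" pattern, and "FOOBAR" is a made-up 
ID without an underscore.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: returns the IDs the parser
# would assign for a given swissprot ID under the new behaviour.
sub ids_for_swissprot_entry {
    my ($id) = @_;
    my %ids = (display_id => $id);

    # Assumed stand-in for the "XXX_XXX" pattern: two alphanumeric
    # parts joined by a single underscore.
    if ($id !~ /^[A-Za-z0-9]+_[A-Za-z0-9]+$/) {
        # Not a two-part ID: fall back to the full ID, as before.
        $ids{primary_id} = $id;
    }
    # For a two-part ID, primary_id is deliberately not set here.

    return \%ids;
}

for my $id (qw(143E_HUMAN FOOBAR)) {
    my $ids = ids_for_swissprot_entry($id);
    printf "%-12s display_id=%s primary_id=%s\n",
        $id,
        $ids->{display_id},
        exists $ids->{primary_id} ? $ids->{primary_id} : '(untouched)';
}
```

In this sketch "untouched" simply means the key is absent; what 
primary_id() returns in that case depends on the sequence object and 
is not modelled here.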

Tests pass. Obviously, I had to disable (skip) some tests, because 
some do test primary_id for swissprot input.

	-hilmar

--
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------