[Bioperl-l] bug, again VERSION without version number...

Elia Stupka elia@fugu-sg.org
Mon, 5 Aug 2002 18:07:19 +0800 (SGT)


Worked on the bug:

These records have accessions without ".version_number" in the VERSION
line (discussed a few days ago, right?). That is not nice, but even so we
shouldn't be giving out a RichSeq object as the primary id, and we should
still catch the GI as the primary id...

changing the parsing line from:

if( /^VERSION\s+(\S+)\.(\d+)\s*(GI:\d+)?/ ) {

to:

if( /^VERSION\s+(\S+)\.?(\d+)?\s*(GI:\d+)?/ ) {

fixes the problem by making the version number optional. I then assign
the version number only if it is present, and catch the GI in primary_id.
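For illustration, here is a minimal sketch that runs the two patterns above against a VERSION line with and without a version suffix. The sample lines and the helper name parse_version_line are made up for this example and are not taken from the parser itself.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (accession, version, gi_token) for a
# VERSION line, or an empty list if the pattern does not match.
sub parse_version_line {
    my ($line, $re) = @_;
    return $line =~ $re ? ($1, $2, $3) : ();
}

# The two patterns quoted above.
my $strict  = qr/^VERSION\s+(\S+)\.(\d+)\s*(GI:\d+)?/;
my $relaxed = qr/^VERSION\s+(\S+)\.?(\d+)?\s*(GI:\d+)?/;

# Made-up sample lines: one with a version suffix, one without.
my @lines = (
    'VERSION     AB000001.1  GI:1234567',
    'VERSION     AB000001  GI:1234567',
);

for my $line (@lines) {
    for my $pair ([strict => $strict], [relaxed => $relaxed]) {
        my ($name, $re) = @$pair;
        my @f = parse_version_line($line, $re);
        printf "%-8s %-36s => %s\n", $name, $line,
            @f ? join(', ', map { defined $_ ? $_ : 'undef' } @f)
               : 'no match';
    }
}
```

In this sketch the strict pattern does not match the line without a version at all, while the relaxed one does and still picks up the GI token. One thing worth checking: because \S+ is greedy, the relaxed pattern leaves "AB000001.1" whole in $1 and $2 undefined when a version is present, so code that relies on $2 may need to split $1 at the dot.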

Another thing that I don't understand and find funny is that it catches
(GI:\d+) and then does a substr($3, 3) if( $3); to get the number back. I
am sure there is some arcane reason for this but it escapes me... can't we
just catch GI:(\d+) ?
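A small sketch of the two ways of getting at the GI number, assuming a made-up sample line. The variable names are only for illustration, and the second pattern is one possible way to write the direct capture, not what is in the parser.

```perl
use strict;
use warnings;

my $line = 'VERSION     AB000001.1  GI:1234567';   # made-up sample line

# Current approach: capture the whole "GI:..." token, then strip the
# three-character "GI:" prefix with substr.
my $gi_substr;
if ( $line =~ /^VERSION\s+(\S+)\.?(\d+)?\s*(GI:\d+)?/ ) {
    $gi_substr = substr($3, 3) if $3;
}

# Direct capture: only the digits land in $3. The non-capturing group
# (?:...) keeps the whole "GI:..." token optional.
my $gi_direct;
if ( $line =~ /^VERSION\s+(\S+)\.?(\d+)?\s*(?:GI:(\d+))?/ ) {
    $gi_direct = $3 if defined $3;
}

print "substr: $gi_substr, direct: $gi_direct\n";
```

Both variables end up holding the same digits for this line. With the direct capture a defined() test is the safer check, since a plain truth test on $3 would reject a GI of 0.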

I've committed the first fix, not the second, in case there is an arcane
reason... :)

Elia

********************************
* http://www.fugu-sg.org/~elia *
* tel:    +65 6874 1467        *
* mobile: +65 9030 7613        *
* fax:    +65 6777 0402        *
********************************