[Bioperl-l] Genbank Parsing Bug leads to discovery of strange behavior of Bio::Seq::primary_id()

Hilmar Lapp hlapp@gnf.org
Fri, 14 Jun 2002 12:34:18 -0700


primary_id() is supposed to fall back to a (supposedly unique) memory location if no primary ID was set. That's why it returns the ref $obj as a string.

The cleanest thing to do is yet another case for event-based parsing: what you want is to throw an exception or do something special if a GenBank record lacks the GI number. (GenBank always keeps something to wake you up at night.)
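
Until then, a caller can spot the fallback by comparing primary_id() against the stringified object. The following is only a rough sketch: My::Seq is a hypothetical stand-in that reuses the accessor quoted below, and has_real_primary_id() and require_primary_id() are made-up helper names, not part of BioPerl. It also assumes the object does not overload stringification.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Seq; only the quoted accessor is reproduced.
package My::Seq;

sub new { return bless {}, shift }

sub primary_id {
    my ($obj, $value) = @_;
    $obj->{'primary_id'} = $value if defined $value;
    # Fallback: the stringified reference, e.g. "My::Seq=HASH(0x...)".
    return "$obj" unless exists $obj->{'primary_id'};
    return $obj->{'primary_id'};
}

package main;

# Hypothetical helper: true only if a real primary ID was set,
# i.e. primary_id() is not just the stringified reference.
sub has_real_primary_id {
    my ($seq) = @_;
    return $seq->primary_id ne "$seq";
}

# Hypothetical helper: die instead of silently using the fallback.
sub require_primary_id {
    my ($seq) = @_;
    die "record has no primary ID (GI missing?)\n"
        unless has_real_primary_id($seq);
    return $seq->primary_id;
}

my $seq = My::Seq->new;
eval { require_primary_id($seq) };
print "caught: $@" if $@;

$seq->primary_id('2494349');
print "primary ID: ", require_primary_id($seq), "\n";
```

This leaves the accessor itself untouched, so code that relies on the memory-location fallback keeps working.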

	-hilmar

> -----Original Message-----
> From: CHALFANT_CHRIS_M@Lilly.com [mailto:CHALFANT_CHRIS_M@Lilly.com]
> Sent: Friday, June 14, 2002 11:31 AM
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] Genbank Parsing Bug leads to discovery of strange
> behavior of Bio::Seq::primary_id()
> 
> 
> According to the documentation, this is the code for 
> Bio::Seq::primary_id()
> 
> sub primary_id {
>    my ($obj,$value) = @_;
> 
>    if( defined $value) {
>       $obj->{'primary_id'} = $value;
>    }
>    if( ! exists $obj->{'primary_id'} ) {
>       return "$obj";
>    }
>    return $obj->{'primary_id'};
> }
> 
> Is there a reason that, if $obj->{'primary_id'} does not exist,
> Bio::Seq returns "$obj"?  Wouldn't it be better if it threw an
> exception or returned undef?
> 
> This bit me as I was using BioPerl to parse the GenPept record
> identified by the GI 2494349.  It looks as though the parser breaks
> and does not populate the primary_id field of the Bio::Seq object.
> Printing out the Bio::Seq with Bio::SeqIO::write_seq() shows that,
> indeed, the GI is missing from the VERSION line.
> 
> Chris Chalfant
> Bioinformatics
> Eli Lilly and Company
> 317-433-3407