[Bioperl-l] parsing of keywords field in Bio::SeqIO::genbank

Hilmar Lapp hlapp at gnf.org
Tue Feb 11 09:39:24 EST 2003


This is a known problem that was brought up a couple of weeks ago. What
really needs to happen is for the respective method in RichSeqI to
return an array, and for the parsing code to split the field into an
array.
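
For the parsing side only, a rough sketch of the split might look like
the following. Note that split_keywords is a hypothetical helper name,
not an existing BioPerl function, and the sketch assumes the KEYWORDS
continuation lines have already been joined into one string. The
RichSeqI accessor change is not shown.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): turn the raw text of a
# KEYWORDS field into a list of keywords or phrases.
sub split_keywords {
    my ($raw) = @_;
    $raw =~ s/\s+/ /g;        # collapse line breaks and runs of whitespace
    $raw =~ s/^\s+|\s+$//g;   # trim both ends
    $raw =~ s/\.$//;          # drop the period that terminates the field
    # Split on the semicolon delimiter, trim each entry, skip empty ones.
    return grep { length } map {
        my $kw = $_;
        $kw =~ s/^\s+|\s+$//g;
        $kw;
    } split /;/, $raw;
}

my @keywords = split_keywords("cell cycle; protein kinase; serine/threonine kinase.");
print join("|", @keywords), "\n";
```

A KEYWORDS record containing only a period comes back as an empty
list, which matches the "no keywords" case in the release notes.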

While you're at it, could you submit this as a bug report at
http://bugzilla.bioperl.org?

	-hilmar

On Tuesday, February 11, 2003, at 06:32  AM, Geoff Purdy wrote:

> We've been noticing some odd behavior with the parsing
> of the 'keywords' field from a genbank flatfile in
> Bio::SeqIO::genbank.  I was unable to find any
> discussion of this in the docs or the bioperl-l list
> archives so I was hoping someone on this list could
> shed some light on the subject.
>
> From reading the genbank release notes regarding the
> KEYWORDS field in the genbank file format and reading
> the source code which parses the KEYWORDS field in
> Bio::SeqIO::genbank, it appears that bioperl is
> discarding information in this field.  The
> specification allows for both single keywords and for
> phrases delimited by semicolons.  However, it appears
> that the parser discards the semicolons and treats the
> entire field as a single phrase.  This can cause
> problems searching this field in downstream
> applications.
>
> Is information being discarded, or am I
> misunderstanding something?  Thanks.
>
>
> Excerpt from GenBank Release Notes
> (ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt):
>
> "3.4.8 KEYWORDS Format The KEYWORDS field does not
> appear in unannotated entries, but is required in all
> annotated entries. Keywords are separated by
> semicolons; a "keyword" may be a single word or a
> phrase consisting of several words. Each line in the
> keywords field ends in a semicolon; the last line ends
> with a period. If no keywords are included in the
> entry, the KEYWORDS record contains only a period."
>
>
> Excerpt from BioPerl docs (
> http://doc.bioperl.org/releases/bioperl-1.2/Bio/SeqIO/genbank.html
> ):
>
> #Keywords
> elsif( /^KEYWORDS\s+(.*)/ ) {
>    my $keywords = $1;
>    $keywords =~ s/\;//g;
>    $keywords =~ s/\.$//; # remove possibly trailing dot
>    $params{'-keywords'} = $keywords;
> }
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


