[Bioperl-l] parsing genbank (tissue_type & notes)

Hilmar Lapp lapp@gnf.org
Fri, 27 Apr 2001 10:26:16 -0700


Fernan Aguero wrote:
> 
> I decided to do that myself since I could not find how to do that on the
> bioperl docs. Is this something currently implemented in bioperl? I know
> that bioperl can parse genbank format files, but I'm not sure if it keeps
> all of the info in the file or if it just takes the sequence and some
> 'standard' features and discards the rest ...
> 

The parser aims at keeping everything. At present it might still miss
that aim by a few millimeters, but it pretty much hits it. The
remaining inconvenience for the client is that the information is not
interpreted semantically: qualifiers are kept as plain tag/value
pairs, and deciding what they mean is left to you.

So in your case you have to loop over the features attached to the
sequence, grab the one with primary_tag() eq "source", loop over the
tags of that feature and grab the tags you're interested in. So,
answering your question, no, bioperl does not discard non-standard
features, it doesn't even care what's standard and what's not. And,
yes, the bioperl genbank parser does keep all the information (unless
it fails to parse the location of a feature table entry, which is
something being worked on).
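
The loop described above could look roughly like the sketch below.
Treat it as an illustration under assumptions, not as the definitive
way to do it: the sub names source_tags() and print_source_tags() and
the choice of the tissue_type and note qualifiers are made up for this
example, and the method names get_SeqFeatures(), get_all_tags() and
get_tag_values() are the ones used by recent bioperl releases (older
releases spell them top_SeqFeatures(), all_tags() and
each_tag_value()).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Collect every tag/value pair found on the "source" feature(s) of a
# sequence. $seq is assumed to behave like a Bio::Seq object.
# Returns a hash reference: tag name => array ref of values.
sub source_tags {
    my ($seq) = @_;
    my %tags;
    for my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'source';
        for my $tag ($feat->get_all_tags) {
            push @{ $tags{$tag} }, $feat->get_tag_values($tag);
        }
    }
    return \%tags;
}

# Hypothetical driver: read a GenBank file and print the wanted
# qualifiers of each entry as "id <TAB> tag <TAB> value".
sub print_source_tags {
    my ($file, @wanted) = @_;
    require Bio::SeqIO;    # loaded here so source_tags() works without it
    my $in = Bio::SeqIO->new(-file => $file, -format => 'genbank');
    while (my $seq = $in->next_seq) {
        my $tags = source_tags($seq);
        for my $name (@wanted) {
            my $values = $tags->{$name} or next;    # entry lacks this tag
            print join("\t", $seq->display_id, $name, $_), "\n" for @$values;
        }
    }
}

# Only runs when a file name is given on the command line.
print_source_tags($ARGV[0], qw(tissue_type note)) if @ARGV;
```

Saved under a name of your choice and run with a GenBank file as its
only argument, this prints one line per value. Entries whose source
feature lacks a given tag are silently skipped.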

	Hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp@gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------