[Bioperl-l] Genebank mRNAs

Hilmar Lapp lapp@gnf.org
Thu, 03 May 2001 17:11:09 -0700


"Castle, John" wrote:
> 
> 
> I'm trying to read in files in GenBank format and output the various mRNA
> sequences for each GenBank record.  Can I do this with the BioPerl modules?
> (Side note - the SeqIO scripts choke on the headers contained in the NCBI
> Genbank flat files.)

Really? That would be a bug. Which script is it, and which entries
cause the problem? The GenBank parser has been pretty stable
recently.

> 
> A given GenBank sequence may have several mRNAs, these "features" can be
> alternate splice forms and contain differing amounts of the total sequence;
> this is specified by the "join" lines in the Genbank records.
> 
> When I read in a Genbank file with subfeatures, I think the modules
> currently bless the subfeatures as Bio::SeqFeature::Generic objects.
> Unfortunately, when I ask for the sequence of a mRNA feature, it returns the
> total sequence, not the mRNA sequence.

Which method did you call? To get only the piece of sequence
referenced by the feature location, call $feat->seq(), not
$feat->entire_seq(). That said, I suspect the truncation will have a
problem with split locations, because I think it uses only start()
and end(), which is insufficient for a join. So that makes the second
bug report. :)
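
To illustrate why start() and end() alone fall short for a split
location, here is a minimal sketch in plain Python. It is not BioPerl
code and does not reflect how the modules are implemented; the
function names (parse_join, span_seq, spliced_seq) and the toy
sequence are made up for illustration, and only simple forward-strand
ranges are handled.

```python
import re


def parse_join(location):
    """Parse a GenBank-style location such as 'join(1..3,7..9)' or '2..5'
    into a list of 1-based, inclusive (start, end) pairs.

    Hypothetical helper: handles plain forward-strand ranges only.
    """
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]


def span_seq(seq, segments):
    """Truncate using only the overall start and end of the feature.

    Everything between the first and last segment is kept, so the
    regions between the joined segments are included as well.
    """
    start = min(s for s, _ in segments)
    end = max(e for _, e in segments)
    return seq[start - 1:end]


def spliced_seq(seq, segments):
    """Concatenate each segment of the split location in order."""
    return "".join(seq[s - 1:e] for s, e in segments)


genomic = "AAACCCGGGTTT"                    # toy sequence, not real data
segments = parse_join("join(1..3,7..9)")    # two "exons": 1..3 and 7..9

print(span_seq(genomic, segments))      # AAACCCGGG (includes the gap 4..6)
print(spliced_seq(genomic, segments))   # AAAGGG    (segments only)
```

With a single contiguous range the two functions agree; they differ
only once the location has more than one segment, which is the
situation with the mRNA "join" lines.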

> 
> ****************************************************************************
> This e-mail message is the property of Rosetta Inpharmatics

Interesting. Will I be charged for reading it? Can you still make
unauthorized copies of your message?

;)

	Hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp@gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------