[Bioperl-l] struggling with Bio::FeatureIO and Bio::SeqFeature::Annotated

Allen Day allenday at ucla.edu
Mon Jan 24 18:54:42 EST 2005


Marc,

The problem was that Bio::SeqIO::FTHelper was making calls assuming it had 
a Bio::SeqFeature::Generic instance.  I've updated it to make calls 
compliant with the Bio::SeqFeatureI interface, and the script below now 
at least runs using "option 1".
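
To illustrate the kind of change involved (a rough sketch only, not the actual FTHelper patch): code that walks a feature's tags can restrict itself to the tag methods named in Bio::SeqFeatureI, get_all_tags and get_tag_values, instead of relying on all_tags. My::MockFeature and qualifiers_for below are hypothetical names used purely for illustration, so that the example runs without BioPerl installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for any feature class: it implements only the
# tag methods named in Bio::SeqFeatureI and deliberately has no all_tags().
package My::MockFeature;

sub new {
    my ($class, %tags) = @_;
    return bless { tags => {%tags} }, $class;
}

# Tag names, sorted so the output order is stable.
sub get_all_tags { return sort keys %{ $_[0]->{tags} } }

# All values stored under one tag.
sub get_tag_values {
    my ($self, $tag) = @_;
    return @{ $self->{tags}{$tag} || [] };
}

package main;

# Collect [tag, value] pairs through the interface methods only, so any
# object providing get_all_tags/get_tag_values can be passed in.
sub qualifiers_for {
    my ($feat) = @_;
    my @pairs;
    for my $tag ($feat->get_all_tags) {
        push @pairs, [$tag, $_] for $feat->get_tag_values($tag);
    }
    return @pairs;
}

my $feat = My::MockFeature->new(
    mol_type => ['genomic DNA'],
    note     => ['first note', 'second note'],
);
printf qq{/%s="%s"\n}, @$_ for qualifiers_for($feat);
```

Because qualifiers_for never asks which class it was given, the same loop would work for a Generic or an Annotated feature, provided both implement the interface methods.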

"option 2" will not work, at least for now, because Bio::DB::GenBank
creates a SeqIO that holds Bio::SeqFeature::Generic objects, and these
are difficult to deal with because their internal data structures differ
from those of a Bio::SeqFeature::Annotated.  I like the technique used
below to bridge to Bio::FeatureIO via a Bio::Tools::GFF intermediary,
very clever.

You'll also notice that the GenBank-formatted file output by the script 
doesn't look quite right.  The FEATURES section looks something like:

FEATURES             Location/Qualifiers
     Bio::Annotation::OntologyTerm=HASH(0xa3d93f8)1..20975
                     /source="Bio::Annotation::SimpleValue=HASH(0x9bcdbe0)"
                     /mol_type="Bio::Annotation::SimpleValue=HASH(0xa3dab1c)"
                     /seq_id="Bio::Annotation::SimpleValue=HASH(0xa214de0)"
                     /score="Bio::Annotation::SimpleValue=HASH(0xa3d92cc)"
                     /frame="Bio::Annotation::SimpleValue=HASH(0xa439b04)"
                     /chad="Bio::Annotation::Comment=HASH(0xa3da9b4)"
                     /note="score=Bio::Annotation::SimpleValue=HASH(0xa3d92cc)"
                     /note="frame=Bio::Annotation::SimpleValue=HASH(0xa439b04)"
                     /db_xref="Bio::Annotation::SimpleValue=HASH(0xa3daaf8)"
                     /clone="Bio::Annotation::SimpleValue=HASH(0xa3dab28)"
                     /strain="Bio::Annotation::SimpleValue=HASH(0xa3dabb8)"
                     /phase="Bio::Annotation::SimpleValue=HASH(0xa3d935c)"
                     /chromosome="Bio::Annotation::SimpleValue=HASH(0xa3dac00)"
                     /type="Bio::Annotation::OntologyTerm=HASH(0xa3d93f8)"
                     /organism="Bio::Annotation::SimpleValue=HASH(0xa3dac48)"

because Bio::SeqFeature::Annotated holds annotations as object references
rather than strings.  We can fix this with a stringification overload, but
I noticed that the code to do this exists in the Bio::Annotation::*
classes but is commented out, and I'm not sure why.  Maybe Hilmar can shed
some light on this.
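
For reference, a minimal sketch of what a stringification overload looks like in Perl. My::SimpleValue is a hypothetical stand-in, not the real Bio::Annotation::SimpleValue, and this is not the commented-out code mentioned above; it only shows the mechanism.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a SimpleValue-like annotation class.
package My::SimpleValue;

use overload
    '""'     => sub { $_[0]->{value} },  # called whenever the object is used as a string
    fallback => 1;                       # let eq, ne, "." etc. use the string form

sub new {
    my ($class, %args) = @_;
    return bless { value => $args{-value} }, $class;
}

package main;

my $ann = My::SimpleValue->new(-value => 'genomic DNA');

# Without the overload this would print something like
# /mol_type="My::SimpleValue=HASH(0x...)"
print qq{/mol_type="$ann"\n};   # prints /mol_type="genomic DNA"
```

With an overload like this in place, code that simply interpolates the annotation into a qualifier line would get the value rather than the reference address.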

-Allen



On Mon, 24 Jan 2005, Marc Logghe wrote:

> Hi all,
> I have some problems with Bio::FeatureIO and Bio::SeqFeature::Annotated. But maybe these modules are not designed for the things I had in mind.
> My initial goal seemed pretty straightforward. It turned out differently.
> I have a gff file containing features of a bunch of bioentries sitting in BioSQL.
> I wanted to turn the gff into feature objects, add them to the bioentries, and save them back into the database.
> As a test I fetch a genbank record, strip the features and convert them to gff. The gff is again converted to features and added to the stripped seq object.
> The test script looks like this:
> ========================================================
> #!/usr/bin/perl
> use strict;
> use Bio::SeqIO;
> use Bio::Tools::GFF;
> use Bio::FeatureIO;
> use IO::String;
> use Bio::DB::GenBank;
> 
> use Data::Dumper;
> 
> *Bio::SeqFeature::Annotated::all_tags = \*Bio::SeqFeature::Annotated::get_all_tags;
> 
> my $gff;
> my $gffio = IO::String->new($gff);
> 
> my $db = Bio::DB::GenBank->new;
> my $sout = Bio::SeqIO->new(-fh => \*STDOUT, -format => 'genbank');
> my $seq = $db->get_Seq_by_acc('Z50755');
> 
> my @feat = $seq->remove_SeqFeatures;
> 
> # writing option 1
> my $fout = Bio::Tools::GFF->new(-fh => $gffio, -gff_version => 3);
> # writing option 2 (uncomment to use instead of option 1)
> # my $fout = Bio::FeatureIO->new(-fh => $gffio, -format => 'gff', -version => 3);
> 
> $fout->write_feature(@feat);
> 
> $gffio = IO::String->new($gff);
> 
> my $fin = Bio::FeatureIO->new(-fh => $gffio, -format => 'gff', -version => 3);
> 
> while (my $feat = $fin->next_feature)
> {
>  $seq->add_SeqFeature($feat);
> }
> print Data::Dumper->Dump([$seq],['seq']);
> 
> $sout->write_seq($seq);
> ========================================================
> 
> First, I had an issue when writing the features to gff using Bio::FeatureIO (writing option 2):
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: only Bio::SeqFeature::Annotated objects are writeable
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /home/marcl/src/bioperl/bioperl-live/Bio/Root/Root.pm:328
> STACK: Bio::FeatureIO::gff::write_feature /home/marcl/src/bioperl/bioperl-live/Bio/FeatureIO/gff.pm:259
> STACK: ./test.pl:25
> -----------------------------------------------------------
> 
> Therefore, I used Bio::Tools::GFF to write (writing option 1). But then I ran into trouble when dumping the sequence into genbank format:
> Can't locate object method "all_tags" via package "Bio::SeqFeature::Annotated" at /home/marcl/src/bioperl/bioperl-live/Bio/SeqIO/FTHelper.pm line 212, <GEN1> line 52.
> 
> I tried to fix this by adding the line
> *Bio::SeqFeature::Annotated::all_tags = \*Bio::SeqFeature::Annotated::get_all_tags;
>  
> But in vain:
> Can't locate object method "get_all_tags" via package "Bio::Annotation::Collection" at /home/marcl/src/bioperl/bioperl-live/Bio/SeqFeature/Annotated.pm line 547, <GEN1> line 52.
> 
> Regards,
> Marc
> 
> 