[Bioperl-l] struggling with Bio::FeatureIO and Bio::SeqFeature::Annotated
Allen Day
allenday at ucla.edu
Mon Jan 24 18:54:42 EST 2005
Marc,
The problem was that Bio::SeqIO::FTHelper was making calls assuming it had
a Bio::SeqFeature::Generic instance. I've updated it to make calls
compliant with the Bio::SeqFeatureI interface, and the script below now
at least runs using "option 1".
"option 2" will not work, at least for now, because Bio::DB::GenBank
creates a SeqIO that holds Bio::SeqFeature::Generic objects, and these are
difficult to deal with because their internal data structures differ from
those of Bio::SeqFeature::Annotated. I like the technique used below to
bridge to Bio::FeatureIO via a Bio::Tools::GFF intermediary -- very
clever.
You'll also notice that the GenBank-formatted file output by the script
doesn't look quite right. The FEATURES section looks something like:
FEATURES             Location/Qualifiers
     Bio::Annotation::OntologyTerm=HASH(0xa3d93f8)1..20975
                     /source="Bio::Annotation::SimpleValue=HASH(0x9bcdbe0)"
                     /mol_type="Bio::Annotation::SimpleValue=HASH(0xa3dab1c)"
                     /seq_id="Bio::Annotation::SimpleValue=HASH(0xa214de0)"
                     /score="Bio::Annotation::SimpleValue=HASH(0xa3d92cc)"
                     /frame="Bio::Annotation::SimpleValue=HASH(0xa439b04)"
                     /chad="Bio::Annotation::Comment=HASH(0xa3da9b4)"
                     /note="score=Bio::Annotation::SimpleValue=HASH(0xa3d92cc)"
                     /note="frame=Bio::Annotation::SimpleValue=HASH(0xa439b04)"
                     /db_xref="Bio::Annotation::SimpleValue=HASH(0xa3daaf8)"
                     /clone="Bio::Annotation::SimpleValue=HASH(0xa3dab28)"
                     /strain="Bio::Annotation::SimpleValue=HASH(0xa3dabb8)"
                     /phase="Bio::Annotation::SimpleValue=HASH(0xa3d935c)"
                     /chromosome="Bio::Annotation::SimpleValue=HASH(0xa3dac00)"
                     /type="Bio::Annotation::OntologyTerm=HASH(0xa3d93f8)"
                     /organism="Bio::Annotation::SimpleValue=HASH(0xa3dac48)"
because Bio::SeqFeature::Annotated holds annotations as object references
rather than strings. We can fix this with a stringification overload. I
noticed that code to do this already exists in the Bio::Annotation::*
classes but is commented out, and I'm not sure why. Maybe Hilmar can shed
some light on this.
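For illustration, here is a minimal sketch of what a stringification
overload looks like in plain Perl. The class My::SimpleValue is a
hypothetical stand-in written for this example. It is not the real
Bio::Annotation::SimpleValue, and the commented-out code in the
Bio::Annotation::* classes may well differ from this.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for an annotation class; NOT the real
# Bio::Annotation::SimpleValue implementation.
package My::SimpleValue;

# Overload the "" (stringify) operator so that using the object as a
# string yields its value. fallback => 1 lets Perl derive related
# operations (eq, ne, concatenation) from the string form.
use overload
    '""'     => sub { $_[0]->value },
    fallback => 1;

sub new {
    my ($class, %args) = @_;
    return bless { value => $args{-value} }, $class;
}

sub value { $_[0]->{value} }

package main;

my $source = My::SimpleValue->new(-value => 'EMBL');

# Interpolation now produces the stored value rather than something
# like My::SimpleValue=HASH(0x...).
my $qualifier = qq{/source="$source"};
print "$qualifier\n";
```

With an overload like this in place, code that interpolates the object
into a qualifier line gets the value itself, without the writer having
to know which annotation class it is dealing with.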
-Allen
On Mon, 24 Jan 2005, Marc Logghe wrote:
> Hi all,
> I have some problems with Bio::FeatureIO and Bio::SeqFeature::Annotated. But maybe these modules are not designed for the things I had in mind.
> My initial goal seemed pretty straightforward. It turned out differently.
> I have a gff file containing features of a bunch of bioentries sitting in BioSQL.
> I wanted to turn the gff into feature objects, add them to the bioentries, and save them back into the database.
> As a test I fetch a genbank record, strip the features and convert them to gff. The gff is again converted to features and added to the stripped seq object.
> The test script looks like this:
> ========================================================
> #!/usr/bin/perl
> use strict;
> use Bio::SeqIO;
> use Bio::Tools::GFF;
> use Bio::FeatureIO;
> use IO::String;
> use Bio::DB::GenBank;
>
> use Data::Dumper;
>
> *Bio::SeqFeature::Annotated::all_tags = \*Bio::SeqFeature::Annotated::get_all_tags;
>
> my $gff;
> my $gffio = IO::String->new($gff);
>
> my $db = Bio::DB::GenBank->new;
> my $sout = Bio::SeqIO->new(-fh => \*STDOUT, -format => 'genbank');
> my $seq = $db->get_Seq_by_acc('Z50755');
>
> my @feat = $seq->remove_SeqFeatures;
>
> # writing option 1 (Bio::Tools::GFF)
> my $fout = Bio::Tools::GFF->new(-fh => $gffio, -gff_version => 3);
> # writing option 2 (Bio::FeatureIO); uncomment to use instead of option 1
> # my $fout = Bio::FeatureIO->new(-fh => $gffio, -format => 'gff', -version => 3);
>
> $fout->write_feature(@feat);
>
> $gffio = IO::String->new($gff);
>
> my $fin = Bio::FeatureIO->new(-fh => $gffio, -format => 'gff', -version => 3);
>
> while (my $feat = $fin->next_feature)
> {
>     $seq->add_SeqFeature($feat);
> }
> print Data::Dumper->Dump([$seq],['seq']);
>
> $sout->write_seq($seq);
> ========================================================
>
> First, I had an issue when writing the features to gff using Bio::FeatureIO (writing option 2):
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: only Bio::SeqFeature::Annotated objects are writeable
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /home/marcl/src/bioperl/bioperl-live/Bio/Root/Root.pm:328
> STACK: Bio::FeatureIO::gff::write_feature /home/marcl/src/bioperl/bioperl-live/Bio/FeatureIO/gff.pm:259
> STACK: ./test.pl:25
> -----------------------------------------------------------
>
> Therefore, I used Bio::Tools::GFF to write (writing option 1). But then I ran into trouble when dumping the sequence into genbank format:
> Can't locate object method "all_tags" via package "Bio::SeqFeature::Annotated" at /home/marcl/src/bioperl/bioperl-live/Bio/SeqIO/FTHelper.pm line 212, <GEN1> line 52.
>
> I tried to fix this by adding the line
> *Bio::SeqFeature::Annotated::all_tags = \*Bio::SeqFeature::Annotated::get_all_tags;
>
> But in vain:
> Can't locate object method "get_all_tags" via package "Bio::Annotation::Collection" at /home/marcl/src/bioperl/bioperl-live/Bio/SeqFeature/Annotated.pm line 547, <GEN1> line 52.
>
> Regards,
> Marc
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>