[Bioperl-l] <no subject>
Lincoln Stein
lstein at cshl.edu
Mon Mar 6 16:31:47 UTC 2006
Hi,
Since I wrote the last message I have done some more testing and have
determined that the flybase GFF3 files cannot be stored in Bio::DB::GFF due
to limitations in the Bio::DB::GFF data model. The issue is that Bio::DB::GFF
can only store one level of parentage, and not the two levels needed by
flybase genes.
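To make "two levels of parentage" concrete, here is a minimal sketch that counts the Parent links between a feature and its top-level ancestor. The IDs (gene1, tr1, ex1), the %parent_of map and the depth() helper are made up for illustration; they are not FlyBase data or part of any BioPerl API.

```perl
use strict;
use warnings;

# Hypothetical ID => Parent map, as it might be read from GFF3 column 9.
my %parent_of = (
    gene1 => undef,      # top-level gene, no Parent
    tr1   => 'gene1',    # mRNA, one level below the gene
    ex1   => 'tr1',      # exon, two levels below the gene
);

# Count the Parent links between a feature and its top-level ancestor.
sub depth {
    my ($id) = @_;
    my $levels = 0;
    while (defined(my $parent = $parent_of{$id})) {
        $levels++;
        $id = $parent;
    }
    return $levels;
}

printf "%s: %d\n", $_, depth($_) for sort keys %parent_of;
```

The exon sits two links away from the gene, and it is that second link which the one-level model cannot represent.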
Here is a quick fix to preprocess the gff3 files so that they can be used by
Bio::DB::GFF:
  while (<>) {
    my @fields = split "\t";
    next unless $fields[2] eq 'mRNA';
    s/Parent=([^;]+)/Gene=$1/;
  } continue {
    print;
  }
This turns the "Parent" field of mRNA lines into a "Gene" attribute. You can
then find all transcripts corresponding to a particular gene in much the way
you tried earlier:
  my $tcs = $tg->features(-types      => 'processed_transcript',
                          -attributes => {Gene => $gene},
                          -iterator   => 1);
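As a rough illustration of what the preprocessing filter does to individual records, the sketch below applies the same substitution to three sample lines. The parent_to_gene() helper and the records themselves (IDs, coordinates) are hypothetical and serve only to show the effect. The extra defined() check is an addition so that the sketch runs cleanly under "use warnings".

```perl
use strict;
use warnings;

# Apply the same rewrite as the filter to one GFF3 line:
# on mRNA lines only, rename the Parent attribute to Gene.
sub parent_to_gene {
    my ($line) = @_;
    my @fields = split /\t/, $line;
    # Comment and directive lines have fewer than three columns.
    return $line unless defined $fields[2] && $fields[2] eq 'mRNA';
    $line =~ s/Parent=([^;]+)/Gene=$1/;
    return $line;
}

# Hypothetical records (made-up IDs and coordinates), not real FlyBase data.
my @sample = (
    join("\t", '4', 'FlyBase', 'gene', 100, 900, '.', '+', '.', 'ID=gene1'),
    join("\t", '4', 'FlyBase', 'mRNA', 100, 900, '.', '+', '.', 'ID=tr1;Parent=gene1'),
    join("\t", '4', 'FlyBase', 'exon', 100, 300, '.', '+', '.', 'ID=ex1;Parent=tr1'),
);

print parent_to_gene($_), "\n" for @sample;
```

Only the mRNA record changes: its last column becomes ID=tr1;Gene=gene1, while the exon keeps Parent=tr1. That leaves a single level of Parent links in the file.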
I am going back to work on Bio::DB::GFF3, which will fix this problem.
Lincoln
On Monday 06 March 2006 00:02, Marco Blanchette wrote:
> Dear all--
>
> I am trying to forge my first bioperl weapons with the
> Bio::DB::GFF and Bio::Graphics modules. My goal is to display genes with
> their underlying mRNAs and later on add additional useful info (i.e. binding
> sites for our preferred proteins).
>
> I loaded the GadFly gff3 annotation in a mysql database using
> bulk_load_gff.pl and I am trying to pass a Bio::SeqFeatureI to the
> Bio::Graphics::add_feature method.
>
> My understanding is that:
> my $tcs = $tg->features(-types =>'processed_transcript',
> -attributes => {Parent => $gene},
> -iterator => 1);
>
> Produces a Bio::SeqIO object that can be iterated through with the next_seq
> method to get a Bio::Seq object, from which a Bio::SeqFeatureI could be
> extracted by using the get_SeqFeatures method.
>
> Somehow, my script does not produce the expected results. Could somebody
> put me back on the right track?
>
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> use Bio::DB::GFF;
> use Bio::Graphics;
>
> my $dmdb = Bio::DB::GFF->new( -adaptor => 'dbi::mysql',
>                               -dsn     => "chr4",
>                             );
>
>
> my @genes = ('CG2041'); ##a gene on the fourth chromosome
>
> foreach my $gene (@genes){
>
>     my $geneseg = $dmdb->segment(-name => $gene, -merge);
>
>     if ($geneseg){
>
>         my @tgs = $geneseg->features(-types => 'gene');
>
>         for my $tg (@tgs){
>
>             my $length = $tg->length();
>
>             my $panel = Bio::Graphics::Panel->new(-length => $length,
>                                                   -width  => 800);
>
>             my $track = $panel->add_track( -glyph => 'generic',
>                                            -label => 1);
>
>             my $tcs = $tg->features(-types      => 'processed_transcript',
>                                     -attributes => {Parent => $gene},
>                                     -iterator   => 1);
>
>             while ( my $tc = $tcs->next_seq ){
>                 $track->add_feature($tc->get_SeqFeatures);
>             }
>
>             print $panel->png;
>         }
>     }
> }
>
> Many thanks
>
>
> Marco Blanchette, Ph.D.
>
> mblanche at berkeley.edu
>
> Donald C. Rio's lab
> Department of Molecular and Cell Biology
> 16 Barker Hall
> University of California
> Berkeley, CA 94720-3204
>
> Tel: (510) 642-1084
> Cell: (510) 847-0996
> Fax: (510) 642-6062
--
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu