[Bioperl-l] Tag handling on SeqFeature::Generic

Hilmar Lapp hlapp@gnf.org
Wed, 28 Aug 2002 14:32:22 -0700


It depends on how many objects you're going to instantiate. If it's a thousand, don't worry. If it's 10 million and you can avoid it by some tricks, it's probably worth doing so. Between the extremes, the question is how many seconds or minutes you want to squeeze out of the total running time.

I'd only start worrying about the penalty of creating too many objects once performance actually becomes an issue. Scripts often have other bottlenecks with a higher impact, but in the end it really depends.
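
If in doubt, measure it. Below is a minimal sketch using the core Benchmark module. Note that Feature is a hypothetical hash-based stand-in, not Bio::SeqFeature::Generic (whose constructor does more work), and the count of 100,000 is arbitrary, so treat the numbers only as a rough indication and substitute the real class and your real feature count.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# Hypothetical stand-in for a feature class; NOT Bio::SeqFeature::Generic.
{
    package Feature;

    sub new {
        my ($class, %args) = @_;
        my $self = { start => $args{start}, end => $args{end}, tags => {} };
        return bless $self, $class;
    }

    # Appends a value to the tag's list (repeated calls accumulate values).
    sub add_tag_value {
        my ($self, $tag, $value) = @_;
        push @{ $self->{tags}{$tag} }, $value;
        return;
    }

    # Drops the tag and all of its values.
    sub remove_tag {
        my ($self, $tag) = @_;
        return delete $self->{tags}{$tag};
    }
}

my $n = 100_000;    # assumed number of features; adjust to your data

# Strategy 1: create a fresh object for every feature.
my $fresh = timeit(1, sub {
    for my $i (1 .. $n) {
        my $f = Feature->new(start => $i, end => $i + 10);
        $f->add_tag_value(Target => "Sequence:seq$i");
    }
});

# Strategy 2: reuse one object, adding and removing the tag each time.
my $reuse = timeit(1, sub {
    my $f = Feature->new(start => 1, end => 11);
    for my $i (1 .. $n) {
        $f->add_tag_value(Target => "Sequence:seq$i");
        $f->remove_tag('Target');
    }
});

print "new object each time: ", timestr($fresh), "\n";
print "reused object:        ", timestr($reuse), "\n";
```

If the difference comes out as a fraction of a second for your number of features, creating a new object each time is the simpler choice.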

	-hilmar

> -----Original Message-----
> From: Marco Aurelio Valtas Cunha [mailto:mavcunha@gordon.fmrp.usp.br]
> Sent: Wednesday, August 28, 2002 12:13 PM
> To: Lincoln Stein
> Cc: Bioperl
> Subject: Re: [Bioperl-l] Tag handling on SeqFeature::Generic
> 
> 
> Hi Lincoln,
> 
> Yes, I'm doing this now, and I think I gained a couple of seconds of
> runtime. I have the script waiting for moderator approval; if it gets
> through, you can see what I'm doing now.
> 
> My confusion about how to use SeqFeature comes from an idea I have
> that instantiating an object is always a heavy task, so I try to
> avoid it. I really don't know whether this is a real concern.
> 
> Marco.
> 
> 
> 
> Lincoln Stein wrote:
> > Hi Marco,
> > 
> > Probably better to create one Bio::SeqFeature::Generic each time, and
> > then to call its gff_string() method:
> > 
> > use Bio::SeqFeature::Generic;
> >
> > my $f = Bio::SeqFeature::Generic->new( -seqname => 'Chr10',
> >                                        -start   => 10,
> >                                        -end     => 100,
> >                                        -score   => 1000,
> >                                        -tag     => { Target => 'Sequence:fred 1 1000' } );
> > print $f->gff_string;
> > 
> > Lincoln
> > 
> > On Wednesday 28 August 2002 10:56 am, Marco Aurelio Valtas Cunha wrote:
> > 
> >>Jason,
> >>
> >>Reusing the SeqFeature was the mistake; now I'll create a new
> >>SeqFeature each time.
> >>
> >>Thanks
> >>Marco.
> >>
> >>Jason Stajich wrote:
> >>
> >>>On Wed, 28 Aug 2002, Marco Aurelio Valtas Cunha wrote:
> >>>
> >>>>Hi Jason,
> >>>>
> >>>>I don't intend to violate any API. The reason for this approach is to
> >>>>generate a GFF file for Lincoln's gbrowse (www.gmod.org) mapping tool.
> >>>>Gbrowse needs something like this (GFF):
> >>>>
> >>>>Contig1 blastn similarity 10 40 . +  . Target "Sequence:Contig3" 1 500
> >>>>
> >>>>Where Target "Sequence:$mysequence" is the tag that changes for each
> >>>>mapped sequence in my case. What happened is that after one
> >>>>add_tag_value call, when I went to the next sequence and called
> >>>>add_tag_value again, it created two tags, and so on.
> >>>
> >>>But aren't you creating a new SeqFeature::Generic each time?  I don't
> >>>understand why you'd be reusing the same one over and over.
> >>>
> >>>
> >>>>I feel that using SeqFeature::Generic is wrong. But I'm really
> >>>>confused by Tools::GFF and SeqFeature::Generic; in both modules you can
> >>>
> >>>To clarify:
> >>>
> >>>Bio::Tools::GFF is probably badly named -- it is just an Input/Output
> >>>mechanism for Bio::SeqFeatureI objects and GFF format. You can think of
> >>>it as analogous to Bio::SeqIO, which reads/writes sequence formats:
> >>>Bio::Tools::GFF reads/writes feature formats (which don't have any
> >>>sequence).  We could theoretically write a Bio::SeqIO::gff which created
> >>>a sequence which is empty but had a set of features attached to it.
> >>>
> >>>
> >>>Bio::SeqFeature::Generic is the holder for the sequence feature data.
> >>>
> >>>
> >>>>manipulate a GFF and I couldn't choose between them. This is sometimes
> >>>>frustrating: I know I can do it using Bioperl, but I can't figure out
> >>>>how to do it.
> >>>>
> >>>>
> >>>>Marco.
> >>>>
> >>>>Jason Stajich wrote:
> >>>>
> >>>>>It is a little more flexible to do:
> >>>>>
> >>>>>use Bio::Tools::GFF;
> >>>>>my $io = new Bio::Tools::GFF(-file => '>filename');
> >>>>>     # defaults to STDOUT if no -file or -fh is provided
> >>>>>$io->write_feature($feat);
> >>>>>
> >>>>>
> >>>>>No way to change a tag though without violating the API (going under
> >>>>>the hood -- which you CAN do, we just don't recommend it) - are you sure
> >>>>>that is what you want to do?  Seems strange to be updating a tag's value
> >>>>>constantly but you must have a specific reason?
> >>>>>
> >>>>>-jason
> >>>>>
> >>>>>On Tue, 27 Aug 2002, Marco Aurelio Valtas Cunha wrote:
> >>>>>
> >>>>>>Hi Bioperl,
> >>>>>>
> >>>>>>I don't think this is the best way to do it, but...
> >>>>>>I'm using SeqFeature::Generic to create GFF2 output, something like this:
> >>>>>>
> >>>>>>#!/usr/bin/perl
> >>>>>>
> >>>>>>use Bio::SeqFeature::Generic;
> >>>>>>
> >>>>>>my $gff_io = Bio::SeqFeature::Generic->new();
> >>>>>>
> >>>>>>--cut--
> >>>>>>
> >>>>>># Some loop ...
> >>>>>>
> >>>>>>$gff_io->add_tag_value("Target","Sequence:$query");
> >>>>>>
> >>>>>>      print $gff_io->gff_string();
> >>>>>>
> >>>>>>$gff_io->remove_tag("Target");
> >>>>>>
> >>>>>>#end of the loop;
> >>>>>>
> >>>>>>
> >>>>>>The issue is that I always have to call add_tag_value() and then
> >>>>>>remove_tag(), because AFAIK there's no way to change the tag value once
> >>>>>>it is created. Am I right? Or is there a better way to do this?
> >>>>>>
> >>>>>>Thank you,
> >>>>>>Marco Valtas.
> >>>>>
> > 
> > 
> 
> 
> -- 
> ##############################################################
> # Attention my email changed to mavcunha@bit.fmrp.usp.br     #
> # See why here http://scarecrow.fmrp.usp.br/~mavcunha/public #
> ##############################################################
> Marco Aurélio Valtas Cunha
> Laboratório de Bioinformática
> Hemocentro de Ribeirão Preto
> Faculdade de Medicina de Ribeirão Preto
> Universidade de São Paulo
> Tel 55 16 3963-9300 R: 9603
> http://bit.fmrp.usp.br
> http://scarecrow.fmrp.usp.br/~mavcunha/public/
> email: mavcunha@bit.fmrp.usp.br