[Bioperl-l] 'virtual' seqs
Lincoln Stein
lstein@cshl.org
Mon, 8 Jul 2002 13:57:36 -0400
OK, we register a vote to do it both ways, with the default to return undef.
I'm usually pro-choice, but I imagine that having the option leaves us open to
problems arising when people try to link together two libraries in which the
respective authors made different choices.
Lincoln
On Monday 01 July 2002 03:40 pm, Charles Tilford wrote:
> I'll weigh in on this, since this is an issue that comes up with BSML
> documents (which can happily have annotations with no sequence).
>
> For the issue of returning undef vs. a string of characters, why not
> parameterize the behavior during creation of the object? That is:
>
> my $vs = Bio::Seq::VirtualSequence->new( -len => 5000, -pad => "X");
>
> ...would generate a seq string corresponding to ("X" x 5000). If not
> defined, then seq() would be left as undef. This has the advantage of
> allowing the user to specify another character (such as "N", or even
> "." or "-") as the placeholder character. The disadvantage arises when
> the user sets something REALLY odd, like "7" or "fruit fly", and that
> causes complaints later. At some point I guess you have to trust the
> user to be moderately kind to the API.
>
> I have no feelings about the name of such an object...
>
> -Charles
>
> Lincoln Stein wrote:
> > I think it's important that we be able to perform manipulations on
> > feature tables and annotations even when the underlying sequence is
> > completely unavailable (not even a guarantee that you can fetch the
> > sequence if you wait long enough). Laziness is a great feature, but it's
> > more of an implementation issue than something that should be exposed to
> > the API.
> >
> > As Ewan suggests, it's probably better to return a string of N's rather
> > than an undef sequence; otherwise lots of programs will break. However, I
> > think that EmptySequence has the wrong connotation. I prefer
> > VirtualSequence, or possibly UnknownSequence.
> >
> > Lincoln
> >
> > On Wednesday 26 June 2002 06:59 pm, Hilmar Lapp wrote:
> > > I like LazySeq best -- it means the absence of the sequence is not
> > > written in stone, fetching is just expensive and can take a while.
> > >
> > > Also, one should be able to write these sequences to transport the
> > > annotation and feature table (without potentially expensive sequence
> > > transport, too). In this case a parser's write_seq() method asking the
> > > object for the sequence should get an empty string instead of
> > > triggering the actual sequence fetch. At least as an option.
> > >
> > > I'm wondering how this should be implemented ... not sure what's the
> > > right thing to do.
> > >
> > > -hilmar
> > >
> > > > -----Original Message-----
> > > > From: Jason Stajich [mailto:jason@cgt.mc.duke.edu]
> > > > Sent: Thursday, June 20, 2002 11:57 AM
> > > > To: Bioperl
> > > > Subject: [Bioperl-l] 'virtual' seqs
> > > >
> > > >
> > > > > We are processing datafiles - bsml, game, (agave?) documents - where
> > > > > it is possible to just know the length of the sequence but not have
> > > > > any actual sequence data associated. I think we should have sequence
> > > > > objects which can handle this - they would have a length, but seq()
> > > > > would warn and return undef. We need one that would implement the
> > > > > Bio::Seq::RichSeqI interface - call it VirtualRichSeq? Perhaps we'll
> > > > > need the equivalent PrimaryVirtualSeq and VirtualSeq?
> > > > >
> > > > > Can someone think of a better name? I don't want to confuse these
> > > > > with Ensembl VirtualXX objects. This would be implemented in the
> > > > > Bio::Seq:: namespace.
> > > >
> > > > -jason
> > > >
> > > > --
> > > > Jason Stajich
> > > > Duke University
> > > > jason at cgt.mc.duke.edu
> > > >
> > > > _______________________________________________
> > > > Bioperl-l mailing list
> > > > Bioperl-l@bioperl.org
> > > > http://bioperl.org/mailman/listinfo/bioperl-l
> > >
> >
> > --
> > ========================================================================
> > Lincoln D. Stein Cold Spring Harbor Laboratory
> > lstein@cshl.org Cold Spring Harbor, NY
> > ========================================================================
--
========================================================================
Lincoln D. Stein Cold Spring Harbor Laboratory
lstein@cshl.org Cold Spring Harbor, NY
========================================================================
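
To make the option discussed in this thread concrete, here is a minimal sketch of the parameterized behavior Charles describes, with undef as the default. It assumes a hypothetical class named Sketch::VirtualSeq; this is not a BioPerl module, and the length() accessor is an assumption added for illustration. Only the -len and -pad arguments come from the example in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed class; not an actual BioPerl module.
package Sketch::VirtualSeq;

sub new {
    my ($class, %args) = @_;
    my $self = bless {
        len => $args{-len},
        pad => $args{-pad},    # undef means no placeholder was requested
    }, $class;
    return $self;
}

# The length is known even though no sequence data is available.
sub length {
    my ($self) = @_;
    return $self->{len};
}

# Default: undef. With -pad: the placeholder repeated -len times.
sub seq {
    my ($self) = @_;
    return undef unless defined $self->{pad};
    return $self->{pad} x $self->{len};
}

package main;

my $bare   = Sketch::VirtualSeq->new(-len => 5000);
my $padded = Sketch::VirtualSeq->new(-len => 5000, -pad => 'N');

print defined($bare->seq) ? "defined\n" : "undef\n";    # prints "undef"
print length($padded->seq), "\n";                       # prints 5000
```

The sketch does nothing to validate -pad, so a multi-character value such as "fruit fly" would produce a string longer than the stated length. That is the kind of later complaint Charles anticipates, and mixing libraries that chose different defaults is the risk raised in the reply at the top of this message.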