[Bioperl-l] remote blast update

Chervitz, Steve Steve_Chervitz@affymetrix.com
Thu, 12 Apr 2001 16:10:59 -0700

> Hilmar Lapp wrote: 
> "Chervitz, Steve" wrote:
> > 
> > The BlastIO stream system would also have to differentiate on the type of
> > Blast object you want to build (lite or heavy or whatever). So perhaps you'd
> > have BlastIO/ncbi and BlastIO/ncbi_lite.
>
> Hi Steve, great to have you back in the discussion. As for heavy
> vs light objects, I'm not sure you really need to have multiple
> classes here. The things I see that make the difference between
> heavy and light are
> -  heavy = extended (as opposed to basics everyone needs)
> functionality in terms of what is parsed and turned into
> retrievable information. This can be heavy in terms of both
> performance and code complexity. I think one can solve it by
> making the extended parsing switchable, and separating out the
> corresponding code into their own methods. An example for what I
> mean is the HSP tiling (which I find especially useful, in fact
> essential if you want to retrieve %coverage.)
> -  heavy because of non-parsing functionality needed only by
> special client applications. This can be separated out into a
> derived class, or a consumer class (I'd prefer the latter). An
> example for what I mean is HTML table generation; as for the
> consumer class, you can have a class taking a Blast object and
> producing HTML.
> Does this make sense?

Yes. I think a parser switch could work. However, instead of performing
extra parsing, the switch could be used to determine what type of objects
are returned. Taking your HSP tiling example, this can be done lazily (i.e.,
post-parsing on an as-needed basis, as Bio::Tools::Blast::Sbjct.pm does). So
if the parser knew you were interested in this functionality, it would turn
out objects capable of providing it. If you didn't want it, you'd get
lighter-weight objects.
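
To make that concrete, here is a minimal sketch of the switch plus lazy
tiling. It is written in Python only to keep it short, and every name in it
(LiteHit, TilingHit, parse_hits, want_tiling) is made up for illustration;
none of this is existing BioPerl API, and the "records" stand in for whatever
the real parser would extract from a report.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LiteHit:
    """Basic hit: a name plus raw HSP query ranges as (start, end), inclusive."""
    name: str
    hsps: List[Tuple[int, int]]


@dataclass
class TilingHit(LiteHit):
    """Hit that can tile its HSPs, but only does the work on first request."""
    _tiled: Optional[List[Tuple[int, int]]] = field(default=None, repr=False)

    def tiled_ranges(self):
        # Lazy: merge overlapping or adjacent HSP ranges once, then cache.
        if self._tiled is None:
            merged = []
            for start, end in sorted(self.hsps):
                if merged and start <= merged[-1][1] + 1:
                    merged[-1] = (merged[-1][0], max(merged[-1][1], end))
                else:
                    merged.append((start, end))
            self._tiled = merged
        return self._tiled

    def percent_coverage(self, query_length):
        covered = sum(end - start + 1 for start, end in self.tiled_ranges())
        return 100.0 * covered / query_length


def parse_hits(records, want_tiling=False):
    """Hypothetical parser entry point: the switch picks the object type
    that is returned; it does not trigger any extra parsing."""
    hit_class = TilingHit if want_tiling else LiteHit
    return [hit_class(name, list(hsps)) for name, hsps in records]
```

With want_tiling left off you get plain LiteHit objects that have no tiling
methods at all; with it on, the tiling cost is still only paid by callers who
actually ask for tiled_ranges() or percent_coverage().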

I like the idea of consumer classes. This would keep the result objects
simple and make it easy to plug in different consumers depending on your

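As a rough illustration of the consumer-class idea, using Hilmar's HTML
example: the sketch below is Python for brevity, HtmlTableWriter is a
hypothetical name, and the result is modeled as a plain dictionary with
"hits", "name" and "score" keys purely as an assumption for the example.

```python
from html import escape


class HtmlTableWriter:
    """Hypothetical consumer: reads a result object and emits an HTML table.
    The result itself knows nothing about HTML."""

    def render(self, result):
        rows = ["<tr><th>Hit</th><th>Score</th></tr>"]
        for hit in result["hits"]:
            # Escape hit names so characters like '<' or '|' in IDs are safe.
            rows.append(
                "<tr><td>%s</td><td>%s</td></tr>"
                % (escape(hit["name"]), hit["score"])
            )
        return "<table>\n" + "\n".join(rows) + "\n</table>"
```

A different consumer (tab-delimited text, XML, and so on) would take the same
result object and leave it untouched.
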
> > 
> > Having a set of standard interfaces such as BlastReportI, BlastHitI,
> > BlastHSPI that define core functionalities for these objects is also a great
> > goal. It would permit more code-reuse, allow users to easily play with
> > different implementations, and make life easier in the CORBA world.
> > different implementations, and make life easier in the CORBA world.
> > 
> I don't think these interfaces need to be specific to Blast, but I
> may be missing something. Aaron I think made a shot in this
> direction. Have you checked the (incomplete) stuff in
> Bio::Search::* ?

There are scoring and output parameter differences that aren't shared among
all search analyses. But this could be handled generically as well (e.g.,
$hit->get_score('type') or $result->get_parameters()).
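
Roughly what I have in mind, sketched in Python for brevity. The class and
method names just mirror the made-up calls above; they are assumptions for
illustration, not an existing interface, and the score and parameter names
in any given result would depend on the search program.

```python
class Hit:
    """Hypothetical generic hit: scores are looked up by type name, so the
    interface does not hard-code which scores a given program reports."""

    def __init__(self, name, scores):
        self.name = name
        self._scores = dict(scores)

    def get_score(self, score_type):
        # Returns None when this kind of search does not report that score.
        return self._scores.get(score_type)

    def score_types(self):
        return sorted(self._scores)


class Result:
    """Hypothetical generic result: program-specific settings are exposed
    as one parameter mapping instead of one accessor per setting."""

    def __init__(self, program, parameters, hits):
        self.program = program
        self._parameters = dict(parameters)
        self.hits = list(hits)

    def get_parameters(self):
        # Return a copy so callers cannot modify the result's own state.
        return dict(self._parameters)
```

Client code can then ask a hit which score types it has rather than assuming
every search reports, say, both bits and an expect value.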

I've seen Aaron's interfaces and think they're a good start. It would be
great if your Similarity feature objects, for example, could be constructed
based on such interfaces, and not be tied to any particular type of
similarity search result object, as they are now.


> As for the BioCorba-bridged Blast-parser discussion, I think it is
> absolutely essential to have a Perl-based Blast-parser in BioPerl.
> Code-reuse between languages via a Corba bridge is a nice idea,
> but IMHO it becomes most useful not by having your text files
> parsed by other languages, but by language-independent clean and
> consistent interfacing to analysis and database engines (submit an
> object, get back the result as an object). Your mileage may vary.
> 	Hilmar
> -- 
> -----------------------------------------------------------------
> Hilmar Lapp                              email: hilmarl@yahoo.com
> GNF, San Diego, Ca. 92122                phone: +1 858 812 1757
> -----------------------------------------------------------------