[Bioperl-l] post-processing of seqs

Hilmar Lapp hlapp@gnf.org
Thu, 24 Oct 2002 11:43:51 -0700


I implemented this, together with a base implementation in Bio::Seq::BaseSeqProcessor. The base class leaves a single method, process_seq($seq), which returns an array of any number of seqs, to be overridden by the actual processing algorithms.
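
To illustrate the idea, here is a minimal sketch in plain Perl. It is not the committed code: the package names (My::ArrayStream, My::BaseSeqProcessor, My::SplitOnN, My::Upcase) are hypothetical, plain strings stand in for seq objects, and the buffering in next_seq() is just one assumed way to cope with process_seq() returning zero or several seqs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SeqIO stream: hands out items from a list.
package My::ArrayStream;
sub new {
    my ($class, @seqs) = @_;
    return bless { queue => [@seqs] }, $class;
}
sub next_seq {
    my ($self) = @_;
    return shift @{ $self->{queue} };    # undef once the list is exhausted
}

# Hypothetical base processor: wraps a source stream and is itself a stream.
package My::BaseSeqProcessor;
sub new {
    my ($class, $source) = @_;
    return bless { source => $source, buffer => [] }, $class;
}
# gets (and possibly sets) the source stream
sub source_stream {
    my ($self, $stream) = @_;
    $self->{source} = $stream if defined $stream;
    return $self->{source};
}
# Default is a pass-through; subclasses override this and may
# return any number of seqs, including none.
sub process_seq {
    my ($self, $seq) = @_;
    return ($seq);
}
sub next_seq {
    my ($self) = @_;
    # Pull from the source until a processed seq is available.
    while (!@{ $self->{buffer} }) {
        my $seq = $self->source_stream->next_seq;
        return unless defined $seq;      # source exhausted
        push @{ $self->{buffer} }, $self->process_seq($seq);
    }
    return shift @{ $self->{buffer} };
}

# Example processor: splits a seq at runs of 'N', dropping empty pieces.
package My::SplitOnN;
our @ISA = ('My::BaseSeqProcessor');
sub process_seq {
    my ($self, $seq) = @_;
    return grep { length } split /N+/, $seq;
}

# Example processor: upper-cases each seq.
package My::Upcase;
our @ISA = ('My::BaseSeqProcessor');
sub process_seq {
    my ($self, $seq) = @_;
    return (uc $seq);
}

package main;

# Chain: source -> SplitOnN -> Upcase
my $source = My::ArrayStream->new('acgtNNNNttgaNcc', 'NNNN', 'ggcc');
my $chain  = My::Upcase->new(My::SplitOnN->new($source));

my @out;
while (defined(my $seq = $chain->next_seq)) {
    push @out, $seq;
}
print join(',', @out), "\n";    # prints ACGT,TTGA,CC,GGCC
```

Three seqs go in and four come out, and the consumer only ever calls next_seq() on the last element of the chain, so it cannot tell a processor from a plain stream.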

We'll write a couple of such processing algorithms for our local use case, but they are going to be highly specific and not generally useful. Given this and the lack of feedback, I realize that the whole thing may be too narrow a use case. I'm happy to take it back out if the core people think so, and to keep it only in our own library of specialized module implementations. One problem may be that the bioperl part of it is much more an interface than something that actually works and that people can take as is and plug into something.

I'll commit it first so you can check it out if you like, and I can remove it again later if that's the consensus.

	-hilmar


> -----Original Message-----
> From: Hilmar Lapp 
> Sent: Wednesday, October 23, 2002 11:42 AM
> To: Bioperl
> Subject: [Bioperl-l] post-processing of seqs
> 
> 
> I have a use case here in which we need to subject seq objects 
> coming off a SeqIO stream to some sort of post-processing, which 
> essentially results in another stream of seq objects, where the seq 
> objects have been altered or re-created, and the number of seqs may 
> or may not be the same as in the original stream.
> 
> I know I can easily hard-code this into a script. What I want is 
> for this to integrate seamlessly into pure SeqIO streams, with the 
> post-processing 'algorithm' configurable through the command line 
> (yes, I'm talking about load_seqdatabase.pl).
> 
> My proposed solution would be something that resembles the Biojava 
> SeqIO event generator chain:
> 
> 	Bio::Factory::SeqProcessorI is-a Bio::Factory::SequenceStreamI
> 
> 	# gets (and possibly sets) the source stream
> 	# returns Bio::Factory::SequenceStreamI compliant object
> 	sub source_stream() {}
> 
> 	# since it is-a SequenceStreamI, having to have next_seq() is implicit
> 	
> This way I could put entire processing 'algorithms' into modules 
> implementing Bio::Factory::SeqProcessorI, and chain them arbitrarily. 
> The chain would be easy to configure by just enumerating the modules 
> that I want to apply.
> 
> Elia/Shawn, does biopipe do something similar already?
> 
> BTW the namespace Bio::Factory for this and SequenceStreamI is not 
> the luckiest choice I think, but that's a separate story and can be 
> solved later if others don't like it either.
> 
> 	-hilmar
> --
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------