[Bioperl-l] TFBS databases, Bio::Matrix::PSM suitable?
Sendu Bala
bix at sendu.me.uk
Tue Aug 22 13:14:34 UTC 2006
Chris Fields wrote:
> Many Bio::DB* modules access the database to get the raw data, and this
> is attached to a Bio::*IO stream class in some way (for most cases).
> There are a few that get around this; for instance, Bio::DB::Taxonomy*
> uses no specialized SeqIO-like class.
Yes. Bio::DB::Taxonomy is what I'm familiar with, so I was thinking of
doing it the same way, especially given how many completely different
kinds of information you would want to get out of a TFBS database. I'll
look into how it is 'normally' done if anyone suggests that would be
better.
> Like you mentioned, you could extend Bio::Matrix::PSM::IO::transfac
> specifically to encompass the 'instance' sequences (the other PSM::IO
> modules wouldn't have the same methods available to them), use
> SimpleAlign or SeqFeature::SimilarityPair (I agree the former is
> probably better).
SimpleAlign is better because we're talking about a multiple alignment,
almost always of more than two sequences, so SimilarityPair would not
be appropriate...
> Or have the Bio::DB module set up to grab either your
> 'instance' sequences by ID (where you could possibly implement
> RandomAccessI)
... though, having said that, you'd still want access to the individual
sequences by ID.
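
To illustrate the two requirements together (a multiple alignment of
the instance sequences, plus lookup of any one of them by ID), here is
a rough sketch. It is only an illustration under stated assumptions: it
is written in Python rather than Perl so that it stays self-contained,
and SiteAlignment, the IDs and the sequences are all made up. None of
this is BioPerl or TFBS API.

```python
from dataclasses import dataclass, field


@dataclass
class SiteAlignment:
    """Hypothetical container for the aligned instance sequences of one matrix."""

    # Maps sequence ID -> aligned sequence string (insertion order is kept).
    rows: dict = field(default_factory=dict)

    @property
    def length(self):
        # Alignment length is the length of any row; 0 while empty.
        return len(next(iter(self.rows.values()))) if self.rows else 0

    def add(self, seq_id, aligned):
        # Every row of a multiple alignment must have the same length.
        if self.rows and len(aligned) != self.length:
            raise ValueError("row length differs from the alignment length")
        if seq_id in self.rows:
            raise ValueError("duplicate sequence ID: " + seq_id)
        self.rows[seq_id] = aligned

    def by_id(self, seq_id):
        # Random access to a single instance sequence.
        return self.rows[seq_id]

    def column(self, index):
        # One alignment column across all rows; this is the view a
        # pairwise structure cannot give you for three or more sequences.
        return [row[index] for row in self.rows.values()]


# Made-up example data: three instance sequences for one matrix.
aln = SiteAlignment()
aln.add("site1", "TGACTCA")
aln.add("site2", "TGAGTCA")
aln.add("site3", "TGACTAA")
```

With three or more rows a pairwise object has nowhere to put the third
sequence, whereas the alignment view (column) and the per-sequence view
(by_id) can sit side by side on the same data.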
> Does the TFBS package have any overlap here? I haven't used them (they
> require PDL which is a pain to install on WinXP) but they are supposed
> to be fully integrated with Bioperl.
TFBS::DB::Local_TRANSFAC parses only the pure matrix information; even
Bio::Matrix::PSM::IO::transfac parses out more of the information and
makes it available in a useful way.
Transfac is far more complicated, interesting and useful than just the
matrix.dat file though.