[Bioperl-l] TFBS databases, Bio::Matrix::PSM suitable?

Chris Fields cjfields at uiuc.edu
Tue Aug 22 21:34:26 UTC 2006


...

> Well, we are talking not just about the format but also about the data,
> aren't we? None of the examples you mentioned are comparable to the
> transfac case as these are not databases. This is my personal opinion on
> this matter- I think that the data provider should be at least partially
> involved in the process.

The SeqIO examples are comparable from the standpoint of being proprietary.
Technically, having data publicly available in a specific format (lasergene
or strider) and having something capable of converting to and from that
format (SeqIO) could constitute a violation of whatever licensing agreement
there is between the software vendor and the licensee.  You might be
surprised what constitutes such a violation in the States, all depending on
the 'Terms of Agreement.'  The previous two examples may not have those
restrictions.  However, I must note that you still cannot convert to
Lasergene format, only from it to something else (it doesn't implement
write_seq()).

They are strictly used as examples, though.  So don't take them too
literally.  

I do agree, however, that we must abide by their rules.  And it is proper to
ask for their involvement, or at least their permission. 

...

> 
> There was a push to remove some of the code that is not maintained a year
> or so ago... And that is what I am saying too- I am just concerned about
> how many people with access to the data would be contributing....

I believe the reason much of the code wasn't removed was to retain
functionality, as completely removing it would cause too many problems.
The idea was to eventually deprecate those modules that weren't being
maintained or had other, better implementations.

Some of that older code is currently earmarked for deprecation
(Bio::Species, Bio::Taxonomy, etc).  There are a few others (Bio::Assembly)
which could use a bit more care, but they still work and are still in use
(and have been patched recently with new code).

...

> I think you misunderstood me here- I mean that Sendu is making a fairly
> strong statement... When transfac stopped providing the data files I
> announced that I am not providing further maintenance and said that it
> should probably be removed- this is my opinion in general and has nothing
> to do with Sendu or any other developer. I think this discussion becomes a
> bit religious in nature and there is little point in continuing.
> I think Sendu is taking the right approach here by contacting the provider
> and if Biobase gets involved (I think they should be interested in doing
> so) then concerns I expressed will not be valid.

I think you both made pretty strong statements.  No problem with trying to
defend your position as long as you play nice ;>

My opinion: the idea of TRANSFAC not providing data files seems unreasonable
considering the data is publicly available.  I think you might agree with
that statement to some degree based on your decision to stop maintaining the
TRANSFAC modules and your wish to have them removed.  But, the reality is,
refusing to publicly release that data is their right.  They did go through
the trouble to collect all of this into one database and have it constantly
maintained.  

I must point out there is nothing stopping others from producing an
'open-source' variation of the TRANSFAC database built from the ground up
(independently from the TRANSFAC .dat files).  The fact that it hasn't
happened yet surprises me.

However, none of that should prevent having a parser and a DB module capable
of accessing the current TRANSFAC data, especially when someone is willing
to maintain them and as long as they take the right approach, as Sendu is
doing.
 
Chris




