[Bioperl-l] Use of Bio namespace
Keith James
kdj@sanger.ac.uk
10 Oct 2000 14:07:04 +0100
>>>>> "Ewan" == Ewan Birney <birney@ebi.ac.uk> writes:
Ewan> I would hope that sequence/feature stuff could be merged
Ewan> inside bioperl but there
Ewan> (a) may be good reasons not to
Ewan> (b) you may not want to ;)
Ewan> both of which are sensible complaints.
I've been having a look at making my Sequence class SeqI compliant and
my Feature class SeqFeatureI compliant. It's been a bit tricky trying
to work out what is the best way to treat fuzzy ranges (which I've
supported) in a bioperl Seq.
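To make "fuzzy range" concrete: in an EMBL feature table a location such as <1..>200 says the feature starts somewhere before base 1 and ends somewhere after base 200. Below is a minimal sketch of parsing that simple form, written in Python rather than Perl for brevity. The names FuzzyRange and parse_location are hypothetical and are not part of bioperl's SeqI or SeqFeatureI interfaces; compound locations such as join(...) are deliberately not handled.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyRange:
    """Hypothetical holder for a range whose ends may be inexact."""
    start: int                 # 1-based, inclusive
    end: int                   # 1-based, inclusive
    start_fuzzy: bool = False  # True when written "<start"
    end_fuzzy: bool = False    # True when written ">end"


# Only the simple "start..end" form, with optional "<" and ">" markers.
_LOCATION = re.compile(r"^(<?)(\d+)\.\.(>?)(\d+)$")


def parse_location(text):
    """Parse a simple EMBL-style location string into a FuzzyRange."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported location: {text!r}")
    lt, start, gt, end = match.groups()
    return FuzzyRange(int(start), int(end), bool(lt), bool(gt))
```

Keeping the fuzziness as flags next to plain integer coordinates means code that only needs start and end can ignore the flags, while operations that move features around can carry them along.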
Ewan> I guess - hmmmmmm - this is hard. I suspect the right thing
Ewan> to do is
Ewan> - for really different stuff, eg Ecology, it should
Ewan> get its own top-level namespace.
Ewan> - for similar stuff, people should negotiate a
Ewan> namespace that can be kept separate for their work, for
Ewan> example, I could imagine
Ewan> Bio::Expression::
Ewan> being given out to a separate expression focused
Ewan> group. Bio::TreeOfLife would be another one.
Ewan> I guess anything molecular biology orientated should
Ewan> end up inside Bio:: but by no means handled by Bioperl.
Ewan> I certainly don't want to stop anyone submitting anything to
Ewan> CPAN, so make a proposal for what you want to submit or how
Ewan> you would best like it done.
Okay. It's good to know roughly where things are going even if none of
our modules are released (if I put things under Bio:: on my or the
Pathogen Sequencing Unit's local Perl lib).
Ewan> I would also encourage you to
Ewan> - if possible, work with bioperl or criticise bioperl
Ewan> if it wasn't good enough for what you wanted to do.
It seems like bad form to criticise when I haven't contributed very
much to bioperl (if I don't like it, I should fix it...). I had a go
at hacking bioperl a while back but found my limitations (never
written a Perl module, knew nothing about OO coding) so I needed to
write some stuff from scratch to see how it all worked.
Stuff I wanted was:
- Non-fussy but fairly complete EMBL parsing
- Terse, but intuitive manipulation of feature qualifiers in scripts
- Features with & without sequence
- Clone, trim, reverse-complement sequences with all the features attached
- Fuzzy ranges (parsed from EMBL, supported in other operations)
- Low memory Blast parsing
- Fasta search output parsing
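As an illustration of what "reverse-complement sequences with all the features attached" involves, here is a minimal sketch, in Python rather than Perl, assuming 1-based inclusive coordinates and a per-end fuzzy flag. The Feature class and reverse_complement function are hypothetical names for this example only; they are not taken from my modules or from bioperl.

```python
from dataclasses import dataclass

# Translation table for plain DNA; ambiguity codes are left unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


@dataclass(frozen=True)
class Feature:
    """Hypothetical feature with possibly fuzzy ends."""
    start: int                 # 1-based, inclusive
    end: int                   # 1-based, inclusive
    strand: int                # +1 or -1
    start_fuzzy: bool = False
    end_fuzzy: bool = False


def reverse_complement(seq, features):
    """Return the reverse-complemented sequence and remapped features."""
    n = len(seq)
    rc_seq = seq.translate(_COMPLEMENT)[::-1]
    rc_features = [
        Feature(
            start=n - f.end + 1,      # old end becomes new start
            end=n - f.start + 1,      # old start becomes new end
            strand=-f.strand,         # feature flips strand
            start_fuzzy=f.end_fuzzy,  # fuzziness travels with its end
            end_fuzzy=f.start_fuzzy,
        )
        for f in features
    ]
    return rc_seq, rc_features
```

For example, on the 5-base sequence "AACGT" a forward-strand feature at 1..2 with a fuzzy start ends up at 4..5 on the reverse strand with a fuzzy end. The point is that the fuzzy flags have to swap along with the coordinates, which is why fuzzy ranges need support in the other operations and not just in the parser.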
I'm in a better position to work on bioperl now, but still find a lot
of it hard to follow (esp. where the methods have no documentation -
this isn't just me, I know others who have been discouraged from
working on it for this reason).
As I'm sure you can appreciate, there is the time aspect to this as
well. Annotation projects need to keep to deadlines and if writing a
new module is significantly quicker than modifying an existing one,
that's the way it goes.
To be honest, these modules were not originally intended for release
(hence their cutesy and non-CPAN acceptable names). However they have
since been used in some scripts (cos we've found them easier than
bioperl) which we now need to distribute, so the issue has come up. I
would prefer to integrate at some point, if possible.
cheers,
--
-= Keith James - kdj@sanger.ac.uk - http://www.sanger.ac.uk/Users/kdj =-
The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA