[Bioperl-l] SimpleAlign changes
Ewan Birney
birney@ebi.ac.uk
Thu, 10 May 2001 13:00:56 +0100 (BST)
On Thu, 10 May 2001, Heikki Lehvaslaiho wrote:
>
> Dear All,
>
> Last week I decided to teach myself SimpleAlign. I found myself making
> corrections and changes into documentation as well as into code and
> writing dozens of tests. Before I commit anything, I'd like to list
> what I found.
>
> 1. While Bio::AlignIO is declared to be the preferred way of reading in
> alignments, Bio::SimpleAlign contains a number of methods doing the
> same. Shouldn't these be removed? Is all the functionality of these
> methods in Bio::AlignIO? [not done]
for sure
>
> 2. The method names in SimpleAlign should follow bioperl conventions
> (all lower case with words separated by underscores).
> The old names could be left but documentation should point to new
> names. [done] Should deprecated names print a warning? [not done]
>
> addSeq -> add_seq
> eachSeq -> each_seq
> eachSeqWithId -> each_seq_with_id
> length_aln -> length
> removeSeq -> remove_seq
>
probably a good thing
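
A minimal sketch of how an old name could be kept as a warning alias for the new one. The My::Align class and its no_sequences() helper are hypothetical stand-ins for illustration, not the real Bio::SimpleAlign code:

```perl
use strict;
use warnings;

# Hypothetical stand-in for an alignment class; not the real Bio::SimpleAlign.
package My::Align;

sub new {
    my ($class) = @_;
    return bless { seqs => [] }, $class;
}

# New-style name: all lower case, words separated by underscores.
sub add_seq {
    my ($self, $seq) = @_;
    push @{ $self->{seqs} }, $seq;
    return 1;
}

# Old name kept for backward compatibility: warn, then delegate.
sub addSeq {
    my $self = shift;
    warn "addSeq() is deprecated, use add_seq() instead\n";
    return $self->add_seq(@_);
}

# Hypothetical helper that reports how many sequences are stored.
sub no_sequences {
    my ($self) = @_;
    return scalar @{ $self->{seqs} };
}

package main;

my $aln = My::Align->new;
$aln->add_seq('ACGT');
$aln->addSeq('TTGA');    # still works, but warns on STDERR
```

Delegating from the old name keeps one implementation, so the two names cannot drift apart.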
>
> 3. There seem to be two methods, maxname_length() and
> maxnse_length(), which are not used at all. There is a method
> maxdisplayname_length() which seems to have all the functionality
> needed. Can they be removed or pointed to
> maxdisplayname_length()? [done]
>
I *think* so. Are you sure nothing in AlignIO uses them?
> 4. Methods get_displayname() and set_displayname() should be merged
> into displayname() and a test should be added to prevent using
> nonexistent sequence names. (The method manipulates hashes directly
> and new keys are automatically created when used: autovivification!)
> [done]
>
> Adding the test brought to light a problem: $seq->get_nse() is of
> form "name/start-end" and $aln->addSeq/add_seq was using form
> "name-start-end". Changed '-' -> '/' in add_seq(). [done]
>
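
A rough sketch of the kind of check described in point 4, assuming a plain hash keyed by "name/start-end" strings. The %dis_name hash and the free-standing displayname() function are hypothetical and only illustrate the idea, they are not the SimpleAlign implementation:

```perl
use strict;
use warnings;

# Hypothetical storage: "name/start-end" keys mapped to display names.
my %dis_name = ( 'seq1/1-10' => 'seq1' );

# Combined getter/setter. Checking with exists() first means a
# misspelled key is rejected instead of being silently created.
sub displayname {
    my ($name, $new) = @_;
    die "No sequence with name [$name]\n" unless exists $dis_name{$name};
    $dis_name{$name} = $new if defined $new;
    return $dis_name{$name};
}

displayname('seq1/1-10', 'first');    # known key: value is updated

# A key in the old "name-start-end" form is refused, not added.
my $ok = eval { displayname('seq1-1-10', 'oops'); 1 };
print "rejected: $@" unless $ok;
```

Without the exists() test the assignment would quietly add 'seq1-1-10' as a new key, which is how the '-' versus '/' mismatch could go unnoticed.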
> 5. Bio::LocatableSeq has this key method get_nse() which has a
> counterintuitive name. Couldn't we use name() while leaving the old
> name for backward compatibility? [done]
>
My view is that ->name is too ambiguous. nse stands for name-start-end
(just my view...)
> Also, get_nse() accepts arguments
> for separator characters (/ and -). This seems to be a relic
> from pre-bioperl times. Use of other characters needs to be
> discouraged/ability removed? [not done]
>
Nope - we need this for some other things
> Method name/get_nse() needs internal tests to make sure that all
> three values are set. [done]
>
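
A small sketch of a get_nse()-style formatter with optional separators and a check that all three values are set. The function name format_nse() and its argument order are assumptions made for illustration, not the Bio::LocatableSeq API:

```perl
use strict;
use warnings;

# Hypothetical free-standing formatter in the style of get_nse().
# $char1 and $char2 are optional separators defaulting to '/' and '-'.
sub format_nse {
    my ($name, $start, $end, $char1, $char2) = @_;
    $char1 = '/' unless defined $char1;
    $char2 = '-' unless defined $char2;

    # Refuse to build a name unless all three values are set.
    for my $pair ( [ name => $name ], [ start => $start ], [ end => $end ] ) {
        die "Attribute $pair->[0] not set\n" unless defined $pair->[1];
    }
    return $name . $char1 . $start . $char2 . $end;
}

print format_nse('seq1', 1, 10), "\n";
print format_nse('seq1', 1, 10, ':', '..'), "\n";
```

Keeping the separators as optional arguments leaves the default form as "name/start-end" while still allowing the other uses mentioned in the reply.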
>
> 6. Methods affecting all sequences in the alignment
> (set_displayname_count(), set_displayname_flat(),
> set_displayname_normal(), uppercase() ) need to return true (1) so
> that they can be tested. [done]
>
> 7. A number of methods were calling deprecated PrimarySeqI methods.
> [fixed]
>
> 8. Blast output parser Bio::Tools::BPbl2seq was leaving one space
> character at the end of the sequence id, resulting in a nonstandard
> LocatableSeq name. [fixed]
>
You are the man Heikki! Really good work! Wow!
>
>
> After my changes all the AlignIO tests pass and SimpleAlign test count
> is up from 6 to 32. The only (non-IO) method that I did not test was
> purge().
>
> -Heikki
>
> --
> ______ _/ _/_____________________________________________________
> _/ _/ http://www.ebi.ac.uk/mutations/
> _/ _/ _/ Heikki Lehvaslaiho heikki@ebi.ac.uk
> _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
> _/ _/ _/ Wellcome Trust Genome Campus, Hinxton
> _/ _/ _/ Cambs. CB10 1SD, United Kingdom
> _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
> ___ _/_/_/_/_/________________________________________________________
>
-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>.
-----------------------------------------------------------------