[Bioperl-l] SimpleAlign - get_seq_by_id

Chris Fields cjfields at illinois.edu
Sat Oct 25 17:32:54 UTC 2008


Issues with naming each_* vs get_* vs next_* have been raised in the  
past, though I can't find them in the mailing list archives.  I have  
similar issues with num_* vs no_*.

Maybe we should come up with some basic coding guidelines in a HOWTO  
(tests, method names, etc).  We already have some basic documentation  
for coding standards with some suggestions (best practices, advanced  
bioperl, etc), so maybe these should be consolidated into a single  
resource and revised by the core devs to reflect what we expect.

BTW, I think an API cleanup is worth doing, but I don't see it getting  
done before a 1.6 release until we agree on some simple coding  
conventions.  However, for SimpleAlign, we could run a simple cleanup  
by moving non-AlignI (utility) methods to the Bio::Align::Utilities  
module and deprecating use of the Bio::SimpleAlign versions (i.e. warn  
and delegate to the Utilities versions in the meantime, then remove  
after 1.6).
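
A minimal sketch of that warn-and-delegate step, written in plain Perl so
it runs without BioPerl installed. The package names Demo::SimpleAlign and
Demo::Align::Utilities and the count_gaps method are hypothetical stand-ins
chosen for illustration; they are not existing BioPerl API:

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Align::Utilities: the new home of the utility.
package Demo::Align::Utilities;

# Count gap characters ('-') across all sequences of an alignment object.
sub count_gaps {
    my ($aln) = @_;
    my $total = 0;
    for my $seq ( @{ $aln->{seqs} } ) {
        $total += ( $seq =~ tr/-// );    # tr/// with no replacement only counts
    }
    return $total;
}

# Hypothetical stand-in for Bio::SimpleAlign.
package Demo::SimpleAlign;

sub new {
    my ( $class, @seqs ) = @_;
    return bless { seqs => [@seqs] }, $class;
}

# Deprecated method: warn, then delegate to the Utilities version.
sub count_gaps {
    my ($self) = @_;
    warn "count_gaps() is deprecated; "
       . "use Demo::Align::Utilities::count_gaps() instead\n";
    return Demo::Align::Utilities::count_gaps($self);
}

package main;

my $aln = Demo::SimpleAlign->new( 'AC-GT', 'A--GT' );
my $n   = $aln->count_gaps;    # warns on STDERR, then delegates
print "gaps: $n\n";            # prints "gaps: 3"
```

Old callers keep working and see the warning, while new code calls the
utility function directly; the alias can then be deleted in a later release.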

chris

On Oct 25, 2008, at 10:47 AM, Jason Stajich wrote:

> you're right - it should be rolled back. I guess each_seq_xxx gets  
> the job done.
>
> We have a real problem with each vs get for our API mixture. I think  
> there had been some logic there at one point but I think it is  
> confusingly mixed now.
> Perhaps a cleaned-up API with deprecated aliases would be an okay way  
> to move, at some point, towards something more standardized.
>
> It would make sense to also see about implementing a Gblocks-style  
> filtering method as well (but not in SimpleAlign, given the number of  
> methods already, as you mention!).
>
> -jason
> On Oct 24, 2008, at 3:05 AM, Heikki Lehvaslaiho wrote:
>
>> Spoke too soon: each_seq_with_id() already exists. Is there really a
>> need for get_seq_by_id()?
>>
>> A more general observation: Bio::SimpleAlign with its 83 methods has
>> grown too big to keep all the code (3055 lines total) in one file. Any
>> volunteers to break it up into more manageable chunks?
>>
>> The methods in the current file have already been categorised, which
>> should help in the task:
>>
>> =head1 Modifier methods
>> =head1 Sequence selection methods
>> =head1 Create new alignments
>> =head1 Change sequences within the MSA
>> =head1 MSA attributes
>> =head1 Alignment descriptors
>> =head1 Alignment positions
>> =head1 Sequence names
>>
>> The helper modules should go into the Bio::Align namespace.
>>
>>
>>  -Heikki
>>
>>
>> On Friday 24 October 2008 08:32:49 Heikki Lehvaslaiho wrote:
>>> The main reason it has not been Bio::SeqAlign is that a sequence ID is
>>> not necessarily a unique identifier in an MSA. Multiple regions of the
>>> sequence defined by one ID can be in one alignment.
>>>
>>> The current code returns only the more or less randomly selected first
>>> Bio::LocatableSeqI object with that ID. Should we make it context
>>> sensitive and return an array of sequences in array context?
>>>
>>> That brings up another question: After the change, the get_seq_by_id()
>>> will behave differently from all other instances of that method, so
>>> should it be renamed to reflect that?
>>>
>>>     -Heikki
>>>
>>> On Thursday 23 October 2008 21:29:20 Jason Stajich wrote:
>>>> I added get_seq_by_id to Bio::SimpleAlign to allow retrieval of a
>>>> particular sequence from the alignment by ID. Not sure why this
>>>> didn't exist before.
>>>>
>>>> -jason
>>>> --
>>>> Jason Stajich
>>>> jason at bioperl.org
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> -- 
>> ______ _/      _/ _____________________________________________________
>>     _/      _/
>>    _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>   _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>>  _/  _/  _/  SANBI, South African National Bioinformatics Institute
>> _/  _/  _/  University of Western Cape, South Africa
>>    _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>> ___ _/_/_/_/_/ ________________________________________________________
>>
>
> Jason Stajich
> jason at bioperl.org

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Marie-Claude Hofmann
College of Veterinary Medicine
University of Illinois Urbana-Champaign
