[Biojava-l] adding toSequenceIterator method for Alignment

Singh, Nimesh Nimesh.Singh@maxygen.com
Mon, 29 Jul 2002 14:56:15 -0700


Given an arbitrary implementation of Alignment, there is no guarantee that the alignment keeps links to the original underlying sequences.  That is why I think AlignmentSequenceIterator should use only public methods from the Alignment interface: this ensures that it will work for any implementation.  We can override sequenceIterator() in individual Alignment implementations to return the underlying Sequences.  I'll look into that.
 
The new method returns a SequenceIterator because that is an existing, commonly used iterator.  If you use SymbolLists that don't readily translate into Sequences, there are other methods already in the interface (usually symbolListForLabel()).
 
So, I'll look at making more specific implementations for sequenceIterator that return underlying sequences.
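
A rough sketch of that plan, to make the two cases concrete. The types below are simplified stand-ins written for this example, not the real org.biojava interfaces: Iterator<Sequence> stands in for SequenceIterator, and PlainSequence, GenericSequenceIterator and SequenceBackedAlignment are hypothetical names. The generic iterator touches only the public Alignment methods, while an implementation that actually stores Sequences overrides sequenceIterator() to hand them back untouched.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration; NOT the real org.biojava types.
interface SymbolList { String seqString(); }

interface Sequence extends SymbolList { String getName(); }

interface Alignment {
    List<Object> getLabels();
    SymbolList symbolListForLabel(Object label);
    Iterator<Sequence> sequenceIterator();
}

// Bare wrapper used by the generic fallback: a name plus symbols, nothing else.
class PlainSequence implements Sequence {
    private final String name;
    private final SymbolList symbols;
    PlainSequence(String name, SymbolList symbols) {
        this.name = name;
        this.symbols = symbols;
    }
    public String getName() { return name; }
    public String seqString() { return symbols.seqString(); }
}

// Generic fallback: relies only on the public Alignment methods,
// so it works for any implementation but has to wrap each row.
class GenericSequenceIterator implements Iterator<Sequence> {
    private final Alignment align;
    private final Iterator<Object> labels;
    GenericSequenceIterator(Alignment align) {
        this.align = align;
        this.labels = align.getLabels().iterator();
    }
    public boolean hasNext() { return labels.hasNext(); }
    public Sequence next() {
        Object label = labels.next(); // throws NoSuchElementException when exhausted
        return new PlainSequence(label.toString(), align.symbolListForLabel(label));
    }
}

// Hypothetical implementation that stores Sequences and returns them directly.
class SequenceBackedAlignment implements Alignment {
    private final Map<Object, Sequence> rows = new LinkedHashMap<>();
    void put(Object label, Sequence seq) { rows.put(label, seq); }
    public List<Object> getLabels() { return new ArrayList<>(rows.keySet()); }
    public SymbolList symbolListForLabel(Object label) { return rows.get(label); }
    public Iterator<Sequence> sequenceIterator() {
        // Override: hand back the stored objects instead of wrapping them.
        return rows.values().iterator();
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        Sequence original = new PlainSequence("seq1", () -> "ACGT");
        SequenceBackedAlignment aln = new SequenceBackedAlignment();
        aln.put("seq1", original);
        // Specific override returns the stored object itself.
        System.out.println(aln.sequenceIterator().next() == original);
        // Generic fallback returns a new wrapper around the same symbols.
        System.out.println(new GenericSequenceIterator(aln).next() == original);
    }
}
```

An implementation with no underlying Sequences would simply return new GenericSequenceIterator(this) from sequenceIterator().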
 
Nimesh

	-----Original Message----- 
	From: Kalle Näslund [mailto:kalle.naslund@genpat.uu.se] 
	Sent: Fri 7/26/2002 11:52 AM 
	To: Singh, Nimesh 
	Cc: biojava-l@biojava.org 
	Subject: Re: [Biojava-l] adding toSequenceIterator method for Alignment
	
	

	Singh, Nimesh wrote:
	
	>     I've created a class called AlignmentSequenceIterator that I intend to put in the org.biojava.bio.seq package.  It will do the real work.  I've also added
	>        public SequenceIterator sequenceIterator() {
	>            return new AlignmentSequenceIterator(this);
	>        }
	>to each alignment class.  It should work fine in every alignment, because AlignmentSequenceIterator uses the getLabels and symbolListForLabel methods from the Alignment interface. 
	>
	>     If this is fine, then I'll upload everything later today.  If you have any suggestions for changes, then let me know.
	>
	>Nimesh
	>
	>
	
	Well, there is one big problem with this piece of code: you treat all
	objects in the alignment as being SymbolLists only, which in reality
	isn't true, as you can insert any object that implements the
	SymbolList interface into an alignment.
	
	For example, I am currently populating my alignment objects with custom
	Sequence objects. If I called this code, it would create new Sequence
	objects of the type SimpleSequence, and as I understand it from a quick
	look at the SequenceFactory code, they won't have any of the
	annotations, features etc. that the original Sequence objects I added
	to the alignment had. So, instead of getting my custom Sequence objects
	back containing features etc., I would get nearly "empty"
	SimpleSequence objects back, which makes it unusable.
	Other problems should pop up if you insert other objects into
	Alignments, say other alignments: instead of getting them back as
	alignments when you iterate over the SymbolLists in the alignment, you
	get them back as SimpleSequences.
	
	But I do agree that adding a method to the Alignment interface that
	gives you an iterator, so you can iterate over the SymbolLists in the
	alignment, is a good thing to add.
	
	My suggestion would just be to have it iterate over the SymbolLists that
	are inserted into the Alignment and avoid altering the objects in any
	way. That way you get back what you insert, and the method will work
	for everyone, not just people using SimpleSequences.
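
A minimal sketch of that suggestion, under loudly stated assumptions: SymbolList and Alignment here are cut-down stand-ins rather than the real org.biojava interfaces, and AnnotatedSequence, MapAlignment and SymbolListIterator are hypothetical names invented for the example. The iterator returns each row exactly as it was stored, so a custom type comes back with its extra data intact.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Simplified stand-ins for illustration only; NOT the real org.biojava interfaces.
interface SymbolList {
    String seqString();
}

interface Alignment {
    List<Object> getLabels();
    SymbolList symbolListForLabel(Object label);
}

// Hypothetical custom sequence type carrying extra data (e.g. features).
class AnnotatedSequence implements SymbolList {
    private final String symbols;
    final List<String> features = new ArrayList<>();
    AnnotatedSequence(String symbols) { this.symbols = symbols; }
    public String seqString() { return symbols; }
}

// Hypothetical map-backed alignment that keeps whatever object was inserted.
class MapAlignment implements Alignment {
    private final Map<Object, SymbolList> rows = new LinkedHashMap<>();
    void put(Object label, SymbolList row) { rows.put(label, row); }
    public List<Object> getLabels() { return new ArrayList<>(rows.keySet()); }
    public SymbolList symbolListForLabel(Object label) {
        SymbolList row = rows.get(label);
        if (row == null) {
            throw new NoSuchElementException("No row for label " + label);
        }
        return row;
    }
}

// Iterates over the rows exactly as stored: no copying, no wrapping.
class SymbolListIterator implements Iterator<SymbolList> {
    private final Alignment align;
    private final Iterator<Object> labels;
    SymbolListIterator(Alignment align) {
        this.align = align;
        this.labels = align.getLabels().iterator();
    }
    public boolean hasNext() { return labels.hasNext(); }
    public SymbolList next() {
        // labels.next() throws NoSuchElementException once the labels run out.
        return align.symbolListForLabel(labels.next());
    }
}

class PassThroughDemo {
    public static void main(String[] args) {
        AnnotatedSequence custom = new AnnotatedSequence("ACGT");
        custom.features.add("exon 1-3");
        MapAlignment aln = new MapAlignment();
        aln.put("seq1", custom);

        SymbolList got = new SymbolListIterator(aln).next();
        // The caller can cast back to the type that was inserted.
        if (got instanceof AnnotatedSequence) {
            System.out.println(((AnnotatedSequence) got).features);
        }
    }
}
```

The trade-off is that callers receive plain SymbolLists and must check the runtime type themselves if they need Sequence-specific data.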
	
	Regards, Kalle
	
	>
	>Here is the code for AlignmentSequenceIterator:
	>
	>public class AlignmentSequenceIterator implements SequenceIterator {
	>    private Alignment align;
	>    private Iterator labels;
	>    private SequenceFactory sf;
	>    public AlignmentSequenceIterator(Alignment align) {
	>        this.align = align;
	>        labels = align.getLabels().iterator();
	>        sf = new SimpleSequenceFactory();
	>    }
	>    public boolean hasNext() {
	>        return labels.hasNext();
	>    }
	>    public Sequence nextSequence() throws NoSuchElementException, BioException {
	>        if (!hasNext()) {
	>            throw new NoSuchElementException("No more sequences in the alignment.");
	>        }
	>        else {
	>            try {
	>                Object label = labels.next();
	>                SymbolList symList = align.symbolListForLabel(label);
	>                Sequence seq = sf.createSequence(symList, label.toString(), label.toString(), null);
	>                return seq;
	>            } catch (Exception e) {
	>                throw new BioException(e, "Could not read sequence");
	>            }
	>        }
	>    }
	>}
	>_______________________________________________
	>Biojava-l mailing list  -  Biojava-l@biojava.org
	>http://biojava.org/mailman/listinfo/biojava-l
	> 
	>