[Bioperl-l] printing UnivAlgn

Murad Nayal murad@godel.bioc.columbia.edu
Sat, 16 Dec 2000 00:03:22 +0100


Hi Peter, 

OK, so implementing all of the SimpleAlign interface in UnivAln is not the
most straightforward thing in the world. For one, the internal
representations of sequences in the two are very different. Nonetheless,
you only use three functions in AlignIO to output the alignment (at
least in AlignIO::clustalw and a couple of other classes). I implemented
these functions in UnivAln (in terms of the UnivAln interface), and this
seems to allow AlignIO to print out a UnivAln as you would expect. While I
was at it, I implemented a function to get a SimpleAlign from a UnivAln.
The new functions getSimpleAlign() and eachSeq() are inefficient: they
create brand-new LocatableSeqs every time they're called. But augmenting
UnivAln to maintain a permanent set of LocatableSeqs would take
substantial effort to ensure consistency between these sequences and the
UnivAln->{seq} array, which is too much work for tonight! :-)
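
To make the idea concrete, here is a rough, self-contained sketch of the
adapter approach. ToyUnivAln and write_rows() are hypothetical stand-ins
invented for this illustration (they are not Bio::UnivAln or Bio::AlignIO
code), and plain hashes take the place of LocatableSeq objects. The point is
only that the writer touches the alignment through length_aln(),
get_displayname() and eachSeq() and nothing else.

```perl
use strict;
use warnings;

# Toy stand-in for Bio::UnivAln (hypothetical): rows are kept as arrays of
# single characters, with the row ids in a parallel array.
package ToyUnivAln;

sub new {
    my ($class, %args) = @_;
    return bless { seq => $args{seq}, row_ids => $args{row_ids} }, $class;
}

sub width { my $self = shift; return scalar @{ $self->{seq}[0] }; }

# Adapter methods, written only in terms of the object's own data.
sub length_aln { my $self = shift; return $self->width; }

sub get_displayname { my ($self, $name) = @_; return $name; }

sub eachSeq {
    my $self = shift;
    my @rows;
    for my $i (0 .. $#{ $self->{seq} }) {
        # A fresh record is built on every call; this is the inefficiency
        # described in the paragraph above.
        push @rows, { id  => $self->{row_ids}[$i],
                      seq => join('', @{ $self->{seq}[$i] }) };
    }
    return @rows;
}

package main;

# Toy writer (hypothetical): uses only the three adapter methods.
sub write_rows {
    my $aln = shift;
    my $out = '';
    for my $row ($aln->eachSeq) {
        $out .= sprintf("%-22s %s\n",
                        $aln->get_displayname($row->{id}),
                        substr($row->{seq}, 0, $aln->length_aln));
    }
    return $out;
}

my $aln = ToyUnivAln->new(
    seq     => [ [qw(A C G T)], [qw(A C - T)] ],
    row_ids => [ 'seq1', 'seq2' ],
);
print write_rows($aln);
```

Any object that answers these three calls can be handed to such a writer,
which is why the change to UnivAln can stay small.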

The diffs are attached.

By the way, I found it useful to modify AlignIO::clustalw a bit so that
the sequence name does not exceed the space allocated to it in the
printed alignment. The diff for this is attached as well.
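
As a quick standalone illustration of the truncation (format_row() is a
hypothetical helper written for this example, not a function in
AlignIO::clustalw), the same sprintf/substr combination keeps the sequence
column aligned no matter how long the name is:

```perl
use strict;
use warnings;

# Hypothetical helper: formats one alignment row using the same
# sprintf/substr combination as the patched _print call.
sub format_row {
    my ($name, $chunk) = @_;
    # substr() caps the name at 20 characters; %-22s then left-justifies it
    # in a 22-column field, so the sequence always starts at the same column.
    return sprintf("%-22s %s\n", substr($name, 0, 20), $chunk);
}

print format_row('short/1-50', 'ACGT');
print format_row('a_very_long_sequence_name/1-50', 'ACGT');
```

Without the substr() call, %-22s pads short names but does not truncate long
ones, so a long name pushes its row of sequence to the right.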

Regards,

Peter Schattner wrote:
> 
> Murad Nayal wrote:
> 
> > is UnivAln being phased out?
> 
> It would be nice if UnivAln were phased out.  But since it still has lots of
> features that some people may be using this doesn't seem likely to happen very
> soon.
> 
> > if not then maybe it is worth it to make
> > UnivAln conform to 'the SimpleAlign interface'. I am guessing this is
> > probably a simple thing to do
> 
> Well it didn't seem simple to me, but take a look at it and if you can see a
> simple way of doing it, do let me know (or better yet,  implement it!  :-)
> 
> - Peter

-- 
Murad Nayal M.D. Ph.D.
Department of Biochemistry and Molecular Biophysics
College of Physicians and Surgeons of Columbia University
630 West 168th Street. New York, NY 10032
Tel: 212-305-6884	Fax: 212-305-6926
[attachment: clustalw.diff]

*** /local/lib/perl5/site_perl/5.6.0/Bio/AlignIO/clustalw.pm.bk1	Fri Dec 15 23:40:46 2000
--- /local/lib/perl5/site_perl/5.6.0/Bio/AlignIO/clustalw.pm	Fri Dec 15 23:41:43 2000
***************
*** 133,139 ****
                  $substring = "";
  	}
  
! 	   $self->_print (sprintf("%-22s %s\n",$aln->get_displayname($seq->get_nse()),$substring)) or return;
  	}
  	$self->_print (sprintf("\n\n")) or return;
  	$count += 50;
--- 133,141 ----
                  $substring = "";
  	}
  
! 	   $self->_print (sprintf("%-22s %s\n",
!              substr($aln->get_displayname($seq->get_nse()),0,20),$substring))
!                or return;
  	}
  	$self->_print (sprintf("\n\n")) or return;
  	$count += 50;

[attachment: UnivAln.diff]

*** /local/lib/perl5/site_perl/5.6.0/Bio/UnivAln.pm.bk1	Mon Oct  2 17:20:59 2000
--- /local/lib/perl5/site_perl/5.6.0/Bio/UnivAln.pm	Fri Dec 15 23:42:17 2000
***************
*** 3368,3373 ****
--- 3368,3416 ----
    print "Caught internal error";
  }
  
+ # the following subs were added 12/15/2000 (Murad Nayal)
+ 
+ sub length_aln() {
+   my $self = shift;
+   return $self->width();
+ }
+ 
+ sub get_displayname() {
+   my $self = shift;
+   my $name = shift;
+   my $id   = $self->id();
+   if(defined($id) && $id ne "_") {
+     return $id;
+   } else                         {
+     return $name;
+   }
+ }
+ 
+ sub eachSeq() {
+   my $self = shift;
+ 
+   my @seqStrings = map {join("",@$_)} $self->seqs();
+   my $seqIds     = $self->row_ids();
+   my @seqs;
+   foreach my $seqIdx (0..$#seqStrings) {
+     push(@seqs,Bio::LocatableSeq->new('-seq' => $seqStrings[$seqIdx],
+                                       '-id'  => $$seqIds   [$seqIdx] ));
+   }
+   return @seqs;
+ }
+ 
+ sub getSimpleAlign() {
+   my $self       = shift;
+   my $aln        = Bio::SimpleAlign->new();
+   my @seqStrings = map {join("",@$_)} $self->seqs();
+   my $seqIds     = $self->row_ids();
+   foreach my $seqIdx (0..$#seqStrings) {
+     $aln->addSeq(Bio::LocatableSeq->new('-seq' => $seqStrings[$seqIdx],
+                                         '-id'  => $$seqIds   [$seqIdx] ));
+   }
+   return $aln;
+ }
+ 
  1;
  __END__
  
