[Bioperl-l] get the sequence of a column in a multiple alignment
Albert Vilella
avilella at gmail.com
Wed Feb 14 15:29:02 UTC 2007
There is a slice method on the alignment object:
$mini_aln = $aln->slice(20,30); # get a block of columns
 Title     : slice
 Usage     : $aln2 = $aln->slice(20,30)
 Function  : Creates a slice from the alignment inclusive of start and
             end columns, and the first column in the alignment is denoted 1.
             Sequences with no residues in the slice are excluded from the
             new alignment and a warning is printed. Slice beyond the length of
             the sequence does not do padding.
 Returns   : A Bio::SimpleAlign object
 Args      : Positive integer for start column, positive integer for end column,
             optional boolean which if true will keep gap-only columns in the
             newly created slice. Example:

             $aln2 = $aln->slice(20,30,1)
but I don't know how well it behaves for lots of sequences :)
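
If the per-residue trunc() calls are the slow part, another option might be to pull each aligned sequence out as a plain string once and build the columns from those strings. This is only a rough sketch: alignment_columns is a made-up helper name, not a BioPerl method, and I am assuming the rows would come from something like map { $_->seq } $aln->each_seq. The sample rows below are invented so the snippet runs without BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): takes the aligned rows as
# plain strings of equal length and returns one string per column.
sub alignment_columns {
    my @rows = @_;
    return () unless @rows;

    my $length  = length $rows[0];
    my @columns = ('') x $length;

    for my $row (@rows) {
        die "rows differ in length\n" unless length($row) == $length;
        # Split the row into residues once, then append each residue
        # to the string of the column it belongs to.
        my @residues = split //, $row;
        $columns[$_] .= $residues[$_] for 0 .. $length - 1;
    }
    return @columns;
}

# Assumption: with BioPerl the rows would come from the alignment, e.g.
#   my @rows = map { $_->seq } $aln->each_seq;
# The rows here are made-up sample data.
my @rows    = ('MK-L', 'MKAL', 'MR-L');
my @columns = alignment_columns(@rows);

# Print the 1-based column number next to the residues in that column.
printf "%d\t%s\n", $_ + 1, $columns[$_] for 0 .. $#columns;
```

The array index plus one gives the column number, so a hash keyed by column is not strictly needed unless you want one.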
On 2/14/07, Mathieu Rouard <mrouard at gmail.com> wrote:
> Dear all,
>
> I am starting to use the bioperl API to parse multiple alignments and I am
> wondering what is the most effective way to splice all the columns from an
> alignment (all the AA at position 1, position 2, etc.). I quickly
> implemented this simple code but it becomes quite slow when the length of
> sequences increases.
>
> use Bio::AlignIO;
>
> my $stream = Bio::AlignIO->new(-file   => $inputfilename,
>                                -format => 'stockholm');
>
> my $aln = $stream->next_aln();
>
> my $length = $aln->length();
> my %column;   # column number => residues in that column
>
> for (my $i = 1; $i <= $length; $i++) {
>     my $aa = '';
>     foreach my $seq ($aln->each_seq()) {
>         my $obj = $seq->trunc($i, $i);
>         $aa .= $obj->seq;
>     }
>     # need to track the column number and the sequence of the column
>     $column{$i} = $aa;
> }
>
> Would you have any other suggestion?
>
> thanks
> Mathieu