[Bioperl-l] Extract Mutation Automatically
Andrew Leung
ro_phls2 at dh.gov.hk
Wed Aug 10 20:42:11 EDT 2005
Hi Jason,
Thank you for advice. I will try the various approaches suggested. My
ultimate goal is to extract something like this: A267G, Z786-, L898Y etc.
for aa and A162T, G339A, A388N, etc. for nt. Preferably, the nomenclature
for annotating mutations is a standardized one. But, it appears that there
no such a ready to use module from Bioperl.
Andrew
-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at duke.edu]
Sent: Wednesday, August 10, 2005 10:36 AM
To: andrew_leung at dh.gov.hk
Cc: bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] Extract Mutation Automatically
I guess it comes down to what you want to do with the mutations once
you've found them.
The seq_inds method in Bio::Search::HSP::HSPI which is something you
can call on hsp objects you've gotten out of pairwise alignment
searches. seq_inds will give you the location of the identical,
conserved, mismatched columns from a pairwise alignment. I would
suggest using FASTA or SSEARCH and
If you had two files with seqs to align called 'seq1.fa' and 'seq2.fa'
Here is how I would get the pairwise SW alignment and get the
mutations out.
If you wanted a global alignment you can use the EMBOSS tool 'needle'
and generate an MSF alignment which can be parsed with Bio::AlignIO.
some simple code to print out the bases which have mismatches
use Bio::SearchIO;
use strict;
my $fh;
#open($fh, "bl2seq -i seq1.fa -j seq2.fa -p blastn |") || die $!;
open($fh, "fasta34 seq1.fa seq2.fa |") || die $!;
#my $parser = Bio::SearchIO->new(-format => 'fasta',
# -fh => $fh);
my $parser = Bio::SearchIO->new(-format => 'blast',
-
fh => $fh);
if( my $result = $parser->next_result ) { # single result so use if
instead of while
if( my $hit = $result->next_hit ) { # ditto, want single
result...
if( my $hsp = $hit->next_hsp ) { # single HSP from FASTA, would
need to consider more if using BLAST
my (@qmismatches) = $hsp->seq_inds('hit', 'nomatch');
# if this is protein and you want to treat the conservative
matches as mismatches
# you'll need to run the same method but asking for
'conserved' and then combing the two lists
for my $base ( @qmismatches ) {
print "base $base of the hit sequence is a mismatch \n",
}
}
}
}
The Bio::PopGen::Utilities module can also take an alignment and
extract the positions with variation for use in polymorphism analyses.
-jason
On Aug 9, 2005, at 8:34 PM, Andrew Leung wrote:
> Hi all,
> Is there any module available that can allow me to extract mutation(s)
> automatically? The idea is that if I submit two sequences for
> alignment, the
> script can automatically list out all the differences between the two
> sequences. I wish to know the difference at two levels, i.e. the
> nucleotide
> and amino acid level. Any ideas?
> Andrew
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
--
Jason Stajich
Duke University
http://www.duke.edu/~jes12
More information about the Bioperl-l
mailing list