[Bioperl-l] Extract Mutation Automatically
Cook, Malcolm
MEC at Stowers-Institute.org
Thu Aug 11 12:38:06 EDT 2005
re: standardized nomenclature for mutations, see
Recommendations for a nomenclature system for human gene mutations
a copy of which can be found
http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A//mecp2.chw.edu.au
/mecp2/info/mutation_nomenclature_1.pdf&ei=6H37QsvRGo34igGj4vBS&sig2=eYV
DZb467rYOBf-0sCtctQ
-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Andrew Leung
Sent: Wednesday, August 10, 2005 7:42 PM
To: 'Jason Stajich'
Cc: bioperl-l at bioperl.org
Subject: RE: [Bioperl-l] Extract Mutation Automatically
Hi Jason,
Thank you for advice. I will try the various approaches suggested. My
ultimate goal is to extract something like this: A267G, Z786-, L898Y
etc.
for aa and A162T, G339A, A388N, etc. for nt. Preferably, the
nomenclature
for annotating mutations is a standardized one. But, it appears that
there
no such a ready to use module from Bioperl.
Andrew
-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at duke.edu]
Sent: Wednesday, August 10, 2005 10:36 AM
To: andrew_leung at dh.gov.hk
Cc: bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] Extract Mutation Automatically
I guess it comes down to what you want to do with the mutations once
you've found them.
The seq_inds method in Bio::Search::HSP::HSPI which is something you
can call on hsp objects you've gotten out of pairwise alignment
searches. seq_inds will give you the location of the identical,
conserved, mismatched columns from a pairwise alignment. I would
suggest using FASTA or SSEARCH and
If you had two files with seqs to align called 'seq1.fa' and 'seq2.fa'
Here is how I would get the pairwise SW alignment and get the
mutations out.
If you wanted a global alignment you can use the EMBOSS tool 'needle'
and generate an MSF alignment which can be parsed with Bio::AlignIO.
some simple code to print out the bases which have mismatches
use Bio::SearchIO;
use strict;
my $fh;
#open($fh, "bl2seq -i seq1.fa -j seq2.fa -p blastn |") || die $!;
open($fh, "fasta34 seq1.fa seq2.fa |") || die $!;
#my $parser = Bio::SearchIO->new(-format => 'fasta',
# -fh => $fh);
my $parser = Bio::SearchIO->new(-format => 'blast',
-
fh => $fh);
if( my $result = $parser->next_result ) { # single result so use if
instead of while
if( my $hit = $result->next_hit ) { # ditto, want single
result...
if( my $hsp = $hit->next_hsp ) { # single HSP from FASTA, would
need to consider more if using BLAST
my (@qmismatches) = $hsp->seq_inds('hit', 'nomatch');
# if this is protein and you want to treat the conservative
matches as mismatches
# you'll need to run the same method but asking for
'conserved' and then combing the two lists
for my $base ( @qmismatches ) {
print "base $base of the hit sequence is a mismatch \n",
}
}
}
}
The Bio::PopGen::Utilities module can also take an alignment and
extract the positions with variation for use in polymorphism analyses.
-jason
On Aug 9, 2005, at 8:34 PM, Andrew Leung wrote:
> Hi all,
> Is there any module available that can allow me to extract mutation(s)
> automatically? The idea is that if I submit two sequences for
> alignment, the
> script can automatically list out all the differences between the two
> sequences. I wish to know the difference at two levels, i.e. the
> nucleotide
> and amino acid level. Any ideas?
> Andrew
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
--
Jason Stajich
Duke University
http://www.duke.edu/~jes12
_______________________________________________
Bioperl-l mailing list
Bioperl-l at portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l
More information about the Bioperl-l
mailing list