[Bioperl-l] Extract Mutation Automatically

Heikki Lehvaslaiho heikki at ebi.ac.uk
Tue Aug 16 05:06:07 EDT 2005


Andrew,

You are right, Bio::Variation objects only store and format the findings.
This same question popped up a couple of months ago. See:

http://portal.open-bio.org/pipermail/bioperl-l/2005-June/019242.html

I wonder if Julio got round to writing the code?

 -Heikki

On Tuesday 16 August 2005 01:49, Andrew Leung wrote:
> Hi Heikki,
> Thank you for your note.
> I now have two strands of sequences obtained from a hsp and an array of
> mutation position information resulted from seq_inds() with 'mismatch'
> option. Do you mean that I can put these data to Bio::Variation and
> generate a mutation list as desired? I am quite new to Bioperl. Can you
> explain in greater details? I've read the documentation for Bio::Variation,
> but it appears to me that its methods are mainly for "set", but not for
> "reading" mutation.
> Andrew
>
>
>
> = = = = = = = = = =
> Andrew,
>
> Once you have extracted the information, you can create Bio::Variation
> objects
> which know how to stringify the description according to human mutation
> nomenclature rules.
>
> In practise, you create a Bio::Variation::SeqDiff object, add to it the
> appropriate Bio::Variation::{DNAMutation|RNAChange|AAChange} objects and
> call
> methods sysname() for nucleotides descriptor or trivname() for amino acid
> descriptor.
>
> The nomenclature used is not the most recent complex suggestion from den
> Dunnen et al but original (and in basic cases identical) from Antonorakis
> et
>
> al.
>
>  -Heikki
>
> On Thursday 11 August 2005 01:42, Andrew Leung wrote:
> > Hi Jason,
> > Thank you for advice. I will try the various approaches suggested. My
> > ultimate goal is to extract something like this: A267G, Z786-, L898Y etc.
> > for aa and A162T, G339A, A388N, etc. for nt. Preferably, the nomenclature
> > for annotating mutations is a standardized one. But, it appears that
> > there no such a ready to use module from Bioperl.
> > Andrew
> >
> >
> > -----Original Message-----
> > From: Jason Stajich [mailto:jason.stajich at duke.edu]
> > Sent: Wednesday, August 10, 2005 10:36 AM
> > To: andrew_leung at dh.gov.hk
> > Cc: bioperl-l at bioperl.org
> > Subject: Re: [Bioperl-l] Extract Mutation Automatically
> >
> > I guess it comes down to what you want to do with the mutations once
> > you've found them.
> >
> > The seq_inds method in Bio::Search::HSP::HSPI  which is something you
> > can call on hsp objects you've gotten out of pairwise alignment
> > searches. seq_inds will give you the location of the identical,
> > conserved, mismatched columns from a pairwise alignment.  I would
> > suggest using FASTA or SSEARCH and
> >
> > If you had two files with seqs to align called 'seq1.fa' and 'seq2.fa'
> >
> > Here is how I would get the pairwise SW alignment and get the
> > mutations out.
> >
> > If you wanted a global alignment you can use the EMBOSS tool 'needle'
> > and generate an MSF alignment which can be parsed with Bio::AlignIO.
> >
> > some simple code to print out the bases which have mismatches
> > use Bio::SearchIO;
> > use strict;
> > my $fh;
> > #open($fh, "bl2seq -i seq1.fa -j seq2.fa -p blastn |") || die $!;
> > open($fh, "fasta34 seq1.fa seq2.fa  |") || die $!;
> > #my $parser = Bio::SearchIO->new(-format => 'fasta',
> > #                -fh     => $fh);
> > my $parser = Bio::SearchIO->new(-format => 'blast',
> >                                                                 -
> > fh        => $fh);
> >
> > if( my $result = $parser->next_result ) { # single result so use if
> > instead of while
> >      if( my $hit = $result->next_hit ) {    # ditto, want single
> > result...
> >      if( my $hsp = $hit->next_hsp ) { # single HSP from FASTA, would
> > need to consider more if using BLAST
> >
> >          my (@qmismatches) = $hsp->seq_inds('hit', 'nomatch');
> >          # if this is protein and you want to treat the conservative
> > matches as mismatches
> >          # you'll need to run the same method but asking for
> > 'conserved' and then combing the two lists
> >
> >          for my $base ( @qmismatches ) {
> >             print "base $base of the hit sequence is a mismatch \n",
> >         }
> >      }
> >      }
> > }
> >
> >
> > The Bio::PopGen::Utilities module can also take an alignment and
> > extract the positions with variation for use in polymorphism analyses.
> >
> > -jason
> >
> > On Aug 9, 2005, at 8:34 PM, Andrew Leung wrote:
> > > Hi all,
> > > Is there any module available that can allow me to extract mutation(s)
> > > automatically? The idea is that if I submit two sequences for
> > > alignment, the
> > > script can automatically list out all the differences between the two
> > > sequences. I wish to know the difference at two levels, i.e. the
> > > nucleotide
> > > and amino acid level. Any ideas?
> > > Andrew
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> > --
> > Jason Stajich
> > Duke University
> > http://www.duke.edu/~jes12
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_ebi _ac _uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambridge, CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________


More information about the Bioperl-l mailing list