[Bioperl-l] Mutation IO
Chris Fields
cjfields at uiuc.edu
Thu Jan 18 15:56:42 UTC 2007
I haven't dabbled with the Mutation/Variation stuff, but couldn't one
use a reference sequence (as Heikki suggests) and then use
SeqFeatures for the alleles? You could tag the seqfeature with the
allele name for downstream work. You could maybe add a SeqIO writer
(Jason's suggestion) or just add a helper sub to Bio::SeqUtils for
converting any variation data in a Bio::SeqI into the string you
want, based on allele(s) you specify and the Seq object.
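
A minimal sketch of the helper idea, assuming plain hashes in place of real SeqFeature objects so that it runs without BioPerl. The sub name variation_string and the hash keys (start, end, allele, seq) are made up for illustration and are not part of Bio::SeqUtils. It also assumes the variant spans do not overlap.

```perl
use strict;
use warnings;

# Hypothetical helper: render a reference sequence with the requested alleles
# written in [x/y] notation. Each feature is a plain hash standing in for a
# SeqFeature: 1-based inclusive start/end on the reference, an allele name,
# and that allele's bases ('' means the bases are missing).
sub variation_string {
    my ($ref, $features, @allele_names) = @_;

    my %by_span;    # "start-end" => { allele name => allele bases }
    for my $f (@$features) {
        $by_span{"$f->{start}-$f->{end}"}{ $f->{allele} } = $f->{seq};
    }

    my $out = $ref;
    # Work from the highest start down so earlier coordinates stay valid.
    my @spans = sort { $b->[0] <=> $a->[0] } map { [ split /-/ ] } keys %by_span;
    for my $span (@spans) {
        my ($start, $end) = @$span;
        my $len       = $end - $start + 1;
        my $ref_bases = substr($ref, $start - 1, $len);
        my $alleles   = $by_span{"$start-$end"};
        # An allele with no feature at this span keeps the reference bases.
        my @calls = map {
            my $s = exists $alleles->{$_} ? $alleles->{$_} : $ref_bases;
            length($s) ? $s : '_';
        } @allele_names;
        substr($out, $start - 1, $len, '[' . join('/', @calls) . ']');
    }
    return $out;
}

my @features = (
    { start => 3, end => 3, allele => 'maternal', seq => 'g' },
    { start => 6, end => 6, allele => 'paternal', seq => 't' },
);
print variation_string('acatgcaa', \@features, 'maternal', 'paternal'), "\n";
# prints ac[g/a]tg[c/t]aa
```

With real SeqFeatures the allele name would presumably live in a tag on each feature, and the loop would read start, end and tag values from the feature objects instead of hash keys.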
While working on Location stuff, I noticed this is how variations are
represented in normal GenBank files, using the primary feature tag of
'variation' or 'misc_difference' (I think there are a few others):
http://tinyurl.com/22coeq
Using SeqFeatures also allows for deletions/insertions:
http://tinyurl.com/2e2egw
http://tinyurl.com/277a6g
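
To make the deletion/insertion case concrete, here is a small sketch that applies variation records to a reference sequence. It loosely mirrors how I understand the replace qualifier on GenBank variation features to work (an empty replacement deletes the span, and an insertion sits between two adjacent bases). The sub apply_variations and the hash keys are hypothetical, plain hashes stand in for SeqFeatures, and the records are assumed not to overlap.

```perl
use strict;
use warnings;

# Hypothetical helper: apply variation records to a reference and return the
# resulting allele sequence. Each record is a plain hash:
#   start, end : 1-based inclusive span on the reference
#   replace    : replacement bases ('' deletes the span)
#   between    : true for an insertion between start and end (like 5^6)
sub apply_variations {
    my ($ref, @vars) = @_;
    my $seq = $ref;
    # Apply from the highest start down so earlier coordinates are not shifted.
    for my $v (sort { $b->{start} <=> $a->{start} } @vars) {
        if ($v->{between}) {
            # Insert after position 'start' without consuming any base.
            substr($seq, $v->{start}, 0, $v->{replace});
        }
        else {
            my $len = $v->{end} - $v->{start} + 1;
            substr($seq, $v->{start} - 1, $len, $v->{replace});
        }
    }
    return $seq;
}

my $ref      = 'acatgcaa';
my $deleted  = apply_variations($ref, { start => 3, end => 4, replace => '' });
my $inserted = apply_variations($ref,
    { start => 5, end => 6, replace => 'tt', between => 1 });
print "$deleted\n";     # prints acgcaa
print "$inserted\n";    # prints acatgttcaa
```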
chris
On Jan 18, 2007, at 1:39 AM, Heikki Lehvaslaiho wrote:
> Marian,
>
> Do not try to cram too much into one class. BIC format is apparently a
> useful shorthand for some cases, but representing that in memory using
> objects in an expandable way is another thing.
>
> Your example below describes an individual's diploid genotype. Putting
> that into one sequence object is not a good idea. The way to model that
> is to have a reference sequence and then define an individual that has
> that sequence in a diploid or haploid (sex chromosomes) setting and list
> the alleles that person has in the reference sequence coordinate system.
> You might be interested in separating the alleles by chromosomes, too.
>
> Representing, reporting and modelling genotype information is something
> that has been of interest to me and a group of other people for some
> time. An early draft of a web site about a genotyping standard can be
> found here: http://www.openpml.org. It is being worked on heavily and
> more material will be added soon.
>
> -Heikki
>
>
> On Thursday 18 January 2007 01:31, marian thieme wrote:
>> Jason, you're right, it is probably some kind of abuse of the BioPerl
>> API, but it's a very quick way to get results, because I don't need to
>> cope with replacing substrings. On the other hand, if you are using the
>> Root.pm class in other scripts, it can probably cause some malfunction
>> (including a crash of your application). It is probably no big matter to
>> provide a filestream IO class which reads/writes the sequence and
>> translates to/from IUPAC chars. But one thing I don't see at present:
>> how would you represent more complex mutations, such as a change of a
>> few bases? OK, here we could represent each position separately. But
>> what about a mutation? I don't know whether there is an IUPAC char
>> which handles a mutation! Let's consider this case:
>> 1.) the original base at some position is an a
>> 2.) some individual has an a at one locus and the other is missing that
>> base, or perhaps both loci are missing the a. So via BIC notation you
>> can write [a/_] or [_/_] respectively. Any idea how to resolve this?
>>
>> Marian
>>
>>> From: Jason Stajich <jason at bioperl.org>
>>> To: marian thieme <marian.thieme at lycos.de>
>>> Subject: Re: [Bioperl-l] Bio::Root::Root/Bio::LiveSeq::Mutation
>>> Date: Wed, 17 Jan 2007 08:40:45 -0800
>>>
>>> I think you are ignoring the fact that errors are thrown for a
>>> reason, not just to annoy you.
>>>
>>> Why not store the data in Bio::Seq objects as IUPAC ambiguity codes
>>> and write a special writer class in Bio::SeqIO which converts the
>>> ambiguity codes to your specified encoding?
>>> There are examples of how to write your own Bio::SeqIO class in the
>>> HOWTO tutorials when we talk about extending the toolkit. There is
>>> also all the code to decompose an ambiguity code into the bases it
>>> represents.
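
Jason's suggestion (keep IUPAC ambiguity codes in the sequence and convert them only on output) could look roughly like the sketch below. It is plain Perl with a hand-written table so that it runs on its own; the sub name iupac_to_bracket is made up, only the two-base codes are handled, and a real Bio::SeqIO writer would wrap something like this rather than be replaced by it.

```perl
use strict;
use warnings;

# Two-base IUPAC nucleotide ambiguity codes and the bases they stand for.
my %IUPAC = (
    r => [qw(a g)], y => [qw(c t)], s => [qw(c g)],
    w => [qw(a t)], k => [qw(g t)], m => [qw(a c)],
);

# Hypothetical converter: expand each two-base ambiguity code to [x/y],
# leaving unambiguous bases and any other symbols untouched.
sub iupac_to_bracket {
    my ($seq) = @_;
    $seq =~ s{([ryswkm])}{ '[' . join('/', @{ $IUPAC{ lc $1 } }) . ']' }gie;
    return $seq;
}

print iupac_to_bracket('acrtgy'), "\n";    # prints ac[a/g]tg[c/t]
```

This only covers substitutions between two bases. A missing base, as in [a/_], has no ambiguity code to start from, so that case needs a different representation.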
>>>
>>>
>>> -jason
>>>
>>> On Jan 16, 2007, at 2:20 AM, marian thieme wrote:
>>>> Hi, as I told this list some time ago, I want to output
>>>> heterozygous DNA sequences of different individuals.
>>>> We need to output variations in the following manner:
>>>> [a/g] if there is a locus where one allele has an "a" and the other
>>>> has a "g". (Also known as BIC db format or something like this.)
>>>> My approach is to use the Bio::LiveSeq::Mutation (class?) to
>>>> change the specific position in the sequence.
>>>>
>>>>
>>>> Bio::SeqUtils->mutate($seqobj, Bio::LiveSeq::Mutation->new(
>>>>     -seq    => "[a/g]",
>>>>     -seqori => $seqori,
>>>>     -pos    => $pos,
>>>>     -len    => $length));
>>>>
>>>> But unfortunately this raises an exception saying that some unexpected
>>>> chars occur. Hence I went into the code of Root.pm and made a
>>>> small change: commenting out line 359 in Root.pm:
>>>>
>>>> if( $ERRORLOADED ) {
>>>>     # print STDERR " Calling Error::throw\n\n";
>>>>
>>>>     # Enable re-throwing of Error objects.
>>>>     # If the error is not derived from Bio::Root::Exception,
>>>>     # we can't guarantee that the Error's value was set properly
>>>>     # and, ipso facto, that it will be catchable from an eval{}.
>>>>     # But chances are, if you're re-throwing non-Bio::Root::Exceptions,
>>>>     # you're probably using Error::try(), not eval{}.
>>>>     # TODO: Fix the MSG: line of the re-thrown error. Has an extra line
>>>>     # containing the '----- EXCEPTION -----' banner.
>>>>     if( ref($args[0])) {
>>>>         if( $args[0]->isa('Error')) {
>>>>             my $class = ref $args[0];
>>>>             $class->throw( @args );
>>>>         } else {
>>>>             my $text .= "\nWARNING: Attempt to throw a non-Error.pm object: " . ref$args[0];
>>>>             my $class = "Bio::Root::Exception";
>>>>             $class->throw( '-text' => $text, '-value' => $args[0] );
>>>>         }
>>>>     } else {
>>>>         $class ||= "Bio::Root::Exception";
>>>>
>>>>         my %args;
>>>>         if( @args % 2 == 0 && $args[0] =~ /^-/ ) {
>>>>             %args = @args;
>>>>             $args{-text} = $text;
>>>>             $args{-object} = $self;
>>>>         }
>>>>
>>>>         (Line 359:) #$class->throw( scalar keys %args > 0 ? %args : @args ); # (%args || @args) puts %args in scalar context!
>>>>     }
>>>> }
>>>>
>>>>
>>>> After I altered this line, everything works fine. But I know that
>>>> this can at best be considered a workaround.
>>>>
>>>> 2 questions:
>>>>
>>>> Do you think it is worth providing some class which is natively
>>>> able to cope with that matter?
>>>> Do I need to expect some unwanted behavior from some scripts or
>>>> classes?
>>>>
>>>> Regards,
>>>> Marian
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> --
>>> Jason Stajich
>>> Miller Research Fellow
>>> University of California, Berkeley
>>> lab: 510.642.8441
>>> http://pmb.berkeley.edu/~taylor/people/js.html
>
> --
> ______ _/ _/_____________________________________________________
> _/ _/
> _/ _/ _/ Heikki Lehvaslaiho heikki at_sanbi _ac _za
> _/_/_/_/_/ Associate Professor skype: heikki_lehvaslaiho
> _/ _/ _/ SANBI, South African National Bioinformatics Institute
> _/ _/ _/ University of Western Cape, South Africa
> _/ Phone: +27 21 959 2096 FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign