[Bioperl-l] Mutation IO

Mauricio Herrera Cuadra arareko at campus.iztacala.unam.mx
Thu Jan 18 16:25:22 UTC 2007


Folks,

A bit off-topic here: it would be better if we posted full URLs instead of
shortened ones. Shortened URLs can expire, which leaves dead links in the
mailing list archive and makes it less useful for future reference.

Regards,
Mauricio.

Chris Fields wrote:
> I haven't dabbled with the Mutation/Variation stuff, but couldn't one  
> use a reference sequence (as Heikki suggests) and then use  
> SeqFeatures for the alleles?  You could tag the seqfeature with the  
> allele name for downstream work.  You could maybe add a SeqIO writer  
> (Jason's suggestion) or just add a helper sub to Bio::SeqUtils for  
> converting any variation data in a Bio::SeqI into the string you  
> want, based on allele(s) you specify and the Seq object.
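The helper sub idea above could look roughly like the following. This is only a standalone sketch in plain Perl, not existing Bio::SeqUtils code: the name render_genotype and the hash keys start, end and alleles are hypothetical, and plain hashes stand in for SeqFeature objects. Coordinates are assumed to be 1-based and inclusive on the reference sequence, and an empty allele is written as '_' to mark a deletion.

```perl
use strict;
use warnings;

# Hypothetical helper: render a reference sequence with diploid variations
# written as [allele1/allele2]. Each variation is a plain hash with 1-based,
# inclusive start/end in reference coordinates. An empty allele (a deletion)
# is written as '_'.
sub render_genotype {
    my ($ref, @variations) = @_;
    my $out  = '';
    my $next = 1;    # next reference position not yet written
    for my $var (sort { $a->{start} <=> $b->{start} } @variations) {
        die "overlapping variations at $var->{start}\n"
            if $var->{start} < $next;
        # copy the unchanged stretch before this variation
        $out .= substr($ref, $next - 1, $var->{start} - $next);
        my @alleles = map { length($_) ? $_ : '_' } @{ $var->{alleles} };
        $out .= '[' . join('/', @alleles) . ']';
        $next = $var->{end} + 1;
    }
    # copy whatever is left of the reference after the last variation
    return $out . substr($ref, $next - 1);
}

my $ref = 'acgtacgt';
print render_genotype(
    $ref,
    { start => 3, end => 3, alleles => ['g', 'a'] },    # heterozygous SNP
    { start => 5, end => 5, alleles => ['a', ''] },     # one copy deleted
), "\n";    # prints ac[g/a]t[a/_]cgt
```

Keeping the reference string untouched and applying the variations only when rendering means the same variation list can be reused for any output notation.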
> 
> While working on Location stuff, I noticed this is how variations are  
> represented in normal GenBank files, using the primary feature tag of  
> 'variation' or 'misc_difference' (I think there are a few others):
> 
> http://tinyurl.com/22coeq
> 
> Using SeqFeatures also allows for deletions/insertions:
> 
> http://tinyurl.com/2e2egw
> 
> http://tinyurl.com/277a6g
> 
> 
> chris
> 
> On Jan 18, 2007, at 1:39 AM, Heikki Lehvaslaiho wrote:
> 
>> Marian,
>>
>> Do not try to cram too much into one class. BIC format is apparently a
>> useful shorthand for some cases, but representing that in memory using
>> objects in an extensible way is another matter.
>>
>> Your example below describes an individual's diploid genotype. Putting
>> that into one sequence object is not a good idea. The way to model it is
>> to have a reference sequence, then define an individual that carries that
>> sequence in a diploid or haploid (sex chromosomes) setting, and list the
>> alleles that person has in the reference sequence coordinate system. You
>> might also want to separate the alleles by chromosome.
>>
>> Representing, reporting and modelling genotype information is something
>> that has interested me and a group of other people for some time. An
>> early draft of a web site about a genotyping standard can be found here:
>> http://www.openpml.org. It is being worked on heavily and more material
>> will be added soon.
>>
>> 	-Heikki
>>
>>
>> On Thursday 18 January 2007 01:31, marian thieme wrote:
>>> Jason, you're right, it is probably some kind of abuse of the BioPerl
>>> API, but it is a very quick way to get results, because I don't need to
>>> deal with replacing substrings. On the other hand, if you use the
>>> modified Root.pm in other scripts, it can probably cause malfunctions
>>> (including crashes of your application). It is probably no big deal to
>>> provide a file stream IO class that reads and writes the sequence and
>>> translates to and from IUPAC characters. But one thing I don't see at
>>> present: how would you represent more complex mutations, such as a
>>> change of a few bases? OK, here we could represent each position
>>> separately. But what about a mutation such as a deletion? I don't know
>>> whether there is an IUPAC character for that. Let's consider this case:
>>> 1.) the original base at some position is a
>>> 2.) some individual has an a on one allele and is missing that base on
>>> the other, or perhaps both alleles are missing the a. In BIC notation
>>> you can write [a/_] or [_/_], respectively. Any idea how to resolve this?
>>>
>>> Marian
>>>
>>>> From: Jason Stajich <jason at bioperl.org>
>>>> To: marian thieme <marian.thieme at lycos.de>
>>>> Subject: Re: [Bioperl-l] Bio::Root::Root/Bio::LiveSeq::Mutation
>>>> Date: Wed, 17 Jan 2007 08:40:45 -0800
>>>>
>>>> I think you are ignoring the fact that errors are thrown for a
>>>> reason, not just to annoy you.
>>>>
>>>> Why not store the data in Bio::Seq objects as IUPAC ambiguity codes
>>>> and write a special writer class in Bio::SeqIO which converts the
>>>> ambiguity codes to your specified encoding?
>>>> There are examples of how to write your own Bio::SeqIO class in the
>>>> HOWTO tutorials when we talk about extending the toolkit. There is
>>>> also all the code to decompose an ambiguity code into the bases it
>>>> represents.
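The conversion step suggested above can be sketched in plain Perl. This is a standalone illustration, not an actual Bio::SeqIO writer class: the helper name iupac_to_brackets and the lookup table are hypothetical, and only the six standard two-base IUPAC codes (r, y, s, w, k, m) are expanded.

```perl
use strict;
use warnings;

# Two-base IUPAC ambiguity codes and the bases they stand for.
my %IUPAC_PAIR = (
    r => [qw(a g)], y => [qw(c t)], s => [qw(c g)],
    w => [qw(a t)], k => [qw(g t)], m => [qw(a c)],
);

# Hypothetical helper: rewrite each two-base ambiguity code as [x/y] and
# leave every other character as it is. The sequence is lowercased first.
sub iupac_to_brackets {
    my ($seq) = @_;
    my $out = '';
    for my $char (split //, lc $seq) {
        my $bases = $IUPAC_PAIR{$char};
        $out .= $bases ? '[' . join('/', @$bases) . ']' : $char;
    }
    return $out;
}

print iupac_to_brackets('acrtgy'), "\n";    # prints ac[a/g]tg[c/t]
```

Three-base codes and n pass through unchanged in this sketch, and a deletion on one allele has no IUPAC code at all, so this approach alone cannot produce something like [a/_].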
>>>>
>>>>
>>>> -jason
>>>>
>>>> On Jan 16, 2007, at 2:20 AM, marian thieme wrote:
>>>>> Hi, as I mentioned on this list some time ago, I want to output
>>>>> heterozygous DNA sequences of different individuals.
>>>>> We need to output variations in the following manner:
>>>>> [a/g] if there is a locus where one allele has an "a" and the other
>>>>> has a "g". (Also known as BIC db format or something like this.)
>>>>> My approach is to use the Bio::LiveSeq::Mutation class to
>>>>> change the specific position in the sequence.
>>>>>
>>>>>
>>>>> Bio::SeqUtils->mutate($seqobj, Bio::LiveSeq::Mutation->new(
>>>>>   -seq => "[a/g]",
>>>>>   -seqori => $seqori,
>>>>>   -pos => $pos,
>>>>>   -len => $length));
>>>>>
>>>>> But unfortunately this raises an exception saying that unexpected
>>>>> characters occur. Hence I went into the code of Root.pm and made a
>>>>> small change, commenting out line 359 in Root.pm:
>>>>>
>>>>> if( $ERRORLOADED ) {
>>>>> #       print STDERR "  Calling Error::throw\n\n";
>>>>>
>>>>>        # Enable re-throwing of Error objects.
>>>>>        # If the error is not derived from Bio::Root::Exception,
>>>>>        # we can't guarantee that the Error's value was set properly
>>>>>        # and, ipso facto, that it will be catchable from an eval{}.
>>>>>        # But chances are, if you're re-throwing non-Bio::Root::Exceptions,
>>>>>        # you're probably using Error::try(), not eval{}.
>>>>>        # TODO: Fix the MSG: line of the re-thrown error. Has an extra line
>>>>>        # containing the '----- EXCEPTION -----' banner.
>>>>>        if( ref($args[0])) {
>>>>>            if( $args[0]->isa('Error')) {
>>>>>                my $class = ref $args[0];
>>>>>                $class->throw( @args );
>>>>>            } else {
>>>>>                my $text .= "\nWARNING: Attempt to throw a non-Error.pm object: " . ref$args[0];
>>>>>                my $class = "Bio::Root::Exception";
>>>>>                $class->throw( '-text' => $text, '-value' => $args[0] );
>>>>>            }
>>>>>        } else {
>>>>>            $class ||= "Bio::Root::Exception";
>>>>>
>>>>>            my %args;
>>>>>            if( @args % 2 == 0 && $args[0] =~ /^-/ ) {
>>>>>                %args = @args;
>>>>>                $args{-text} = $text;
>>>>>                $args{-object} = $self;
>>>>>            }
>>>>>
>>>>> (Line 359:)   #$class->throw( scalar keys %args > 0 ? %args : @args ); # (%args || @args) puts %args in scalar context!
>>>>>        }
>>>>>    }
>>>>>
>>>>>
>>>>> After I altered this line, everything works fine. But I know that
>>>>> this can at best be considered a workaround.
>>>>>
>>>>> 2 questions:
>>>>>
>>>>> Do you think it is worth providing a class that is natively
>>>>> able to cope with this?
>>>>> Should I expect unwanted behavior from some scripts or
>>>>> classes?
>>>>>
>>>>> Regards,
>>>>> Marian
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>> --
>>>> Jason Stajich
>>>> Miller Research Fellow
>>>> University of California, Berkeley
>>>> lab: 510.642.8441
>>>> http://pmb.berkeley.edu/~taylor/people/js.html
>> -- 
>> ______ _/      _/_____________________________________________________
>>       _/      _/
>>      _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>     _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
>>    _/  _/  _/  SANBI, South African National Bioinformatics Institute
>>   _/  _/  _/  University of Western Cape, South Africa
>>      _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>> ___ _/_/_/_/_/________________________________________________________
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM



