[bioperl-l] changed Bio::SeqIO::scf

Jason Stajich jason@cgt.mc.duke.edu
Thu, 11 Apr 2002 11:16:18 -0400 (EDT)


Eckhard-

Thanks for sharing your solution!  I'll let Chad Matsalla - the scf
module author decide if/how he might like to incorporate these into the
code.

-jason

On Thu, 11 Apr 2002 ecky@e-lehmann.de wrote:

> Hi,
>
> I made some changes to the Bio::SeqIO::scf module for myself because I
> need to write base variants to scf files.
>
> When I now pass a SeqWithQuality object to the
> Bio::SeqIO::scf::write_seq() method that contains a sequence string with
> IUPAC coded letters, the .scf file is written so that the traces overlap
> at every point where a variant (a M,R,W,S,T,Y,K,V,H,D or B) appears in
> the sequence string. The scf module of bioperl 1.0 wasn't able to write
> .scf files this way before.
> If I run phredPhrap and consed on these .scf files, consed will
> recognise heterocygote variants at the special mentioned positions.
>
> For the changes I took bioperl 1.0, not an cvs version, and it is the
> very first time I changed resp. extended the bioperl source code.
> So my question: is it worth to take this little change over to the
> official bioperl cvs version? If not, don't regard this mail...
>
> The code for the changed scf module can be found at
> http://e-lehmann.de/perl/scf.pm
> At the end of this mail I explain shortly what I have done to the
> module.
>
>
> Eckhard
>
>
>
> Here is what I changed:
>
> sub _set_binary_tracesbases
> There is a section in this method that contains 5 times almost the same
> if statement, which begins with:
> if ($current_base eq "A"){ ... }
> It is for setting up the trace-peaks for every base. I took all those if
> statements out and created a extra sub routine:
> ----
> sub _set_base_traces {
>   my ($self, $base, $place_at, $peak_qual, $is_variant) = @_;
>   $base =~ tr/A-Z/a-z/;
>
>   my $half_ramp = int($self->{'text'}->{'ramp_width'}/2);
>   my $ramp_pos = $place_at - $half_ramp;
>   for (my $i = 0; $i < $self->{'text'}->{'ramp_width'};  $i++) {
>     $self->{'text'}->{"samples_$base"}[$ramp_pos+$i] = $peak_qual *
> $self->{'text'}->{'ramp'}[$i];
>   }
>
>   if ($base eq "a") {push
> @{$self->{'text'}->{'v2_bases'}},($place_at+1,$peak_qual,0,0,0,$base,0,0,0);}
>   elsif ($base eq "c") {push
> @{$self->{'text'}->{'v2_bases'}},($place_at+1,0,$peak_qual,0,0,$base,0,0,0);}
>   elsif ($base eq "g") {push
> @{$self->{'text'}->{'v2_bases'}},($place_at+1,0,0,0,$peak_qual,$base,0,0,0);}
>   elsif ($base eq "t") {push
> @{$self->{'text'}->{'v2_bases'}},($place_at+1,$peak_qual,0,0,0,$base,0,0,0);}
>
>   push @{$self->{'text'}->{"v3_base_accuracy_$base"}},$peak_qual unless
> $is_variant;
>
>   if (!$is_variant) {
>     my $ba = "acgt";
>     $ba =~ s/$base//;
>     my @baarray = split(//,$ba);
>
>     my $b;
>     foreach $b(@baarray) {
>       push @{$self->{'text'}->{"v3_base_accuracy_$b"}},0;
>     }
>   }
> }
> ---------
>
> furthermore I call the routine in _set_binary_tracesbases() with if
> statements for every (M,R,W,S,T,Y,K,V,H,D,B) like this:
> ----------
> if ($current_base eq "M") {
>       #modify the peak_quality for the ambiguous base at this point
>       #this time it is 50% subtracted from the peak_quality for the
> major allele base
>       #the line is very similar in all the following if statements...
>       my $mod_peak_qual = int ($peak_quality * (1 - 0.5));
>
>       $self->_set_base_traces("A", $place_base_at, $peak_quality);
>       $self->_set_base_traces("C", $place_base_at, $mod_peak_qual, 1);
>     }
> ---------
> and at least
> ---------
> #no heterocygote variant at this position in the sequence
>     else {$self->_set_base_traces($current_base, $place_base_at,
> $peak_quality);}
> ---------
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>

-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu