[bioperl-l] changed Bio::SeqIO::scf
Jason Stajich
jason@cgt.mc.duke.edu
Thu, 11 Apr 2002 11:16:18 -0400 (EDT)
Eckhard-
Thanks for sharing your solution! I'll let Chad Matsalla - the scf
module author decide if/how he might like to incorporate these into the
code.
-jason
On Thu, 11 Apr 2002 ecky@e-lehmann.de wrote:
> Hi,
>
> I made some changes to the Bio::SeqIO::scf module for myself because I
> need to write base variants to scf files.
>
> When I now pass a SeqWithQuality object to the
> Bio::SeqIO::scf::write_seq() method that contains a sequence string with
> IUPAC coded letters, the .scf file is written so that the traces overlap
> at every point where a variant (a M,R,W,S,T,Y,K,V,H,D or B) appears in
> the sequence string. The scf module of bioperl 1.0 wasn't able to write
> .scf files this way before.
> If I run phredPhrap and consed on these .scf files, consed will
> recognise heterocygote variants at the special mentioned positions.
>
> For the changes I took bioperl 1.0, not an cvs version, and it is the
> very first time I changed resp. extended the bioperl source code.
> So my question: is it worth to take this little change over to the
> official bioperl cvs version? If not, don't regard this mail...
>
> The code for the changed scf module can be found at
> http://e-lehmann.de/perl/scf.pm
> At the end of this mail I explain shortly what I have done to the
> module.
>
>
> Eckhard
>
>
>
> Here is what I changed:
>
> sub _set_binary_tracesbases
> There is a section in this method that contains 5 times almost the same
> if statement, which begins with:
> if ($current_base eq "A"){ ... }
> It is for setting up the trace-peaks for every base. I took all those if
> statements out and created a extra sub routine:
> ----
> sub _set_base_traces {
> my ($self, $base, $place_at, $peak_qual, $is_variant) = @_;
> $base =~ tr/A-Z/a-z/;
>
> my $half_ramp = int($self->{'text'}->{'ramp_width'}/2);
> my $ramp_pos = $place_at - $half_ramp;
> for (my $i = 0; $i < $self->{'text'}->{'ramp_width'}; $i++) {
> $self->{'text'}->{"samples_$base"}[$ramp_pos+$i] = $peak_qual *
> $self->{'text'}->{'ramp'}[$i];
> }
>
> if ($base eq "a") {push
> @{$self->{'text'}->{'v2_bases'}},($place_at+1,$peak_qual,0,0,0,$base,0,0,0);}
> elsif ($base eq "c") {push
> @{$self->{'text'}->{'v2_bases'}},($place_at+1,0,$peak_qual,0,0,$base,0,0,0);}
> elsif ($base eq "g") {push
> @{$self->{'text'}->{'v2_bases'}},($place_at+1,0,0,0,$peak_qual,$base,0,0,0);}
> elsif ($base eq "t") {push
> @{$self->{'text'}->{'v2_bases'}},($place_at+1,$peak_qual,0,0,0,$base,0,0,0);}
>
> push @{$self->{'text'}->{"v3_base_accuracy_$base"}},$peak_qual unless
> $is_variant;
>
> if (!$is_variant) {
> my $ba = "acgt";
> $ba =~ s/$base//;
> my @baarray = split(//,$ba);
>
> my $b;
> foreach $b(@baarray) {
> push @{$self->{'text'}->{"v3_base_accuracy_$b"}},0;
> }
> }
> }
> ---------
>
> furthermore I call the routine in _set_binary_tracesbases() with if
> statements for every (M,R,W,S,T,Y,K,V,H,D,B) like this:
> ----------
> if ($current_base eq "M") {
> #modify the peak_quality for the ambiguous base at this point
> #this time it is 50% subtracted from the peak_quality for the
> major allele base
> #the line is very similar in all the following if statements...
> my $mod_peak_qual = int ($peak_quality * (1 - 0.5));
>
> $self->_set_base_traces("A", $place_base_at, $peak_quality);
> $self->_set_base_traces("C", $place_base_at, $mod_peak_qual, 1);
> }
> ---------
> and at least
> ---------
> #no heterocygote variant at this position in the sequence
> else {$self->_set_base_traces($current_base, $place_base_at,
> $peak_quality);}
> ---------
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>
--
Jason Stajich
Duke University
jason@cgt.mc.duke.edu