[Bioperl-l] Bio::Tools::SeqStats to count bases

Adam Sjøgren asjo at koldfront.dk
Mon Feb 4 17:58:25 UTC 2013


On Mon, 4 Feb 2013 07:39:19 -0800 (PST), Slym wrote:

> #! /usr/bin/perl

> use Bio::Tools::SeqStats;
> use Bio::Seq;

It can be a good idea to add "use strict; use warnings;" to the top of
your script. At least two problems in your program would have been
caught by perl if you had.

> open (FILE, "seq.fasta");

Using (global) literal filehandles and the two parameter open() is
somewhat outdated, a more current way to do it could be:

  open my $fh, '<', 'seq.fasta';

> @array = <FILE>;

> # Removing first line of fasta

> shift (@array);
> $array = join('', at array);
> open (FILE2, ">>seq2.fasta");
> print FILE2 "$array";

Note that you are writing just the sequence to your seq2.fasta file
here, so the new file isn't really a fasta file.

> $seqobj = Bio::PrimarySeq->new( -file => "sekw2.fasta",
> - alphabet => 'dna',);

Bio::PrimarySeq doesn't take a '-file' parameter. Also, note that the
filename is different than before "sekw2" vs. "seq2"!

Either you should use Bio::SeqIO with a '-file' parameter, or you can
use Bio::PrimarySeq with a '-seq' parameter.

> my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);

> my $monomer_ref = $seq_stats->count_monomers();

> foreach $base (sort keys %$monomer_ref) {
> print "Liczba zasad typu ", $base," = ", $monomer_ref{$base},"\n";

Here you wanted $monomer_ref->{$base}, as %monomer_ref isn't mentioned
anywhere else.

> }

Here is a complete version of your script - I chose to use Bio::SeqIO -
that works:

  #!/usr/bin/perl

  use strict;
  use warnings;

  use Bio::SeqIO;
  use Bio::Tools::SeqStats;

  my $io=Bio::SeqIO->new(-file=>'seq.fasta', -alphabet=>'dna');
  my $seqobj=$io->next_seq; # Get the first sequence from the file

  my $seq_stats = Bio::Tools::SeqStats->new(-seq=>$seqobj);
  my $monomer_ref = $seq_stats->count_monomers();
  foreach my $base (sort keys %$monomer_ref) {
      print "Liczba zasad typu ", $base," = ", $monomer_ref->{$base},"\n";
  }

E.g.:

  $ cat seq.fasta
  >test
  aaaacccggt
  $ ./slym.pl 
  Liczba zasad typu A = 4
  Liczba zasad typu C = 3
  Liczba zasad typu G = 2
  Liczba zasad typu T = 1
  $ 


  Best regards,

    Adam

-- 
 "Grittings. Ma nam is Kahlfin."                              Adam Sjøgren
                                                         asjo at koldfront.dk




More information about the Bioperl-l mailing list