[Bioperl-l] Another GuessSeqFormat question

Heikki Lehvaslaiho heikki at ebi.ac.uk
Wed Aug 17 05:03:02 EDT 2005



Tim,

Bio::Tools::GuessSeqFormat is not meant to be used directly. It is called 
automatically by the constructor (new() method) of Bio::SeqIO:

 my $format = $param{'-format'} ||
     $class->_guess_format( $param{-file} || $ARGV[0] );

 if( ! $format ) {
     if ($param{-file}) {
         $format = Bio::Tools::GuessSeqFormat->new(-file => $param{-file}||
                                                    $ARGV[0] )->guess;
     } elsif ($param{-fh}) {
         $format = Bio::Tools::GuessSeqFormat->new(-fh => $param{-fh}||
                                                    $ARGV[0] )->guess;
     }
 }
 # ... code removed
 return "Bio::SeqIO::$format"->new(@args);

The logic from the above code is as follows:

1. _guess_format() tries to determine the format of the file based on the 
filename extension.

2. Only if that fails does the code look inside the file/stream and guess the 
format using Bio::Tools::GuessSeqFormat.

3. The returned object is not a Bio::SeqIO but a Bio::SeqIO::$format object, 
which has the correct next_seq() and write_seq() methods. You can therefore 
use ref($seqobject) to find out which parser is being used.
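
To make the order of those three steps concrete, here is a minimal, 
self-contained sketch in plain Perl. It is not the Bio::SeqIO or 
Bio::Tools::GuessSeqFormat implementation: the helper names 
(guess_by_extension, guess_by_content, guess_format, parser_class), the tiny 
extension table, and the three content patterns are all made up for 
illustration, and the real modules recognise many more formats.

```perl
use strict;
use warnings;

# Hypothetical, deliberately tiny extension table (assumption for this sketch).
my %EXT_TO_FORMAT = (
    fa   => 'fasta',   fasta => 'fasta',
    gb   => 'genbank', gbk   => 'genbank',
    embl => 'embl',
);

# Step 1: guess from the filename extension only.
sub guess_by_extension {
    my ($filename) = @_;
    return undef unless defined $filename && $filename =~ /\.([^.\/]+)$/;
    return $EXT_TO_FORMAT{ lc $1 };    # undef if the extension is unknown
}

# Step 2: guess from the first non-blank line of the content (very rough).
sub guess_by_content {
    my ($text) = @_;
    for my $line ( split /\n/, $text ) {
        next if $line =~ /^\s*$/;
        return 'fasta'   if $line =~ /^>/;
        return 'genbank' if $line =~ /^LOCUS\s/;
        return 'embl'    if $line =~ /^ID\s/;
        last;                          # first real line matched nothing
    }
    return undef;
}

# The extension wins; the content is consulted only if step 1 fails.
sub guess_format {
    my ($filename, $text) = @_;
    return guess_by_extension($filename) || guess_by_content($text);
}

# Step 3: the format name selects the format-specific parser class.
sub parser_class {
    my ($format) = @_;
    return "Bio::SeqIO::$format";
}
```

The point of the ordering is that the cheap check (the filename) is tried 
first and the file contents are only inspected as a fallback.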



The standard idiom below already contains all the automation you need:

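use Bio::SeqIO;

# @all_files holds the names of the input files
foreach my $inputfilename (@all_files) {
    # no -format given, so Bio::SeqIO guesses it
    my $in  = Bio::SeqIO->new(-file => $inputfilename);
    while ( my $seq = $in->next_seq() ) {
        # do something with $seq
    }
}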
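
If you also want to know which parser was picked for each file, a slightly 
longer sketch of the same loop might look like this. The assumptions are 
mine: the filenames come from @ARGV, open_seqio() is a hypothetical helper 
name, the eval is there in case your BioPerl version throws when it cannot 
open the file or settle on a format, and the printf line only stands in for 
your database code. For a FASTA file whose name ends in .fa, ref($in) should 
report Bio::SeqIO::fasta.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Open one file with format auto-detection. Returns the format-specific
# parser object, or undef if Bio::SeqIO->new() threw an exception.
sub open_seqio {
    my ($filename) = @_;
    my $in = eval { Bio::SeqIO->new(-file => $filename) };
    warn "Skipping $filename: $@" if !$in && $@;
    return $in;
}

# Assumption: the input files are named on the command line.
foreach my $inputfilename (@ARGV) {
    my $in = open_seqio($inputfilename) or next;

    # ref() gives the class of the parser that was selected
    print "$inputfilename: using ", ref($in), "\n";

    while ( my $seq = $in->next_seq() ) {
        # placeholder for the database step
        printf "%s\t%d\n", $seq->display_id, $seq->length;
    }
}
```

No hash mapping formats to parsers is needed; the class of the returned 
object already tells you which parser is in use.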


Yours,
       -Heikki


On Wednesday 17 August 2005 08:15, Tim Erwin wrote:
> Hi,
>
> Is there a way to determine which parser to use based on the guess from
> Bio::Tools::GuessSeqFormat without hard coding a hash? I am interested
> in parsing and storing various files to a database.
>
> I was wondering if it is a good idea to make some extra functions so that
> files could be parsed automatically.
>
> e.g. for a FASTA file
>
> my $obj = new Bio::Tools::GuessSeqFormat( -file => $filename );
> my $format = $obj->guess;
> my $parser = $obj->parser;              #RETURNS Bio::SeqIO
> my $next_method = $obj->next_method;    #RETURNS next_seq
> my $write_method = $obj->write_method;  #RETURNS write_seq
>
> #PARSE FILE
> my $infile = new $parser(-file => $filename, -format => $format);
> while (my $result = $infile->$next_method) {
>
>   #DO STUFF HERE
>   #ADD $result TO DATABASE
>
> }
>
> Perhaps there is a better way to do this? Any suggestions would be great.
>
> Regards,
>
> Tim

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_ebi _ac _uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambridge, CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

