[Bioperl-l] Another GuessSeqFormat question

Tim Erwin taerwin at tpg.com.au
Wed Aug 17 19:18:33 EDT 2005


Thanks, Heikki. I am working with several different IO classes, such as
AlignIO, SeqIO and SearchIO. What I am trying to do is guess the format
of an arbitrary input file and then use the appropriate parser.

For example, if I have an unknown file output.out, I want to guess its
format and then work out which IO parser to use. Is there a way to do
this, or should I just test all the IO parsers with an eval block?
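
A minimal sketch of the eval-block approach I have in mind is below. It
assumes each IO class can be constructed with just -file and that reading
one record is an acceptable test. The helper name
open_with_first_working_parser and the candidate table are hypothetical and
not part of BioPerl; the class/method pairs are the usual ones for SeqIO,
AlignIO and SearchIO.

```perl
use strict;
use warnings;

# Candidate parsers: [ class, method that returns the next record ].
# The order is an assumption; put the most likely class first.
my @CANDIDATES = (
    [ 'Bio::SeqIO',    'next_seq'    ],
    [ 'Bio::AlignIO',  'next_aln'    ],
    [ 'Bio::SearchIO', 'next_result' ],
);

# Hypothetical helper: returns ($parser, $method, $first_record) for the
# first candidate that loads, constructs and yields one record, or an
# empty list if none of them does.
sub open_with_first_working_parser {
    my ($file, @candidates) = @_;
    @candidates = @CANDIDATES unless @candidates;

    for my $candidate (@candidates) {
        my ($class, $method) = @$candidate;
        my ($parser, $first);
        my $ok = eval {
            # Load the class on demand unless it is already available.
            (my $path = "$class.pm") =~ s{::}{/}g;
            require $path unless $class->can('new');
            $parser = $class->new(-file => $file);
            $first  = $parser->$method;
            defined $first;
        };
        return ($parser, $method, $first) if $ok;
    }
    return;
}

# Usage (the first record has already been read, so process it first):
#   my ($in, $next, $rec) = open_with_first_working_parser('output.out');
#   while (defined $rec) {
#       # add $rec to the database
#       $rec = $in->$next;
#   }
```

A successful parse is only a hint: a lenient parser may accept a file that
really belongs to another format, so the candidate order matters.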

Regards,

Tim


On Wed, 2005-08-17 at 10:03 +0100, Heikki Lehvaslaiho wrote:
> 
> Tim,
> 
> Bio::Tools::GuessSeqFormat is not meant to be used directly. It is called 
> automatically by the constructor (new() method) of Bio::SeqIO:
> 
>  my $format = $param{'-format'} ||
>      $class->_guess_format( $param{-file} || $ARGV[0] );
> 
>  if( ! $format ) {
>      if ($param{-file}) {
>          $format = Bio::Tools::GuessSeqFormat->new(-file => $param{-file} ||
>                                                    $ARGV[0] )->guess;
>      } elsif ($param{-fh}) {
>          $format = Bio::Tools::GuessSeqFormat->new(-fh => $param{-fh} ||
>                                                    $ARGV[0] )->guess;
>      }
>  }
>  # ... code removed
>  return "Bio::SeqIO::$format"->new(@args);
> 
> The logic from the above code is as follows:
> 
> 1. _guess_format() tries to determine the format of the file based on the 
> filename extension.
> 
> 2. Only if that fails does it look into the file/stream to guess the format 
> using the Bio::Tools::GuessSeqFormat code.
> 
> 3. The returned object is not a Bio::SeqIO but a Bio::SeqIO::$format object, 
> which has the correct next_seq() and write_seq() methods. You can therefore 
> use ref($seqobject) to find out what parser is being used.
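
The ref() idea in point 3 can be sketched as follows. This is only an
illustration: the helper name parser_family_and_format is hypothetical, and
it assumes the concrete parser class is always named "<IO class>::<format>",
as in the Bio::SeqIO::$format line quoted above.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: split the class of a parser object into the IO
# family and the format name, e.g. Bio::SeqIO::fasta gives
# ('Bio::SeqIO', 'fasta'). Returns an empty list for non-objects.
sub parser_family_and_format {
    my ($parser) = @_;
    my $class = blessed($parser) or return;
    my ($family, $format) = $class =~ /^(.*)::([^:]+)$/;
    return defined $family ? ($family, $format) : ($class, undef);
}

# Usage, assuming BioPerl is installed:
#   my $in = Bio::SeqIO->new(-file => 'output.out');
#   my ($family, $format) = parser_family_and_format($in);
```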
> 
> 
> 
> The standard code for doing this should contain all the automation needed:
>   
> foreach my $inputfilename (@all_files) {
>     my $in  = Bio::SeqIO->new(-file => $inputfilename);
>     while ( my $seq = $in->next_seq() ) {
>         # do something
>     }
> }
> 
> 
> Yours,
>        -Heikki
> 
> 
> On Wednesday 17 August 2005 08:15, Tim Erwin wrote:
> > Hi,
> >
> > Is there a way to determine which parser to use based on the guess from
> > Bio::Tools::GuessSeqFormat without hard coding a hash? I am interested
> > in parsing and storing various files to a database.
> >
> > I was wondering if it would be a good idea to add some extra functions so
> > that files could be parsed automatically.
> >
> > For example, for a FASTA file:
> >
> > my $obj = Bio::Tools::GuessSeqFormat->new( -file => $filename );
> > my $format = $obj->guess;
> > my $parser = $obj->parser;              #RETURNS Bio::SeqIO
> > my $next_method = $obj->next_method;    #RETURNS next_seq
> > my $write_method = $obj->write_method;  #RETURNS write_seq
> >
> > #PARSE FILE
> > my $infile = $parser->new(-file => $filename, -format => $format);
> > while (my $result = $infile->$next_method) {
> >
> >   #DO STUFF HERE
> >   #ADD $result TO DATABASE
> >
> > }
> >
> > Perhaps there is a better way to do this? Any suggestions would be great.
> >
> > Regards,
> >
> > Tim
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 