[Bioperl-l] Another GuessSeqFormat question
Tim Erwin
taerwin at tpg.com.au
Wed Aug 17 19:18:33 EDT 2005
Thanks, Heikki. I am working with several kinds of IO objects (AlignIO,
SeqIO and SearchIO), so what I am trying to do is guess the format of
any input file and then use the appropriate parser.
i.e., if I have an unknown file output.out, I want to guess the format
and then pick the appropriate IO parser to use. Is there a way to do
this, or should I just test all the IO parsers with an eval block?
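
The eval approach mentioned above could be sketched roughly as follows.
This is only an illustration: try_parsers() is a hypothetical helper, not
a BioPerl function, and the candidate order is an assumption. It relies on
the usual next_seq(), next_aln() and next_result() methods of Bio::SeqIO,
Bio::AlignIO and Bio::SearchIO. A parser that manages to read one record
is not proof that the format was identified correctly.

```perl
use strict;
use warnings;

# Candidate IO classes paired with the method that returns the next record.
# The order is an assumption; put the most likely parser first.
my @CANDIDATES = (
    [ 'Bio::SeqIO'    => 'next_seq'    ],
    [ 'Bio::AlignIO'  => 'next_aln'    ],
    [ 'Bio::SearchIO' => 'next_result' ],
);

# try_parsers() is a hypothetical helper, not part of BioPerl.
# Returns (class, method, first_record) for the first candidate that can
# read one record from $file, or an empty list if none of them can.
sub try_parsers {
    my ($file, @candidates) = @_;
    for my $candidate (@candidates) {
        my ($class, $method) = @$candidate;
        my $record = eval {
            (my $path = "$class.pm") =~ s{::}{/}g;
            require $path;                        # load the class on demand
            my $in = $class->new(-file => $file); # may die on a bad format
            $in->$method();                       # may die or return undef
        };
        return ($class, $method, $record) if defined $record;
    }
    return;
}
```

It might be called as
my ($class, $method, $first) = try_parsers('output.out', @CANDIDATES);
where an empty list means that no candidate could read the file.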
Regards,
Tim
On Wed, 2005-08-17 at 10:03 +0100, Heikki Lehvaslaiho wrote:
>
> Tim,
>
> Bio::Tools::GuessSeqFormat is not meant to be used directly. It is called
> automatically by the constructor (new() method) of Bio::SeqIO:
>
>     my $format = $param{'-format'} ||
>         $class->_guess_format( $param{-file} || $ARGV[0] );
>
>     if( ! $format ) {
>         if ($param{-file}) {
>             $format = Bio::Tools::GuessSeqFormat->new(-file => $param{-file}||
>                 $ARGV[0] )->guess;
>         } elsif ($param{-fh}) {
>             $format = Bio::Tools::GuessSeqFormat->new(-fh => $param{-fh}||
>                 $ARGV[0] )->guess;
>         }
>     }
>     # ... code removed
>     return "Bio::SeqIO::$format"->new(@args);
>
> The logic from the above code is as follows:
>
> 1. _guess_format() tries to determine the format of the file based on the
> filename extension.
>
> 2. Only if that fails does it look into the file/stream to guess the format
> using the Bio::Tools::GuessSeqFormat code.
>
> 3. The returned object is not a Bio::SeqIO but a Bio::SeqIO::$format object,
> which has the correct next_seq() and write_seq() methods. You can therefore
> use ref($seqobject) to find out what parser is being used.
>
>
>
> The standard code for doing this should contain all the automation needed:
>
>     foreach my $inputfilename (@all_files) {
>         my $in = Bio::SeqIO->new(-file => $inputfilename);
>         while ( my $seq = $in->next_seq() ) {
>             # do something
>         }
>     }
>
>
> Yours,
> -Heikki
>
>
> On Wednesday 17 August 2005 08:15, Tim Erwin wrote:
> > Hi,
> >
> > Is there a way to determine which parser to use based on the guess from
> > Bio::Tools::GuessSeqFormat without hard coding a hash? I am interested
> > in parsing and storing various files to a database.
> >
> > I was wondering if it is a good idea to make some extra functions so that
> > files could be parsed automatically.
> >
> > e.g. for a FASTA file
> >
> > my $obj = new Bio::Tools::GuessSeqFormat( -file => $filename );
> > my $format = $obj->guess;
> > my $parser = $obj->parser; #RETURNS Bio::SeqIO
> > my $next_method = $obj->next_method; #RETURNS next_seq
> > my $write_method = $obj->write_method; #RETURNS write_seq
> >
> > #PARSE FILE
> > my $infile = new $parser(-file => $filename, -format => $format);
> > while (my $result = $infile->$next_method) {
> >
> >     #DO STUFF HERE
> >     #ADD $result TO DATABASE
> >
> > }
> >
> > Perhaps there is a better way to do this? Any suggestions would be great.
> >
> > Regards,
> >
> > Tim