[Bioperl-l] Problem with Bio::SeqIO opening gzipped files
Brian Osborne
brian_osborne at cognia.com
Fri Nov 14 11:43:54 EST 2003
Jason,
This is odd because the SeqIO HOWTO says you can do the trick that Zayed is
trying. From the HOWTO:
use Bio::SeqIO;
# get command-line arguments, or die with a usage statement
my $usage = "gzip2fasta.pl infile informat outfile\n";
my $infile = shift or die $usage;
my $informat = shift or die $usage;
my $outfile = shift or die $usage;
# create one SeqIO object to read in, and another to write out
my $seqin = Bio::SeqIO->new('-file' => "/usr/local/bin/gunzip $infile |",
                            '-format' => $informat);
my $seqout = Bio::SeqIO->new('-file' => ">$outfile",
                             '-format' => 'Fasta');
# write each entry in the input to the output file
while (my $inseq = $seqin->next_seq) {
    $seqout->write_seq($inseq);
}
exit;
Should I correct the HOWTO?
Brian O.
-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Jason Stajich
Sent: Friday, November 14, 2003 11:15 AM
To: Zayed Albertyn
Cc: bioperl-l at bioperl.org; Andreas Kahari
Subject: Re: [Bioperl-l] Problem with Bio::SeqIO opening gzipped files
When you pass in -file there is an implicit assumption that it is a
filename you are passing in, NOT a stream.
If you want to make this work, do this (you can replace 'zcat' with
'gunzip -c' if you prefer):
open(my $fh, "zcat $filename.gz |") or die "Cannot run zcat: $!";
my $seqio = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');
You can also pass multiple files to zcat:
open($fh, "zcat $file1 $file2 ... |");
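
The pipe-then--fh approach above could be wrapped up roughly as follows. This is only a sketch: open_gunzip_pipe is a hypothetical helper name, not part of BioPerl, and it assumes gunzip is on the PATH. It uses the list form of open so that file names are not interpreted by the shell.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): open a read pipe from
# "gunzip -c" over one or more files. The list form of open runs the
# command directly, so file names with spaces or shell metacharacters
# are passed through unchanged.
sub open_gunzip_pipe {
    my @files = @_;
    open(my $fh, '-|', 'gunzip', '-c', @files)
        or die "Cannot start gunzip: $!";
    return $fh;
}

# Usage, assuming BioPerl is installed and the file exists:
#   use Bio::SeqIO;
#   my $fh    = open_gunzip_pipe('gbest13.seq.gz');
#   my $seqio = Bio::SeqIO->new(-fh => $fh, -format => 'genbank');
#   while (my $seq = $seqio->next_seq) {
#       print $seq->display_id, "\n";
#   }
#   close($fh) or warn "gunzip exited with status $?";
```

Checking the return value of close on a pipe is worthwhile here, since that is where a failure of gunzip itself (for example, a corrupt archive) shows up.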
-jason
On Fri, 14 Nov 2003, Zayed Albertyn wrote:
> Hi Andreas
>
> Adding the -c switch still doesn't work. I still get the same error
> message. Input is the full path to the file e.g.
>
> /cip0/db/GENBANK/RELEASE137/gbest13.seq.gz == $path/$file
>
> I've written another script that does the normal
> open(FILE,"/bin/gunzip -c file1 |")
>
> and it works fine
>
> Z
>
> >
> > my $seq_in = Bio::SeqIO::new(
> > '-file' => "/bin/gunzip -c $path/$file|",
> > '-format' => 'genbank'
> > );
> >
> >
> >
> > --
> > |()()| Andreas Kähäri |(==)|
> > |)()(| EMBL, European Bioinformatics Institute |=)(=|
> > |()()| Wellcome Trust Genome Campus, Hinxton |(==)|
> > |)()(| Cambridge, CB10 1SD |=)(=|
> > |()()| United Kingdom |(==)|
> >
>
> -----------------------------------------------
> From: Zayed Albertyn
> Electric Genetics PTY Ltd
> Tel: +27 21 959 3645; Mobile: +2782 480 6097
> www.egenetics.com
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu