[Bioperl-l] reading multiple swissprot records from a single file
Jason Stajich
jason.stajich at duke.edu
Wed Jan 5 16:44:57 EST 2005
SeqIO reads a stream of records, each terminated by '//', and it only
processes one record at a time. You just keep calling next_seq until it
gets to the end of the file or filehandle, at which point it returns
undef. That is why we typically construct the usage with a while loop.
For example, if you wanted to make a new file containing only your
keepers:
use Bio::SeqIO;

my $in  = Bio::SeqIO->new(-format => 'swiss', -file => 'sprot42.dat');
my $out = Bio::SeqIO->new(-format => 'swiss', -file => '>keepers.swiss');
while( my $seq = $in->next_seq ) {
    my $keep = 0;
    for my $feature ( $seq->get_SeqFeatures ) {
        # figure out if the feature criteria are met; if so, set $keep = 1;
    }
    if( $keep ) {
        $out->write_seq($seq);
    }
}
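The placeholder comment in the loop above could be filled in along these
lines. This is only a sketch: the helper names has_feature_tag and
write_keepers are made up for illustration, and the choice of 'TRANSMEM'
as the feature key to keep on is an assumption, not something from the
original question. It relies on each feature answering primary_tag, which
the Bio::SeqFeatureI interface provides.

```perl
use strict;
use warnings;

# Hypothetical helper: true if any feature on $seq has the given primary tag.
sub has_feature_tag {
    my ($seq, $tag) = @_;
    for my $feature ($seq->get_SeqFeatures) {
        return 1 if $feature->primary_tag eq $tag;
    }
    return 0;
}

# Hypothetical helper: copy every record carrying the wanted feature from
# the $in stream to the $out stream and return how many were written.
sub write_keepers {
    my ($in, $out, $tag) = @_;
    my $kept = 0;
    while (my $seq = $in->next_seq) {
        next unless has_feature_tag($seq, $tag);
        $out->write_seq($seq);
        $kept++;
    }
    return $kept;
}
```

With the $in and $out objects from the example above, usage would be
something like write_keepers($in, $out, 'TRANSMEM'). Substitute whatever
feature key (or a more elaborate test) matches your own criteria.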
If you wanted to use a filehandle instead of a file just use the -fh
parameter instead of -file. See Bio::Root::IO for more information.
This might be useful if you were streaming from zcat [zcat reads gzipped
files and produces a stream of the unzipped data].
# the trailing '|' is necessary to tell perl to pipe the output
open(FH, "zcat sprot42.dat.gz |") || die("could not open file with zcat");
my $in = Bio::SeqIO->new(-fh => \*FH, -format => 'swiss');
Or save the handle in a variable:

my $fh;
# the trailing '|' is necessary to tell perl to pipe the output
open($fh, "zcat sprot42.dat.gz |") || die("could not open file with zcat");
my $in = Bio::SeqIO->new(-fh => $fh, -format => 'swiss');
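If you prefer not to build a shell command string, the same pipe can be
opened with the list form of open. A minimal sketch, assuming gzip is on
your PATH ("gzip -dc" does the same job as zcat); the helper name
open_gunzip_pipe is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: open a read-only pipe from "gzip -dc FILE".
# The '-|' mode with a list of arguments runs the command without a shell,
# so spaces or odd characters in $path need no quoting.
sub open_gunzip_pipe {
    my ($path) = @_;
    open(my $fh, '-|', 'gzip', '-dc', $path)
        or die "could not start gzip for $path: $!";
    return $fh;
}
```

The returned handle can be passed straight to -fh, for example
Bio::SeqIO->new(-fh => open_gunzip_pipe('sprot42.dat.gz'), -format => 'swiss').
Note that a failure inside gzip itself (a missing or corrupt file, say)
only shows up when you close the handle, so check the return value of
close if that matters to you.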
-jason
On Jan 5, 2005, at 3:48 PM, Daily, Kenneth Michael wrote:
> I'm having trouble using bioperl to parse a file with multiple
> (thousands) of swissprot records in them. Is there a way to do this
> with SeqIO and such? The way I understand it, if I use a filehandle to
> read in the data, it still is expecting only one record in the file.
> Can I use a FH to read in a record, which ends with //, then put this
> variable into a SeqIO object to manipulate it? I need to look at each
> record and decide if I want to keep it based on the features it has. I
> have a program using standard parsing techniques but want to do this
> with bioperl if possible. Thanks for any help.
>
> Kenny Daily
> IU School of Informatics
> kmdaily at indiana dot edu
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/