[Bioperl-l] Out of memory errors running Bio::ASN1::EntrezGene against latest Homo_sapiens.ags file
Susan Wilson
smwilson at hpc.unm.edu
Mon Oct 15 15:08:55 UTC 2007
Mingyi,
Thank you very much for your advice. The text ASN file is 1/4 the size
of the (evil, evil) XML file, and parsing it ran just fine. We are
still pursuing a 64-bit perl on our 256GB server, and I will let you
know how it works.
Thanks.
Susan
On Oct 12, 2007, at 1:06 PM, Mingyi Liu wrote:
> BTW, here's the syntax from one of my messages last year for
> converting the compressed binary ASN format NCBI provides to the text
> ASN format my module (or Stefan's SeqIO::entrezgene) expects (the -x
> switch does the trick, overriding the default option to produce
> XML output):
>
> my $parser = Bio::ASN1::EntrezGene->new(
>     'file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");
> # Homo_sapiens.ags.gz is the gzipped binary file directly downloaded
> # from NCBI
>
> Same syntax should be used when you're using SeqIO (thus
> SeqIO::entrezgene).
>
> BTW, text ASN is both smaller and faster to parse than XML format.
>
> Best,
>
> Mingyi
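
For the archive, a minimal sketch of how the piped parser above might be
driven record by record. It assumes gene2xml is on the PATH and writes the
text ASN to standard output, and that the parser's next_seq method returns
one gene record per call; the $count variable and the loop body are only
illustrative, not anything from Mingyi's message.

```perl
use strict;
use warnings;
use Bio::ASN1::EntrezGene;

# Same piped command as in the quoted message: gene2xml converts the
# gzipped binary file to text ASN, and the parser reads from the pipe.
my $cmd    = "gene2xml -i Homo_sapiens.ags.gz -c -x -b |";
my $parser = Bio::ASN1::EntrezGene->new('file' => $cmd);

# Pull one gene record at a time (assumed next_seq behaviour), so only
# the current record needs to be held in memory.
my $count = 0;
while (my $record = $parser->next_seq) {
    $count++;    # illustrative: real code would inspect $record here
}

print "Parsed $count gene records\n";
```

Because each record goes out of scope before the next is read, memory use
should follow the size of the largest single record rather than the size of
the whole file.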