[Bioperl-l] Out of memory errors running Bio::ASN1::EntrezGene against latest Homo_sapiens.ags file
Stefan Kirov
stefan.kirov at bms.com
Fri Oct 12 18:20:38 UTC 2007
Susan Wilson wrote:
> Hi,
>
> I downloaded the latest
> ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ASN_BINARY/Mammalia/Homo_sapiens.ags.gz
> and ran gene2xml on it to generate Homo_sapiens.xml, which is
> 5821420628 bytes. I cannot parse this file with
> Bio::ASN1::EntrezGene, even on a machine with 256GB of memory. I get
> a simple "Out of memory" output even with the following code:
>
> #!/usr/bin/perl
> use strict;
> use Bio::ASN1::EntrezGene;
> my $parser = Bio::ASN1::EntrezGene->new('file' => "Homo_sapiens.xml");
> while(my $result = $parser->next_seq)
> {
> }
>
>
>
> Thanks.
> Susan
>
Susan,
Are you running the latest version of Bio::ASN1::EntrezGene? You may
have a better chance of getting a fast (and useful) answer if you
contact Mingyi Liu directly (see the module's page on CPAN), since the
module is not part of Bioperl. For what it is worth, I have seen similar
problems, and there seem to be particular problematic records. I think
NCBI made some changes/additions to their format (it might have
something to do with the number/structure of contigs). I will have to
run my pipeline again soon, and if I run into the same problem I will
probably file a bug report with Mingyi. I hope you do it before me: it
is a long and boring process.
Stefan
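
One possible way to narrow down which records are problematic, for
anyone preparing such a bug report, is to scan the dump without the
parser and list the largest records. The sketch below is only an
illustration under assumptions that this thread does not confirm: the
input is a text ASN.1 dump in which every record starts on a line
beginning with "Entrezgene ::= {" (for XML output the start-of-record
pattern would have to be changed), and record size is a useful hint.
The function name largest_records is hypothetical and is not part of
Bio::ASN1::EntrezGene.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: scan a text dump line by line and return the
# $top largest records as [record_number, start_offset, size_in_bytes].
# ASSUMPTION: every record starts on a line beginning with
# "Entrezgene ::= {". Only one line is held in memory at a time.
sub largest_records {
    my ($path, $top) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @records;       # one [number, offset, bytes] entry per record
    my $offset = 0;    # byte offset of the line being read
    while (my $line = <$fh>) {
        if ($line =~ /^Entrezgene ::= \{/) {
            push @records, [scalar(@records) + 1, $offset, 0];
        }
        # Lines before the first record start are not counted.
        $records[-1][2] += length($line) if @records;
        $offset += length($line);
    }
    close($fh);
    my @sorted = sort { $b->[2] <=> $a->[2] } @records;
    $#sorted = $top - 1 if @sorted > $top;    # keep the $top largest
    return @sorted;
}

# Command-line use: perl largest_records.pl FILE [TOP]
if (@ARGV) {
    my ($path, $top) = @ARGV;
    printf "record %d at byte %d: %d bytes\n", @$_
        for largest_records($path, $top || 10);
}
```

Given the reported start offset and size, a suspicious record could
then be copied into a small file of its own and tried against the
parser in isolation, which would also make a compact attachment for a
bug report.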