[Bioperl-l] Out of memory errors running Bio::ASN1::EntrezGene against latest Homo_sapiens.ags file

Stefan Kirov stefan.kirov at bms.com
Fri Oct 12 18:20:38 UTC 2007


Susan Wilson wrote:
> Hi,
>
> I downloaded the latest
> ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ASN_BINARY/Mammalia/Homo_sapiens.ags.gz
> and ran gene2xml on it to generate Homo_sapiens.xml, which is
> 5821420628 bytes. I cannot parse this file with Bio::ASN1::EntrezGene,
> even on a machine with 256GB of memory. I get a simple "Out of memory"
> output even with the following code:
>
> #!/usr/bin/perl
> use strict;
> use Bio::ASN1::EntrezGene;
> my $parser = Bio::ASN1::EntrezGene->new('file' => "Homo_sapiens.xml");
> while (my $result = $parser->next_seq)
> {
> }
>
>
>
> Thanks.
> Susan
>
Susan,
Are you running the latest version of Bio::ASN1::EntrezGene? You may
have a better chance of getting a fast (and useful) answer if you
contact Mingyi Liu directly (see the module's page on CPAN), since the
module is not part of Bioperl. I have seen similar problems myself, and
there seem to be particular problematic records. I think NCBI made some
changes/additions to their format (it might have something to do with
the number/structure of contigs). I will have to run my pipeline again
soon, and if I run into the same problem I will probably file a bug
report with Mingyi. I hope you do it before me, since it is a long and
boring process.
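
One way to narrow down which records are problematic, before involving
the parser at all, is to measure how large each record is. The following
is only a rough sketch and does not use Bio::ASN1::EntrezGene. It
assumes the gene2xml XML output opens each record on a line containing
"<Entrezgene>"; that delimiter, the function name record_sizes, and the
choice to report the five largest records are all assumptions made for
illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stream a file line by line and return [start_line, byte_count] for
# every record, without ever holding a whole record in memory.
# ASSUMPTION: each record opens on a line containing "<Entrezgene>".
# Anything after the last record (e.g. the closing root tag) is counted
# as part of that last record.
sub record_sizes {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        # A new record starts here: remember its line number.
        push @records, [$., 0] if $line =~ /<Entrezgene>/;
        # Skip any header lines that precede the first record.
        next unless @records;
        $records[-1][1] += length $line;
    }
    return \@records;
}

# When given a file name, print the five largest records.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $sizes = record_sizes($fh);
    close $fh;
    my @largest = sort { $b->[1] <=> $a->[1] } @$sizes;
    splice @largest, 5 if @largest > 5;
    printf "line %d: %d bytes\n", @$_ for @largest;
}
```

Run it with the file name as the only argument. Records that are far
larger than the rest would be reasonable candidates to extract and
attach to a bug report.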
Stefan



