[Bioperl-l] Asking for advice on full EMBL extraction

Chris Fields cjfields at illinois.edu
Thu May 7 12:07:54 UTC 2009


I noticed that Russell has 16GB RAM on his setup.  Was yours equivalent?

chris

On May 7, 2009, at 12:32 AM, brian li wrote:

> Thank you very much for your offer.
>
> The director of our lab wants me to do the extraction every time a new
> release of EMBL is published, so I can't hand the task off to you each
> time.
>
> I can offer more information about the server I run my script on if
> needed.
>
> -Brian
>
> On Thu, May 7, 2009 at 1:01 PM, Smithies, Russell
> <Russell.Smithies at agresearch.co.nz> wrote:
>> Sadly, that's the same code as I ran, but I had a Data::Dump in the
>> middle.
>> Versions of Perl and BioPerl are the same.
>> We're running RHEL 5 (kernel 2.6.18-92.1.18.el5) with 16GB RAM.
>>
>> If you get a full script running on a smaller dataset, I could
>> probably run it on the bigger stuff and give you back tab-separated
>> (or is that tab\tseparated ?) data for loading into your db.
>>
>> --Russell
>>
>>> -----Original Message-----
>>> From: brian li [mailto:brianli.cas at gmail.com]
>>> Sent: Thursday, 7 May 2009 4:50 p.m.
>>> To: Smithies, Russell
>>> Cc: bioperl-l at lists.open-bio.org
>>> Subject: Re: [Bioperl-l] Asking for advice on full EMBL extraction
>>>
>>> Dear Russell,
>>>
>>> My example code is as follows. I have omitted the parsing step, and
>>> these lines alone still give me a "Segmentation fault".
>>>
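>>> # Start of code
>>> use strict;
>>> use warnings;
>>> use Bio::SeqIO;
>>>
>>> # Open the EMBL flat file as a stream of sequence records
>>> my $seqio = Bio::SeqIO->new(-file   => 'rel_ann_mus_01_r99.dat',
>>>                             -format => 'EMBL');
>>> my $index = 1;
>>> # Read one entry at a time; parsing is omitted, only a counter is printed
>>> while (my $seq = $seqio->next_seq)
>>> {
>>>     print "Dealing with entry: $index\n";
>>>     $index++;
>>> }
>>> # End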
>>>
>>> The platform I run this code on:
>>> BioPerl 1.6.0
>>> Perl 5.8.8
>>> Ubuntu 8.04 LTS Server 64-bit version (Linux 2.6.24-23-server)
>>>
>>> I monitored memory usage while running the code above. About 20GB of
>>> memory (counting buffers) always remains free, so I suppose the
>>> segfault can't be explained by a memory shortage alone.
>>>
>>> Brian
>>>
>>>
>>> On Thu, May 7, 2009 at 11:32 AM, Smithies, Russell
>>> <Russell.Smithies at agresearch.co.nz> wrote:
>>>> Hi Brian,
>>>> I hate to say it but it worked OK for me using
>>>> rel_ann_mus_01_r99.dat.gz and simple example Bio::SeqIO code from
>>>> bugzilla.
>>>> It's not using more than 1GB memory on our server and doesn't
>>>> segfault.
>>>>
>>>> Send me your example code and I'll give it a go if you like.
>>>>
>>>>
>>>> Russell Smithies
>>>>
>>>> Bioinformatics Applications Developer
>>>> T +64 3 489 9085
>>>> E  russell.smithies at agresearch.co.nz
>>>>
>>>> Invermay  Research Centre
>>>> Puddle Alley,
>>>> Mosgiel,
>>>> New Zealand
>>>> T  +64 3 489 3809
>>>> F  +64 3 489 9174
>>>> www.agresearch.co.nz
>>>>
>>>>
>>
>



