[Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()

Chris Fields cjfields at illinois.edu
Thu May 12 14:27:43 UTC 2011


All,

Sorry for coming to the thread late; this has been reported as a bug:

https://redmine.open-bio.org/issues/3200

I don't think the PIDs for the child processes are stored.  Truthfully, I think running requests as a forked process 'under the hood' isn't a good idea, even if it speeds things up, for this very reason (it violates the principle of least surprise).  The forking should be done at a higher level by the user.
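
As a rough illustration of forking at the caller's level, here is a minimal sketch in which the caller keeps every child PID and waits on each one, so no child is left <defunct>. It is written in Python only to keep it short (Perl's fork and waitpid follow the same pattern); it is not BioPerl code, and the name run_in_children is hypothetical. It assumes a POSIX system where os.fork is available.

```python
import os


def run_in_children(tasks):
    """Fork one child per task, then wait on every child so none stays a zombie.

    `tasks` is a list of zero-argument callables (a hypothetical stand-in for
    "fetch one record"). Returns a dict mapping child PID to exit status.
    """
    pids = []
    for task in tasks:
        pid = os.fork()
        if pid == 0:
            # Child process: run the task, then exit immediately.
            # os._exit skips the parent's cleanup handlers and never returns.
            code = 1
            try:
                task()
                code = 0
            finally:
                os._exit(code)
        # Parent process: store the PID so the child can be reaped later.
        pids.append(pid)

    statuses = {}
    for pid in pids:
        # waitpid blocks until this child exits and removes its process-table
        # entry; skipping this step is what leaves <defunct> entries behind.
        _, status = os.waitpid(pid, 0)
        statuses[pid] = os.WEXITSTATUS(status)
    return statuses
```

Because the caller owns the PIDs, it also decides how many children run at once, for example by passing the tasks in small batches.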

That being said, one should not be hammering NCBI with tons of requests anyway (you will be blocked).  This is mentioned explicitly in the POD.
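
One simple way to avoid hammering a remote service is to enforce a minimum delay between requests. The sketch below is a generic throttle in Python, not part of BioPerl; the class name Throttle and the one-second interval in the usage note are placeholders, not NCBI's published limit, so check NCBI's usage guidelines for the actual figure.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        # min_interval is in seconds. clock and sleep are injectable so the
        # class can be exercised without really sleeping.
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Sleep just long enough that calls are at least min_interval apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A caller would create throttle = Throttle(1.0) once and call throttle.wait() immediately before each request.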

chris

On May 12, 2011, at 8:56 AM, O'car Campos wrote:

> Dave:
> 
>       Thanks for checking the code. I tried what you suggested, adding a
> "my" on line 18, but I still get the zombies. I was exaggerating with the
> 8000 GenBank codes. I also didn't know about those other tools; I will
> check them out, thanks for the tip. So I'm still in zombieland.
> 
> Cheers.
> 
> O'car
> 
> 
> On 12 May 2011 03:22, Dave Messina <David.Messina at sbc.su.se> wrote:
> 
>> Thanks for posting the code, O'car.
>> 
>> I haven't tried running it, but one thing that occurs to me is that on line
>> 18, where you create your Bio::DB::GenBank object, there's no 'my', so those
>> objects may be hanging around longer than you expect. The zombies may be
>> those objects' forked processes for connecting to GenBank, similar to what
>> Kevin said earlier.
>> 
>> But that's all speculation.
>> 
>> The other thing I'll say as a general comment is that fetching thousands of
>> records from GenBank this way (or really any more than 100) is inefficient
>> and probably slow as well.
>> 
>> Instead, you might try NCBI's own fetching tools, the E-utilities,
>> either directly or via the two BioPerl interfaces to them
>> (Bio::DB::EUtilities and Bio::DB::SoapEUtilities).
>> 
>> 
>> Dave
>> 
>> 
>> 
>> 
>> On Thu, May 12, 2011 at 00:16, O'car Johann Campos <ocarnorsk138 at gmail.com> wrote:
>> 
>>> Kevin Brown <Kevin.M.Brown <at> asu.edu> writes:
>>> 
>>>> 
>>>> Seeing your code might help. They could just be forked children waiting
>>>> for the script to exit before they go away or something else forked them
>>>> and failed to clean up before quitting.
>>>> 
>>>>> -----Original Message-----
>>>>> From: bioperl-l-bounces <at> lists.open-bio.org
>>>>> [mailto:bioperl-l-bounces <at> lists.open-bio.org] On Behalf Of Belaid MOA
>>>>> Sent: Tuesday, May 10, 2011 1:41 PM
>>>>> To: bioperl-l <at> lists.open-bio.org
>>>>> Subject: [Bioperl-l] Zombie processes with GenBank get_Seq_by_acc()
>>>>> 
>>>>> 
>>>>> Dear All,
>>>>>  I installed the latest version of BioPerl and ran a very simple
>>>>> script: it goes through each line (an accession) in a file and uses
>>>>> GenBank to get the sequence via get_Seq_by_acc(). A look at ps shows
>>>>> that a lot of zombie processes (marked <defunct>) were created, and
>>>>> the list grows over time. This means that Bio::DB::GenBank is forking
>>>>> and not cleaning up its children. Is there any way to overcome the
>>>>> issue? Moreover, is there any way to specify the number of forked
>>>>> processes?
>>>>> 
>>>>> With best regards,
>>>>> -Belaid.
>>>>> 
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l <at> lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>> 
>>> 
>>> Kevin, Belaid, All:
>>> 
>>>       Recently I've been working with GenBank too, and ran a script to get
>>> GenBank info from accession numbers. I also noticed the weird behavior and
>>> the zombie processes in the background. Although the code works and I get
>>> the info I need, there are a lot of zombie processes, and running this
>>> task with, for example, 8000 accession numbers would be a pain where you
>>> all know. I'm not a BioPerl expert and I may be missing some piece of code
>>> to quit the forked children, as may be happening to Belaid, so here is my
>>> code in case anyone has an idea why this is happening.
>>> 
>>> http://pastebin.com/Zq88cpwb
>>> 
>>> Thanks in advance.
>>> Cheers.
>>> 
>>> O'car.
>>> 
>>> 
>>> 
>>> 
>> 
>> 