[Bioperl-l] Memory requirements for conversion from embl to genbank

Sendu Bala bix at sendu.me.uk
Thu Aug 31 15:50:45 UTC 2006


Martin MOKREJŠ wrote:
> I observe the same. Testcase here. Please push it into testcases.
> It will be helpful in the future when the parser should cope with the
> two /note feature lines.

Well, the cause of the hang is that multiple species are defined for one 
sequence. Is that valid? Desired? Should the fix be to somehow store 
multiple species and be able to output them again, or to ignore all but 
one of the species? Two sequences in the large file you originally 
posted have this problem.
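
To locate the affected entries in a large EMBL file, something like the 
following could be used. It is only a minimal sketch and not part of 
BioPerl: the function name entries_with_multiple_species and the sample 
records are made up for illustration, and it assumes that each OS line 
names a separate organism (a name wrapped over two OS lines would be 
reported as a false positive).

```python
import io


def entries_with_multiple_species(handle):
    """Yield (entry_id, species_list) for EMBL entries with more than one OS line.

    Assumption: entries start with an ID line and end with a '//' line,
    and every OS line names one organism.
    """
    entry_id = None
    species = []
    for line in handle:
        if line.startswith("ID "):
            # The first token after the ID tag is taken as the identifier.
            parts = line[2:].split()
            entry_id = parts[0].rstrip(";") if parts else None
            species = []
        elif line.startswith("OS "):
            species.append(line[2:].strip())
        elif line.startswith("//"):
            # End of entry: report it only if several organisms were seen.
            if len(species) > 1:
                yield entry_id, species
            entry_id, species = None, []


# Hypothetical two-entry sample; only the first entry has two OS lines.
sample = io.StringIO(
    "ID   TEST1; SV 1; linear; genomic DNA; STD; PLN; 10 BP.\n"
    "OS   Homo sapiens (human)\n"
    "OS   Mus musculus (house mouse)\n"
    "//\n"
    "ID   TEST2; SV 1; linear; genomic DNA; STD; PLN; 10 BP.\n"
    "OS   Homo sapiens (human)\n"
    "//\n"
)

for entry_id, species in entries_with_multiple_species(sample):
    print(entry_id, species)
```

Reading line by line keeps memory use flat regardless of file size, so 
this can be run on the full file before attempting the conversion.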

If this has 'worked' for you before, it is probably because a completely 
meaningless composite species classification was generated. The new 
taxonomy system 'ensures' that the parsed taxonomic data is sane enough 
to be output correctly again.


