[Bioperl-l] Indexing large databases / BioSQL

Hilmar Lapp hlapp at gmx.net
Mon Apr 28 19:46:07 UTC 2008


On Apr 28, 2008, at 9:51 AM, Bánk Beszteri wrote:
> I'm not loading Swissprot, but TrEMBL. Is swiss also the
> appropriate format here?


Yes, though I guess it can be confusing.

Maybe we should create a symlink uniprot.pm to swiss.pm, or in fact  
fork them if UniProt starts accumulating enough differences from the  
traditional Swissprot format.

BTW, as you noticed, the --safe switch only protects the script
from crashing on a database loading error. A parsing error will
still cause a crash.

You could argue that this isn't nice, and that being able to skip
over the record that offends the (BioPerl) parser would be useful.
The problem is that if the parser errors out, there is no guarantee
where we are in the file, or whether the parser module is in a state
it can recover from. For the database it's a bit easier, as one just
needs to rollback() the transaction (each sequence is its own
transaction).
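
To illustrate the difference, here is a minimal sketch in Python with
sqlite3. It is not the loader script's actual code: the seq table and
the parse() and load_records() helpers are made up for illustration,
and the toy parser stands in for a real flat-file parser. A database
error is rolled back and the record skipped, while a parse error
propagates because the parser's position can no longer be trusted.

```python
import sqlite3


def parse(lines):
    """Toy parser (hypothetical): one 'accession sequence' pair per line.

    Raises ValueError on a malformed line. Once a generator raises, it is
    finished, so there is no well-defined way to resume after the error.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"cannot parse record: {line!r}")
        yield fields[0], fields[1]


def load_records(conn, records, safe=True):
    """Insert each record in its own transaction.

    With safe=True, a database error rolls back only the offending record
    and loading continues. Parse errors are deliberately not caught here.
    """
    loaded, skipped = [], []
    for accession, sequence in records:
        try:
            conn.execute(
                "INSERT INTO seq (accession, sequence) VALUES (?, ?)",
                (accession, sequence),
            )
            conn.commit()  # one transaction per sequence
            loaded.append(accession)
        except sqlite3.Error:
            conn.rollback()  # undo just this record
            if not safe:
                raise
            skipped.append(accession)
    return loaded, skipped


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seq (accession TEXT PRIMARY KEY, sequence TEXT)")
    # The duplicate accession triggers a database error, which is skipped.
    print(load_records(conn, parse(["P1 MKV", "P1 MKV", "P2 MAA"])))
```

Feeding the same loader a malformed line makes parse() raise
ValueError, and the load stops there even with safe=True. Records
committed before that point stay in the database.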

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================






