[Bioperl-l] Indexing large databases / BioSQL
Hilmar Lapp
hlapp at gmx.net
Mon Apr 28 19:46:07 UTC 2008
On Apr 28, 2008, at 9:51 AM, Bánk Beszteri wrote:
> I'm not loading Swissprot, but TrEMBL. Is swiss also the
> appropriate format here?
Yes, though I guess it can be confusing.
Maybe we should create a symlink uniprot.pm to swiss.pm, or in fact
fork them if UniProt starts accumulating enough differences from the
traditional Swissprot format.
BTW as you had noticed, the --safe switch only protects the script
from crashing due to a db loading error. A parsing error will still
cause a crash.
I guess you can argue that this isn't nice, and that having a chance
to skip over the record that offends the (BioPerl) parser would be
useful. The problem is that if the parser errors out, there is no
guarantee where we are in the file, or that the parser module is in
a state from which it can recover. For the database it's a bit
easier, as one just needs to rollback() the transaction (each
sequence is its own transaction).
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================