[BioPython] Concerns the update of BioSQL.taxon table
Eric Gibert
ericgibert at yahoo.fr
Tue Mar 25 14:54:57 UTC 2008
Dear all,
I moved to BioPython 1.45 and created a fresh BioSQL 1.0.0 database. Everything went smoothly except for one important point: the table 'taxon' defines the column ncbi_taxon_id with a unique index.
But currently, when a BioSeq is created, the lineage records are all inserted as found in the GenBank data.
The initial insertion causes no problem: ncbi_taxon_id is set to NULL for every record except the species one, so no duplicate keys are found. However, when I run my script to update the 'taxon' table, several records turn out to describe the same rank entry (the same family, class, or order, for example) and would receive the same ncbi_taxon_id. I then get a 'duplicate entry on key 2' SQL error.
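The behavior described above can be reproduced with a minimal sketch. It assumes a simplified stand-in for the 'taxon' table (only a few columns) and uses SQLite in memory instead of MySQL, so the error message differs from 'duplicate entry on key 2'; the ids and ranks are example values only.

```python
import sqlite3

# Simplified stand-in for the BioSQL 'taxon' table (assumption: SQLite
# instead of MySQL, and only the columns relevant to the problem).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE taxon (
           taxon_id        INTEGER PRIMARY KEY,
           ncbi_taxon_id   INTEGER UNIQUE,
           parent_taxon_id INTEGER,
           node_rank       TEXT
       )"""
)

# Two sequences loaded separately: each load inserts its own 'family' row,
# with ncbi_taxon_id left NULL for everything except the species.
rows = [
    (1, None, None, "family"),
    (2, 9606, 1, "species"),
    (3, None, None, "family"),  # the same family, inserted a second time
    (4, 9598, 3, "species"),
]
# No error here: several NULLs do not collide in a UNIQUE index.
conn.executemany("INSERT INTO taxon VALUES (?, ?, ?, ?)", rows)

# An update script now assigns the same NCBI id to both 'family' rows.
conn.execute("UPDATE taxon SET ncbi_taxon_id = 9604 WHERE taxon_id = 1")
try:
    conn.execute("UPDATE taxon SET ncbi_taxon_id = 9604 WHERE taxon_id = 3")
    duplicate_rejected = False
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("second update rejected:", exc)
```

The inserts succeed because NULL values are not compared as equal by the unique index; only the second UPDATE, which would create two rows with the same non-NULL ncbi_taxon_id, is refused.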
Previously I did not have this problem because ncbi_taxon_id was not associated with a UNIQUE index. Is this new in BioSQL 1.0.0?
If the answer is yes, I guess the reason is to avoid repeating all the ranks for each species and to define each one only once. I understand that solution, but then BioPython's INSERT of a new BioSeq is "incompatible" with this behavior.
Thus I wonder if we should:
a) remove the UNIQUE index on ncbi_taxon_id
or
b) rewrite the management of the 'taxon' table in BioPython so that records are added only for new ranks, with a 'clever' parent linkage (but then what about the left and right value fields?).
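To make option b) more concrete, here is a rough sketch of what such a control could look like. It is not BioPython's loader: the helper names (get_or_create_taxon, load_lineage, rebuild_nested_set) are hypothetical, the table is a simplified stand-in in SQLite, and the 'name' column is an assumption made for brevity (BioSQL keeps names in a separate taxon_name table). Matching on name plus parent is also only an approximation and would need more care with real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE taxon (
           taxon_id        INTEGER PRIMARY KEY,
           ncbi_taxon_id   INTEGER UNIQUE,
           parent_taxon_id INTEGER,
           name            TEXT,     -- hypothetical column, see note above
           left_value      INTEGER,
           right_value     INTEGER
       )"""
)

def get_or_create_taxon(conn, name, parent_id):
    """Return the id of `name` under `parent_id`, inserting a row only if missing."""
    row = conn.execute(
        # 'IS' is SQLite's NULL-safe comparison, needed for the root (parent NULL).
        "SELECT taxon_id FROM taxon WHERE name = ? AND parent_taxon_id IS ?",
        (name, parent_id),
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO taxon (name, parent_taxon_id) VALUES (?, ?)",
        (name, parent_id),
    )
    return cur.lastrowid

def load_lineage(conn, lineage):
    """Walk a lineage from the top rank down, reusing rows that already exist."""
    parent_id = None
    for name in lineage:
        parent_id = get_or_create_taxon(conn, name, parent_id)
    return parent_id  # id of the last (species) entry

def rebuild_nested_set(conn):
    """Recompute left_value/right_value for the whole tree by depth-first walk."""
    children = {}
    query = "SELECT taxon_id, parent_taxon_id FROM taxon ORDER BY taxon_id"
    for taxon_id, parent_id in conn.execute(query).fetchall():
        children.setdefault(parent_id, []).append(taxon_id)

    counter = 0

    def visit(taxon_id):
        nonlocal counter
        counter += 1
        left = counter
        for child in children.get(taxon_id, []):
            visit(child)
        counter += 1
        conn.execute(
            "UPDATE taxon SET left_value = ?, right_value = ? WHERE taxon_id = ?",
            (left, counter, taxon_id),
        )

    for root in children.get(None, []):
        visit(root)

# Two lineages sharing their upper ranks: the shared rows are created once.
load_lineage(conn, ["Primates", "Hominidae", "Homo sapiens"])
load_lineage(conn, ["Primates", "Hominidae", "Pan troglodytes"])
rebuild_nested_set(conn)
```

Rebuilding the left and right values for the whole tree after each load is the simplest choice but not the cheapest; an incremental update that shifts only the values to the right of the insertion point would be another possibility.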
Please let me know,
Eric