[BioSQL-l] RE: Multiple accession numbers?
Hilmar Lapp
hlapp at gnf.org
Wed May 7 08:57:31 EDT 2003
Actually, reading it over, I'm confused. The term 'secondary_accession' ended up as the type of a feature? Which feature? And how does the 'gene_name' term get in there?
If you're convinced things get stored wrongly in the database, can you isolate the databank entry, and then load it with --debug and capture the output in a file and send that to me?
-hilmar
-----Original Message-----
From: Elia Stupka [mailto:elia at tll.org.sg]
Sent: Wed 5/7/2003 3:27 AM
To: Hilmar Lapp
Cc: biosql-l at open-bio.org; Juguang Xiao
Subject: Multiple accession numbers?
Hi Hilmar,
we are trying to build a simple web sequence retrieval system on top of
BioSQL now that we have most public sequences loaded well, and I bumped
into a problem, maybe you've seen this before... some swissprot records
have many accession numbers, which are correctly parsed by SeqIO into
an array of accession numbers which eventually end up in a
Bio::Seq::RichSeq object as Bio::Annotation::SimpleValue objects with
tag "secondary_accession" and value the accession number.
However, I can't seem to find them stored in BioSQL. The
secondary_accession key is stored in the term table correctly, with its
term_id, and that term_id shows up in the seqfeature table as the
"type_term_id", always associated with the term_id for the "gene_name"
key, but there is no value to be found. I'm not sure if I am looking in
the wrong place, but the result is that so far I can't retrieve
sequences by their secondary_accession numbers...
Regardless of the solution to that, don't you think that secondary
accessions should get some sort of preferential treatment? In other
words perhaps be associated directly to the bioentry via
bioentry_qualifier_value? One would want to quickly search by all
accession numbers, without having to issue a slow select statement over
feature-related tables, right?
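
A rough sketch of what such a lookup could look like, in Python with an
in-memory SQLite database. The three tables below are a heavily
simplified stand-in for the BioSQL schema (column lists are an
assumption, not the real DDL), the accession values are made up, and
find_bioentry_ids is a hypothetical helper, not part of any existing
API:

```python
import sqlite3

# Simplified stand-in for the BioSQL tables involved (assumed columns only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL
);
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE bioentry_qualifier_value (
    bioentry_id INTEGER NOT NULL REFERENCES bioentry(bioentry_id),
    term_id     INTEGER NOT NULL REFERENCES term(term_id),
    value       TEXT,
    rank        INTEGER NOT NULL DEFAULT 0
);
-- Index so a lookup by accession value does not scan the whole table.
CREATE INDEX bqv_term_value ON bioentry_qualifier_value (term_id, value);
""")

# Made-up example data: one entry with a primary and two secondary accessions.
conn.execute("INSERT INTO bioentry VALUES (1, 'P00001')")
conn.execute("INSERT INTO term VALUES (1, 'secondary_accession')")
conn.executemany(
    "INSERT INTO bioentry_qualifier_value VALUES (1, 1, ?, ?)",
    [("Q00001", 1), ("Q00002", 2)],
)


def find_bioentry_ids(conn, accession):
    """Return bioentry ids whose primary or secondary accession matches."""
    rows = conn.execute(
        """
        SELECT bioentry_id FROM bioentry WHERE accession = ?
        UNION
        SELECT q.bioentry_id
          FROM bioentry_qualifier_value q
          JOIN term t ON t.term_id = q.term_id
         WHERE t.name = 'secondary_accession' AND q.value = ?
         ORDER BY 1
        """,
        (accession, accession),
    ).fetchall()
    return [row[0] for row in rows]
```

With the values attached directly to the bioentry, a search by any
accession is one indexed lookup plus a join on term, and no
feature-related table is touched.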
Let me know what you think...
Elia
---
Bioinformatics Program Manager
Temasek Life Sciences Laboratory
1, Research Link
Singapore 117604
Tel. +65 6874 4945
Fax. +65 6872 7007