[BioSQL-l] RE: Multiple accession numbers?

Hilmar Lapp hlapp at gnf.org
Wed May 7 08:46:29 EDT 2003


They *should* be in bioentry_qualifier_value -- if not it sounds like a bug.

Your suggestion makes me hope that you haven't looked there yet ...

If you retrieve a sequence you should also get them back. Unfortunately, secondary accessions are not covered by the tests yet, because in bioperl 1.2.x they aren't stored in (or taken from) the annotation bag.

So, for both loading and retrieval, are you running the latest bioperl-db and the bioperl main trunk (*not* 1.2.1)?
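
To check whether the values are actually there, a lookup along the following lines may help. This is only a rough sketch in Python against an in-memory SQLite database, not the real BioSQL DDL: the three tables are cut down to the columns needed for the join, the column names are assumptions based on the table names mentioned in this thread, and the accessions P12345 and Q99999 are made up for the demo. The helper names build_demo_db and find_by_any_accession are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for the tables discussed above.

    Assumption: simplified columns only; the real BioSQL schema has more.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE bioentry (
            bioentry_id INTEGER PRIMARY KEY,
            accession   TEXT NOT NULL
        );
        CREATE TABLE term (
            term_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE bioentry_qualifier_value (
            bioentry_id INTEGER NOT NULL,
            term_id     INTEGER NOT NULL,
            value       TEXT
        );
        -- Made-up demo data: one entry with one secondary accession.
        INSERT INTO bioentry VALUES (1, 'P12345');
        INSERT INTO term VALUES (10, 'secondary_accession');
        INSERT INTO bioentry_qualifier_value VALUES (1, 10, 'Q99999');
        """
    )
    return conn


def find_by_any_accession(conn, accession):
    """Return bioentry_ids matching a primary or secondary accession."""
    rows = conn.execute(
        """
        SELECT b.bioentry_id
          FROM bioentry b
         WHERE b.accession = ?
        UNION
        SELECT q.bioentry_id
          FROM bioentry_qualifier_value q
          JOIN term t ON t.term_id = q.term_id
         WHERE t.name = 'secondary_accession'
           AND q.value = ?
         ORDER BY 1
        """,
        (accession, accession),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(find_by_any_accession(conn, "P12345"))  # primary accession
    print(find_by_any_accession(conn, "Q99999"))  # secondary accession
```

If the second select comes back empty on a freshly loaded database, that would point at the loading side rather than at retrieval.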

   -hilmar

-----Original Message-----
From:	Elia Stupka [mailto:elia at tll.org.sg]
Sent:	Wed 5/7/2003 3:27 AM
To:	Hilmar Lapp
Cc:	biosql-l at open-bio.org; Juguang Xiao
Subject:	Multiple accession numbers?
Hi Hilmar,

we are trying to build a simple web sequence retrieval system on top of 
BioSQL now that we have most public sequences loaded well, and I bumped 
into a problem, maybe you've seen this before... some swissprot records 
have many accession numbers, which are correctly parsed by SeqIO into 
an array of accession numbers which eventually end up in a 
Bio::Seq::RichSeq object as Bio::Annotation::SimpleValue objects with 
tag "secondary_accession" and value the accession number.

However, I can't seem to find them stored in BioSQL. The 
secondary_accession key is stored in term correctly, with its term_id, 
and that term_id is found in the seqfeature table as the "type_term_id", 
always associated with the term_id for the "gene_name" key, but there is 
no value to be found. I am not sure if I am looking in the wrong place, 
but the result is that so far I can't retrieve sequences by their 
secondary_accession numbers...

Regardless of the solution to that, don't you think that secondary 
accessions should get some sort of preferential treatment? In other 
words perhaps be associated directly to the bioentry via 
bioentry_qualifier_value? One would want to quickly search by all 
accession numbers, without having to issue a slow select statement over 
feature-related tables, right?

Let me know what you think...

Elia


---
Bioinformatics Program Manager
Temasek Life Sciences Laboratory
1, Research Link
Singapore 117604
Tel. +65 6874 4945
Fax. +65 6872 7007