[Bioperl-l] UCSC database backend

Sendu Bala bix at sendu.me.uk
Thu Aug 10 07:14:03 UTC 2006


Sean Davis wrote:
> 
> Before we get too far down this line of thought, keep in mind that this will
> be dozens of Gb of sequence and database tables.  See here for details:
> 
> http://genome.ucsc.edu/admin/mirror.html
> 
> The sequences include all of genbank, essentially.  The mysql tables ALONE
> (no sequence) for only ONE human assembly is on the order of 10Gb--not the
> kind of thing you can download in a few minutes (or even hours).  Just to
> keep in mind....

I think if someone needs heavy-duty access to genomic data, they'll find 
the disk space. That wouldn't be the problem. The problem would be 
finding an easy way of getting the data, which is where I hoped 
something like a UCSC frontend would come in.


> On another point, the strength of UCSC is not in obtaining sequence, but in
> mapping to the genome.  I think getting actual sequence should be secondary
> here, if for no other reason than there are trivially easy ways of getting
> sequence information from elsewhere given an accession or ID.  There is
> simply too much information to be stored locally for most people and getting
> the data remotely from UCSC doesn't seem possible currently.

The work would certainly be highly valuable even if it didn't allow for 
sequence retrieval, but my own main interest is exactly the retrieval 
of arbitrary stretches of genomic sequence, for which there is no 
accession or ID that could be used to query some other database.

How does the website table browser frontend allow retrieval of sequence 
data?


