[Bioperl-l] Getting sequences by base pair locations

Sean Davis sdavis2 at mail.nih.gov
Fri Jul 28 15:21:09 UTC 2006


Chris Fields wrote:

> Would be nice to have a more automated and direct way of doing something
> along these lines within bioperl (with the obvious caveat of not spamming
> the server).  You can currently retrieve chunks of sequence based on start,
> stop, strand from GenBank.

The Ensembl API has some features that can be useful for these types of 
things.

I, personally, have a mirror of the UCSC MySQL database (very easy to do 
with just rsync and MySQL) and try to turn questions like these into SQL 
queries.  That, combined with Bio::DB::Fasta, can make a useful 
automated pipeline for getting arbitrary sequences associated with 
genomic locations meeting specific criteria.  It is much faster than 
anything one can do over the web and doesn't have access limitations.
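
As a rough illustration of that pattern (SQL query for coordinates, then 
a local sequence lookup), here is a minimal Python sketch. Everything in 
it is a stand-in: an in-memory sqlite3 table named "genes" plays the part 
of the MySQL mirror, a plain dict plays the part of a Bio::DB::Fasta 
index, and the table name, gene names, and sequence are invented. The 
columns are modeled on UCSC-style gene tables, whose starts are 0-based 
and ends exclusive, so the start is shifted by one before slicing.

```python
import sqlite3

# Toy genome; in a real pipeline an indexed FASTA file would be used instead.
GENOME = {"chr1": "ACGTACGTAAGGCCCTACGT"}

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def fetch_seq(genome, chrom, start, end, strand="+"):
    """Return bases start..end (1-based, inclusive).

    The result is reverse-complemented when strand is '-'.
    """
    seq = genome[chrom][start - 1:end]
    if strand == "-":
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq


def build_demo_db():
    """Create a hypothetical annotation table with UCSC-like columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE genes (name TEXT, chrom TEXT, strand TEXT, "
        "txStart INTEGER, txEnd INTEGER)"
    )
    conn.executemany(
        "INSERT INTO genes VALUES (?, ?, ?, ?, ?)",
        [
            ("geneA", "chr1", "+", 0, 8),
            ("geneB", "chr1", "-", 8, 16),
            ("geneC", "chr1", "+", 16, 20),
        ],
    )
    return conn


def sequences_for(conn, genome, strand):
    """Select features on one strand and return {name: sequence}."""
    rows = conn.execute(
        "SELECT name, chrom, strand, txStart, txEnd FROM genes "
        "WHERE strand = ? ORDER BY txStart",
        (strand,),
    )
    # txStart is 0-based and txEnd is exclusive, so the 1-based
    # inclusive interval is (txStart + 1, txEnd).
    return {
        name: fetch_seq(genome, chrom, tx_start + 1, tx_end, s)
        for name, chrom, s, tx_start, tx_end in rows
    }


if __name__ == "__main__":
    conn = build_demo_db()
    for name, seq in sequences_for(conn, GENOME, "-").items():
        print(name, seq)
```

The selection criteria live entirely in the WHERE clause, so changing 
the question means changing the query and nothing else.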

Sean


