[Bioperl-l] getting upstream regions
Mark Wagner
mark at lanfear.net
Thu Mar 6 21:44:11 EST 2003
> I was wondering what the current standard way to get sequence just
> upstream of a gene was in bioperl. We're mostly using the UCSC
> dataset/tools at the moment.
>
> At the moment, we're using a Perl module which calls the "nibFrag"
> program from UCSC. If people think this would be useful, I'd be
> happy to contribute it, although I don't know bioperl's object
> system terribly well, so it would probably need some rewriting.
> (I gather there's some C code to do this, but the actual "nibFrag"
> program itself is quite fast, and this avoids making native calls,
> which is nice, although slower.)
See <http://bugzilla.bioperl.org/show_bug.cgi?id=1405> for my
naive take on it.
The efetch interface to GenBank
<http://www.ncbi.nih.gov/entrez/query/static/efetchseq_help.html>
supports arbitrary subsequence retrieval through the seq_start and
seq_stop parameters. Bio::DB::GenBank uses efetch but does not expose
these parameters, so I hacked them in. See the patches on the
bugzilla page.
I haven't tested the patch much, because I ended up just downloading
the entire genomes that I needed.
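
For anyone who wants to try the efetch route without the patch, here is
a rough sketch in Python of what the request looks like. It is not the
patched Bio::DB::GenBank code. The endpoint URL is the present-day
E-utilities address (an assumption; it differs from the 2003 help page
above), and upstream_window and efetch_url are hypothetical helper names
made up for this example. Only seq_start, seq_stop and strand are the
efetch parameters discussed above.

```python
from urllib.parse import urlencode

# Assumption: current E-utilities endpoint, not the 2003-era URL.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def upstream_window(gene_start, gene_end, strand, length):
    """Return (start, stop), 1-based inclusive, of the region upstream of a gene.

    Coordinates are given on the forward strand; strand is +1 or -1.
    """
    if strand == 1:
        if gene_start <= 1:
            raise ValueError("no sequence upstream of position 1")
        # Clamp at 1 so the window never runs off the start of the sequence.
        return max(1, gene_start - length), gene_start - 1
    if strand == -1:
        # Upstream of a minus-strand gene lies past its highest coordinate.
        # The caller still has to clip stop to the sequence length.
        return gene_end + 1, gene_end + length
    raise ValueError("strand must be +1 or -1")


def efetch_url(accession, start, stop, strand=1):
    """Build an efetch URL that asks for a FASTA subsequence."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": start,
        "seq_stop": stop,
        "strand": 1 if strand == 1 else 2,  # efetch: 1 = plus, 2 = minus
    }
    return EFETCH + "?" + urlencode(params)
```

For example, efetch_url("NC_000913.3", *upstream_window(1000, 2000, 1, 500))
builds a request for the 500 bases before position 1000; the accession
and coordinates are only illustrative. For a minus-strand gene, pass
strand=-1 to both helpers so that efetch returns the reverse complement.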
--
Mark Wagner mark at lanfear.net