[Bioperl-l] getting upstream regions
Lincoln Stein
lstein at cshl.org
Fri Mar 7 11:35:22 EST 2003
My way to do this is to load the data set into Bio::DB::GFF and then to do
something like this:
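use Bio::DB::GFF;

# Connection arguments are elided here; fill in your own.
my $db    = Bio::DB::GFF->new('.....');
my @genes = $db->features('gene');    # all features of type "gene"
for my $g (@genes) {
    # Relative coordinates: position 1 is the first base of the gene.
    my $upstream = $g->subseq(-299,0);
    print $g->display_name,"\t",$upstream,"\n";
}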
Assuming that you've loaded the database with features of type "gene", this
will return the 300 bp immediately upstream of each gene's start (relative
positions -299 through 0).
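
To make the coordinate arithmetic concrete, here is a minimal sketch in
plain Perl, with no BioPerl required. It assumes the relative-addressing
convention implied above (position 1 is the first base of the gene,
position 0 the base just before it) and that on the minus strand "upstream"
lies at higher absolute coordinates. The helper name rel2abs_range and the
example gene coordinates are hypothetical, chosen only for illustration;
this is not part of the Bio::DB::GFF API.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT a BioPerl function): convert a relative range
# into absolute coordinates, assuming relative position 1 is the first
# base of the gene on its own strand.
sub rel2abs_range {
    my ($gene_start, $gene_end, $strand, $rel_from, $rel_to) = @_;
    if ($strand >= 0) {
        # Plus strand: relative 1 maps to the gene's low coordinate.
        return ($gene_start + $rel_from - 1, $gene_start + $rel_to - 1);
    }
    # Minus strand: relative 1 maps to the gene's high coordinate,
    # so upstream positions have larger absolute coordinates.
    return ($gene_end - $rel_to + 1, $gene_end - $rel_from + 1);
}

# Made-up gene spanning absolute positions 1000..2000.
for my $strand (+1, -1) {
    my ($lo, $hi) = rel2abs_range(1000, 2000, $strand, -299, 0);
    printf "strand %+d: %d..%d (%d bp)\n", $strand, $lo, $hi, $hi - $lo + 1;
}
```

Note that this sketch does no clipping: a gene closer than 300 bp to the
end of its reference sequence would yield coordinates that run off the end.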
Lincoln
On Thursday 06 March 2003 03:01 pm, Josh Burdick wrote:
> I was wondering what the current standard way to get sequence just
> upstream of a gene was in bioperl. We're mostly using the UCSC
> dataset/tools at the moment.
>
> At the moment, we're using a Perl module which calls the "nibFrag"
> program from UCSC. If people think this would be useful, I'd be
> happy to contribute it, although I don't know bioperl's object
> system terribly well, so it would probably need some rewriting.
> (I gather there's some C code to do this, but the actual "nibFrag"
> program itself is quite fast, and this avoids making native calls,
> which is nice, although slower.)
>
> Josh
--
========================================================================
Lincoln D. Stein Cold Spring Harbor Laboratory
lstein at cshl.org Cold Spring Harbor, NY
========================================================================