[Bioperl-l] getting upstream regions

Lincoln Stein lstein at cshl.org
Fri Mar 7 11:35:22 EST 2003


I do this by loading the data set into Bio::DB::GFF and then doing 
something like this:

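	# connect to the GFF database (connection arguments elided)
	my $db = Bio::DB::GFF->new('.....');
	# fetch every feature of type "gene"
	my @genes = $db->features('gene');
	for my $g (@genes) {
		# coordinates are relative to the gene: 1 is its first base
		my $upstream = $g->subseq(-299,0);
		# $upstream is a segment object; use $upstream->dna for the raw sequence string
		print $g->display_name,"\t",$upstream,"\n";
	}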


Assuming that you've loaded the database with features of type "gene", this 
will return the 300 bp immediately upstream of each gene start (relative 
positions -299 through 0, where position 1 is the first base of the gene).
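
To see why -299 through 0 gives exactly 300 bp, here is a minimal standalone sketch of the coordinate arithmetic. The helper rel_to_abs and the example gene coordinates (1000..2000) are hypothetical and are not part of Bio::DB::GFF. The sketch only assumes the convention described above (relative position 1 is the first base of the gene) and that, for a minus-strand gene, upstream means higher absolute coordinates.

```perl
use strict;
use warnings;

# Hypothetical helper, not a Bio::DB::GFF method: map a gene-relative range
# onto absolute coordinates. Relative position 1 is the first base of the
# gene, so relative position 0 is the base immediately upstream of it.
# Returns (low, high) absolute coordinates.
sub rel_to_abs {
    my ($gene_start, $gene_end, $strand, $rel_start, $rel_end) = @_;
    if ($strand >= 0) {
        # plus strand: the gene begins at its lowest coordinate
        return ($gene_start + $rel_start - 1, $gene_start + $rel_end - 1);
    }
    # minus strand: the gene begins at its highest coordinate
    return ($gene_end - ($rel_end - 1), $gene_end - ($rel_start - 1));
}

# Made-up gene spanning 1000..2000, tried on each strand
for my $strand (+1, -1) {
    my ($lo, $hi) = rel_to_abs(1000, 2000, $strand, -299, 0);
    printf "strand %+d: %d..%d (%d bp)\n", $strand, $lo, $hi, $hi - $lo + 1;
}
# prints:
# strand +1: 700..999 (300 bp)
# strand -1: 2001..2300 (300 bp)
```

In both cases the range stops one base short of the gene itself, which is why the upper bound is 0 rather than 1.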


Lincoln


On Thursday 06 March 2003 03:01 pm, Josh Burdick wrote:
> I was wondering what the current standard way to get sequence just
> upstream of a gene was in bioperl.  We're mostly using the UCSC
> dataset/tools at the moment.
>
> At the moment, we're using a Perl module which calls the "nibFrag"
> program from UCSC.  If people think this would be useful, I'd be
> happy to contribute it, although I don't know bioperl's object
> system terribly well, so it would probably need some rewriting.
> (I gather there's some C code to do this, but the actual "nibFrag"
> program itself is quite fast, and this avoids making native calls,
> which is nice, although slower.)
>
> Josh

-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein at cshl.org                                Cold Spring Harbor, NY
========================================================================



