[Bioperl-l] Fetching genomic sequences based on HUGO names or GeneIDs
Brian Osborne
osborne1 at optonline.net
Sun Feb 12 16:37:39 UTC 2006
Harry,
Hope you're doing well. One approach is Bio::DB::Fasta, which indexes a
directory of fasta files and gives random access to subsequences. So, from
its documentation:
use Bio::DB::Fasta;
# create database from directory of fasta files
my $db = Bio::DB::Fasta->new('/path/to/fasta/files');
# simple access (for those without Bioperl)
my $seq = $db->seq('CHROMOSOME_I',4_000_000 => 4_100_000);
my $revseq = $db->seq('CHROMOSOME_I',4_100_000 => 4_000_000);
my @ids = $db->ids;
my $length = $db->length('CHROMOSOME_I');
my $alphabet = $db->alphabet('CHROMOSOME_I');
my $header = $db->header('CHROMOSOME_I');
# Bioperl-style access (reuses the $db handle created above)
my $obj = $db->get_Seq_by_id('CHROMOSOME_I');
my $whole_seq = $obj->seq;
my $subseq = $obj->subseq(4_000_000 => 4_100_000);
Do you already have the offsets?
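If the coordinates are known, fetching a gene plus and minus some flank could look roughly like the sketch below. It assumes the chromosome ID, start, end and strand have already been looked up elsewhere (mapping a HUGO name or GeneID to coordinates is not covered here). The helper names flanked_window and fetch_flanked are made up for illustration and are not part of Bioperl. Only the length and seq methods shown in the synopsis above are used.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): widen a feature's coordinates
# by $flank bases on each side, clamped to the sequence bounds.
# Coordinates are 1-based and inclusive.
sub flanked_window {
    my ($start, $end, $flank, $seq_length) = @_;
    ($start, $end) = ($end, $start) if $start > $end;   # normalise order
    my $from = $start - $flank;
    $from = 1 if $from < 1;
    my $to = $end + $flank;
    $to = $seq_length if $to > $seq_length;
    return ($from, $to);
}

# Hypothetical helper: fetch the widened region through a Bio::DB::Fasta
# handle. For a minus-strand feature the coordinates are passed in
# descending order, which makes seq() return the reverse complement,
# as with $revseq in the synopsis.
sub fetch_flanked {
    my ($db, $seq_id, $start, $end, $strand, $flank) = @_;
    my ($from, $to) =
        flanked_window($start, $end, $flank, $db->length($seq_id));
    return $strand < 0
        ? $db->seq($seq_id, $to => $from)
        : $db->seq($seq_id, $from => $to);
}

# Usage (needs Bioperl and an indexed directory, so left commented out):
# use Bio::DB::Fasta;
# my $db = Bio::DB::Fasta->new('/path/to/fasta/files');
# my $region = fetch_flanked($db, 'CHROMOSOME_I',
#                            4_000_000, 4_100_000, 1, 2_000);
```

Clamping keeps a gene near the end of a chromosome from asking for coordinates outside the sequence. The flank size (2_000 in the commented usage) is an arbitrary placeholder.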
Brian O.
On 2/12/06 1:46 AM, "Harry Mangalam" <hjm at tacgi.com> wrote:
> Hi All,
>
> After perusing the tutorial and other docs for an evening, I still can't
> find the answer to this. Forgive me if I've missed something obvious.
>
> This should not be a novel request, but I've not found it answered. If
> bioperl isn't the best way to do this, I'd be grateful for a pointer to a
> better way, especially if it includes an illuminating bit of code.
>
> The problem is to retrieve genomic sequences plus & minus some offset from a
> locus determined by HUGO keyword or GeneID. This would be a common followup
> chore for some extra analysis from a gene expression expt. Or maybe this is
> in the DBFetch routines, but I've missed the sequence type to specify...?
>
>
> TIA!