[Bioperl-l] Getting sequences by base pair locations
Cook, Malcolm
MEC at stowers-institute.org
Fri Jul 28 16:44:43 UTC 2006
There are many options.
But it looks like you only have start and end coordinates! How do you
know which chromosome/contig each hit was on?
Assuming you have this, if you did the blat with a local copy of the
blat program and the genome, then in addition to the blat command you
have the twoBitToFa command, which can extract the hits from the blat
index (the .2bit file); see
http://genome.ucsc.edu/goldenPath/help/blatSpec.html
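
A minimal sketch of how the twoBitToFa calls could be generated, one per
hit. The file name panTro.2bit and the sequence name chr1 are
placeholders (the post does not say which chromosome each hit is on),
and the coordinates are assumed to already follow the convention of
twoBitToFa's -start and -end options: zero-based start, non-inclusive
end. Adjust by one if your blat output is 1-based.

```python
import shlex


def two_bit_to_fa_command(two_bit, seq_name, start, end, out_fa):
    """Build the argument list for one twoBitToFa call.

    start is zero-based and end is non-inclusive, matching the
    -start/-end options of twoBitToFa.
    """
    if not 0 <= start < end:
        raise ValueError(f"bad interval {start}-{end} on {seq_name}")
    return [
        "twoBitToFa",
        f"-seq={seq_name}",
        f"-start={start}",
        f"-end={end}",
        two_bit,
        out_fa,
    ]


# Hypothetical input: "chr1" is a placeholder, the coordinates are the
# first two rows from the original question.
hits = [("chr1", 142854, 144504), ("chr1", 154479, 155198)]

for seq_name, start, end in hits:
    out_fa = f"{seq_name}_{start}_{end}.fa"
    cmd = two_bit_to_fa_command("panTro.2bit", seq_name, start, end, out_fa)
    # Print the shell command; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to check a couple by hand
before running a few hundred of them.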
Or did you do the BLAT at UCSC?
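
If you only have the genome as plain FASTA (one big file or one file per
chromosome), the same extraction can be sketched without any external
tools. This is an illustration, not a tested pipeline: the function
names are made up here, the coordinates are assumed to be 1-based and
inclusive, and each hit must carry the name of the sequence it is on.
It holds one chromosome in memory at a time, so it needs a few hundred
MB of RAM for the largest chromosomes.

```python
def read_fasta(path):
    """Yield (name, sequence) for each record in a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Use the first word of the header as the sequence name.
                name = (line[1:].split() or [""])[0]
                chunks = []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def extract_hits(fasta_path, hits):
    """Return {(name, start, end): subsequence} for the given hits.

    hits is a list of (name, start, end) tuples with 1-based,
    inclusive coordinates (an assumption about the blat output).
    """
    wanted = {}
    for name, start, end in hits:
        wanted.setdefault(name, []).append((start, end))

    found = {}
    for name, seq in read_fasta(fasta_path):
        for start, end in wanted.get(name, []):
            # Convert 1-based inclusive to a Python slice.
            found[(name, start, end)] = seq[start - 1:end]
    return found
```

For example, extract_hits("chimp.fa", [("chr1", 142854, 144504)]) would
return the 1651 bases of that hit, where chimp.fa and chr1 are again
placeholders.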
Malcolm Cook
Database Applications Manager, Bioinformatics
Stowers Institute for Medical Research
Oh, I replied similarly in the Bio BB forum, but it is held for
moderation, so I am replying here as well.
________________________________
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Yuval Itan
Sent: Friday, July 28, 2006 7:08 AM
To: bioperl-l at lists.open-bio.org
Subject: [Bioperl-l] Getting sequences by base pair locations
Hello all,
I was BLATing a few hundred human genes against the chimp
genome, and kept the best chimp hits for every human gene.
I have the base pair start and end location for every chimp hit,
and I need to get the sequence for each of these chimp hits. Here are
the bp locations of a few chimp hits as an example:
Start End
142854 144504
154479 155198
153066 167370
163146 163559
I have one chimp genome file (about 3GB) including all
chromosomes, but I could also get one file per chromosome if that would
make things easier. Does anyone have a script or a link for an interface
that can do the job?
Thank you very much.