[Bioperl-l] query in BLAT usage

Sean Davis sdavis2 at mail.nih.gov
Tue Nov 29 06:35:14 EST 2005


On 11/29/05 5:35 AM, "neeti somaiya" <neetisomaiya at gmail.com> wrote:

> Hi,
> 
> I am using BLAT in a project. I have read that each blat run is typically
> done against a single chromosome, but with a large number of query
> sequences.
> I have to BLAT gene sequences against a chromosome's sequence.
> Can I have sequences of all the genes in FASTA format, one following the
> other, in a single .fa file as the query file and BLAT it against a .fa file
> as the database, which has the sequence of the chromosome in FASTA format?

Yes.  Unlike BLAST, blat has a very significant startup time, so you are
MUCH better off running blat once, with your chromosome as the database and
all your queries in one fasta file.
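
As a rough illustration of the one-query-file approach, here is a minimal
Python sketch. The helper names (merge_fasta, blat_command) and all file
names are hypothetical; it assumes one or more plain FASTA files per gene
and the basic "blat database query output.psl" invocation, with blat on
your PATH.

```python
from pathlib import Path
from typing import Iterable, List


def merge_fasta(gene_files: Iterable[str], combined_path: str) -> int:
    """Concatenate FASTA files into one multi-FASTA query file.

    Returns the number of sequence records (header lines) written.
    """
    n_records = 0
    with open(combined_path, "w") as out:
        for name in gene_files:
            text = Path(name).read_text()
            # Make sure the next ">" header starts on its own line.
            if text and not text.endswith("\n"):
                text += "\n"
            n_records += sum(
                1 for line in text.splitlines() if line.startswith(">")
            )
            out.write(text)
    return n_records


def blat_command(database: str, query: str, output: str) -> List[str]:
    """Build the argument list for a single blat run (does not run it)."""
    return ["blat", database, query, output]


# Example use (hypothetical file names):
#   import subprocess
#   merge_fasta(["geneA.fa", "geneB.fa"], "all_genes.fa")
#   subprocess.run(blat_command("chr.fa", "all_genes.fa", "out.psl"),
#                  check=True)
```

The point is simply that blat is started once for the whole set of genes
rather than once per gene, so the startup cost is paid a single time.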

> Also, running a BLAT of a 27025-base gene against a chromosome of
> 57701691 bases took 70 mins to run (on a Linux machine). Is there any means
> of increasing the speed of the BLAT run?

I'm not sure, but I think something is wrong.  I can typically blat about
50,000 different sequences against all the human chromosomes in about an
hour on my Mac G5.  I don't think your Linux machine is that much slower
than mine.  When you blat your gene using the UCSC website, how long
does it take?  The server version that they run on the website is actually
slightly slower than the standalone blat (although it runs all chromosomes
simultaneously and has no startup time), but it should give you an idea of
how fast things should be.
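
To put the two timings side by side, here is a back-of-the-envelope
calculation in Python using only the figures quoted in this thread. It is
a crude comparison: it ignores query length and the fact that the batch
run was against all chromosomes rather than one.

```python
# Figures quoted in this thread.
batch_queries = 50_000
batch_seconds = 60 * 60        # "about an hour" for the whole batch
single_run_seconds = 70 * 60   # the reported 70-minute single-gene run

# Average wall-clock time per query in the batch run.
per_query = batch_seconds / batch_queries

# How many times longer the single-gene run took than one batch query.
slowdown = single_run_seconds / per_query

print(f"{per_query:.3f} s per query in the batch run")
print(f"the 70-minute run is roughly {slowdown:,.0f}x slower per query")
```

A gap of that size is hard to explain by hardware differences alone, which
is why I suspect a problem with the run itself.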

Sean


