[Bioperl-l] Command-line Psiblast using NCBI blastpgp

Aaron J. Mackey amackey at pcbi.upenn.edu
Tue Feb 24 15:01:23 EST 2004


You must include at least one sequence from the MSA as a query; this 
sequence defines the "columns" of the PSSM (i.e. any columns in the MSA 
that include gaps in this sequence will not be a part of the final 
PSSM).  blastpgp reads the MSA and builds a PSSM after determining the 
relative uniqueness of each sequence in the profile, and weighting the 
contribution of each sequence to the PSSM by its uniqueness (imagine 
the extreme: an MSA that consisted of the same protein repeated 10 
times; searching with this MSA would be no different than searching 
with the single protein).
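
To make the two ideas above concrete (the query defines the columns, and 
each sequence is weighted by its uniqueness), here is a minimal Python 
sketch. It uses a Henikoff-style position-based weighting as a stand-in; 
I am not claiming this is exactly what blastpgp does internally, and the 
function names (trim_to_query, position_based_weights, 
weighted_frequencies) are made up for illustration. Gaps are simply 
treated as one more character when weighting, which is a simplification.

```python
from collections import Counter


def trim_to_query(msa, query_index=0):
    """Keep only the columns where the query row has a residue (no gap)."""
    query = msa[query_index]
    keep = [i for i, ch in enumerate(query) if ch != "-"]
    return ["".join(row[i] for i in keep) for row in msa]


def position_based_weights(msa):
    """Henikoff-style weights: in each column a row scores 1 / (k * n),
    where k is the number of distinct characters in the column and n is
    how many rows share this row's character. Weights sum to 1."""
    weights = [0.0] * len(msa)
    for col in range(len(msa[0])):
        column = [row[col] for row in msa]
        counts = Counter(column)
        k = len(counts)
        for i, ch in enumerate(column):
            weights[i] += 1.0 / (k * counts[ch])
    total = sum(weights)
    return [w / total for w in weights]


def weighted_frequencies(msa, weights):
    """Per-column character frequencies, using the sequence weights."""
    freqs = []
    for col in range(len(msa[0])):
        column = {}
        for row, w in zip(msa, weights):
            column[row[col]] = column.get(row[col], 0.0) + w
        freqs.append(column)
    return freqs


if __name__ == "__main__":
    # Toy alignment (hypothetical data); the first row plays the query.
    msa = trim_to_query(["AC-D", "ACED", "A-ED"])
    weights = position_based_weights(msa)
    print(msa, weights)
    print(weighted_frequencies(msa, weights))
```

With this scheme, ten copies of one sequence each get weight 1/10, so the 
weighted column frequencies come out the same as for the single sequence, 
which is the "extreme" case described above.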

How many sequences are in your MSA?  If fewer than 10, you won't see 
very much change between using the PSSM and just the query sequence 
alone.  If you have 50, but they're all practically the same 
(redundant) sequence, you'll also see little change in the results.
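
If you want a quick look at both questions (how many sequences, and how 
redundant they are), something like the following Python sketch would 
do. It assumes the alignment has already been read into a list of 
equal-length gapped strings; the helper names are hypothetical, and 
there is no official cutoff for "practically the same", so judge the 
mean identity yourself.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Fraction of identical residues over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def summarize_msa(msa):
    """Return (number of sequences, mean pairwise identity)."""
    identities = [pairwise_identity(a, b) for a, b in combinations(msa, 2)]
    mean_identity = sum(identities) / len(identities) if identities else 1.0
    return len(msa), mean_identity


if __name__ == "__main__":
    # Toy alignment (hypothetical data).
    n, mean_identity = summarize_msa(["ACDE", "ACDE", "ACDF"])
    print(f"{n} sequences, mean pairwise identity {mean_identity:.2f}")
```

A small count, or a mean identity close to 1.0, would both suggest the 
PSSM will not differ much from the query sequence alone.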

To sum up: don't be so suspicious; I expect it's working as well as it 
can, given your input sequences.

-Aaron

On Feb 23, 2004, at 10:37 AM, Xiang Deng wrote:

> Hi Everybody,
>
> I have a question about how to run PSI-BLAST using NCBI blastpgp. What
> I want to do is use a PSSM generated from a multiple alignment of our
> internal data to search against the NCBI nr database. I followed the
> instructions from the BLAST tutorial, as follows:
>
> blastpgp -i seq1.txt -B align.msf -e 5000 -F F -j 2 -v 10 -d nr -o test_out.txt -C pssm.txt
>
> I do not know why I have to specify a single sequence in seq1.txt from
> the aligned sequences in align.msf. I want to use the PSSM created from
> the multiple alignment in align.msf for the search, instead of only one
> sequence. The result looks like it came from searching with the single
> sequence only, and I could not see any sign that the PSSM calculated
> from the multiple alignment was used. I am concerned about that result.
> Does anyone have the same experience and know what is going on there?
> Did the command line above do exactly what I want, and am I just too
> suspicious?
>
> Also, does anyone have a better way to do this kind of PSI-BLAST search
> from the command line?
>
> thanks a lot,
>
> Xiang
>
> Department of Pharmacology and Cancer Biology
> Duke University Medical Center
> Durham, NC 27710
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l


