[Bioperl-l] recovering blast query_name
Wiepert, Mathieu
Wiepert.Mathieu@mayo.edu
Wed, 20 Nov 2002 15:51:24 -0600
Hi,
I made a few assumptions in my previous answer, sorry. You need bioperl-live to get that to work; I don't think it is in the 1.0.2 distro.
Additionally, I only tested with fasta files. I assume that anything else will still work, as long as the sequence has a description. The query name is built up like this:
$header{'QUERY'} = ">".(defined $seq->display_id() ? $seq->display_id() : "").
" ".(defined $seq->desc() ? $seq->desc() : "")."\n".$seq->seq();
So, if I read this right, the sequences have to have a display id and a description to get a query name.
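To see what that expression produces with and without an id and description, here is a minimal sketch. It assumes a made-up stand-in class (StubSeq) that only provides the three accessors used above; it is not Bio::Seq, and the helper name build_query is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object; NOT Bio::Seq, just the
# three accessors that the quoted expression relies on.
package StubSeq;
sub new        { my ($class, %args) = @_; return bless {%args}, $class }
sub display_id { $_[0]{id} }
sub desc       { $_[0]{desc} }
sub seq        { $_[0]{seq} }

package main;

# Same expression as quoted above, wrapped in a helper (the name is made up).
sub build_query {
    my ($seq) = @_;
    return ">" . (defined $seq->display_id() ? $seq->display_id() : "")
         . " " . (defined $seq->desc()       ? $seq->desc()       : "")
         . "\n" . $seq->seq();
}

# One sequence with id and description, one with neither.
my $full = StubSeq->new(
    id   => 'U20499_EXON_1A',
    desc => '2848-2960 of U20499',
    seq  => 'acactgg',
);
my $bare = StubSeq->new(seq => 'acactgg');

print build_query($full), "\n";
print build_query($bare), "\n";
```

With neither an id nor a description, the header line comes out as just ">" followed by a space, so there is nothing for the report to echo back as a query name.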
My previous example was only slightly off: I left out the description.
>U20499_EXON_1A 2848-2960 of U20499
acactggaccttcaaaaccctcagggcagagagcagccctacactccctacaccacaccc
atactcagcccctgcaggcaaggagagaacaggtcaggttcccgagagctcag
results in query name of
U20499_EXON_1A 2848-2960 of U20499
parsed from the header of this blast result (saved from the remote blast)
BLASTN 2.2.4 [Aug-26-2002]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
RID: 1033569396-029169-20578
Query= U20499_EXON_1A 2848-2960 of U20499
(113 letters)
Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,
or phase 0, 1 or 2 HTGS sequences)
1,406,693 sequences; 6,799,009,920 total letters
Check the actual blast results and make sure they have the query name in them. If they don't, then we have a problem...
Here is the more current documentation:
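One quick way to do that check without going through the parser is to scan the saved report for its "Query=" line. This is only a sketch: the helper name query_lines is made up, the demo text is an excerpt of the header shown above, and it does not handle a query description that continues on following lines.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text of every "Query=" line found in a
# plain-text BLAST report read from the given filehandle.
sub query_lines {
    my ($fh) = @_;
    my @names;
    while (my $line = <$fh>) {
        if ($line =~ /^Query=\s*(.*?)\s*$/) {
            push @names, $1;
        }
    }
    return @names;
}

# Demo on an in-memory excerpt of the report header shown above.
my $report = <<'END';
BLASTN 2.2.4 [Aug-26-2002]
RID: 1033569396-029169-20578
Query= U20499_EXON_1A 2848-2960 of U20499
(113 letters)
END

open my $fh, '<', \$report or die "cannot open in-memory report: $!";
my @names = query_lines($fh);
close $fh;

if (@names) {
    print "Query line: $_\n" for @names;
}
else {
    print "No Query= line found\n";
}
```

For a report saved to disk, open the file itself instead of the in-memory string. An empty text after "Query=" would point at the submitted sequence header rather than at the parser.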
http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html
-Mat
> -----Original Message-----
> From: Lewis Lukens [mailto:llukens@uoguelph.ca]
> Sent: Wednesday, November 20, 2002 2:49 PM
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] recovering blast query_name
>
>
> Hello,
>
> Sorry for a basic question... I have been trying to use the
> Bio::Tools::Run::RemoteBlast module to blast a single file with many
> fasta formatted sequences against ncbi nt and parse the blast reports.
> Almost everything is working well. I get all the hit and hsp
> features for all the hits. I can recover the query sequence, but I
> can't seem to recover the query sequence names. How does one do this?
>
> I used almost the exact code as in the RemoteBlast Synopsis
> http://doc.bioperl.org/releases/bioperl-1.0.2/Bio/Tools/Run/RemoteBlast.html
>
> in this code, this expression works:
> print "db is ", $result->database_name(), "\n";
>
> but, these expressions return empty fields:
> my $name = $result->query_name();
> my $desc = $result->query_description();
> my $acc= $result->query_accession();
>
> I have been using SearchIO to parse blast output files and never had
> this problem before. Any ideas?
>
> Thanks much,
> Lewis
> --
> Lewis Lukens
> Assistant Professor
> Department of Plant Agriculture
> Univ. of Guelph, Guelph, Ontario. N1G 2W1
>
> Tel: (519) 824- 4120 ext 2304