[Bioperl-l] recovering blast query_name
Wiepert, Mathieu
Wiepert.Mathieu@mayo.edu
Wed, 20 Nov 2002 15:57:14 -0600
Sorry for all the posts :-| The latest RemoteBlast has moved to bioperl-run, not bioperl-live...
http://doc.bioperl.org/bioperl-run/
-Mat
Mathieu Wiepert
Medical Informatics Research
Mayo Foundation
(507) 266-2317 Fax (507)-284-0360
wiepert.mathieu@mayo.edu
> -----Original Message-----
> From: Wiepert, Mathieu
> Sent: Wednesday, November 20, 2002 3:51 PM
> To: 'Lewis Lukens'; bioperl-l@bioperl.org
> Subject: RE: [Bioperl-l] recovering blast query_name
>
>
> Hi,
>
> I made a few assumptions in my previous answer, sorry. You
> need bioperl-live to get that to work; I don't think it is in
> the 1.0.2 distro.
>
> Additionally, I only tested with fasta files. I assume that
> anything else will still work, as long as the sequence has a
> description. The query name is built up like this:
>
> $header{'QUERY'} = ">" . (defined $seq->display_id() ? $seq->display_id() : "")
>                  . " " . (defined $seq->desc() ? $seq->desc() : "")
>                  . "\n" . $seq->seq();
>
> So, do the sequences have to have a display id and description
> to get a query name?
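
To make the header construction above concrete, here is a minimal sketch in plain Perl. It does not use BioPerl: MockSeq is a hypothetical stand-in that only provides the three accessors the quoted snippet calls (display_id, desc, seq), and build_query is an illustrative name, not a function in RemoteBlast.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object (assumption, not BioPerl);
# it only offers the accessors used by the quoted snippet.
package MockSeq;
sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}
sub display_id { $_[0]{display_id} }
sub desc       { $_[0]{desc} }
sub seq        { $_[0]{seq} }

package main;

# Build the FASTA-style query text the way the quoted snippet does:
# ">" + display id + " " + description, a newline, then the sequence.
# Undefined id or description falls back to the empty string.
sub build_query {
    my ($seq) = @_;
    my $id   = defined $seq->display_id ? $seq->display_id : "";
    my $desc = defined $seq->desc       ? $seq->desc       : "";
    return ">" . $id . " " . $desc . "\n" . $seq->seq;
}

my $full = MockSeq->new(
    display_id => 'U20499_EXON_1A',
    desc       => '2848-2960 of U20499',
    seq        => 'acactggacc',
);
my $bare = MockSeq->new(seq => 'acactggacc');

print build_query($full), "\n";
print build_query($bare), "\n";
```

In this sketch a sequence with neither a display id nor a description still produces a header line, but that line is just "> ", so there is nothing for a query name to be taken from.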
>
>
> My previous example was only slightly off: I left out the
> description.
>
> >U20499_EXON_1A 2848-2960 of U20499
> acactggaccttcaaaaccctcagggcagagagcagccctacactccctacaccacaccc
> atactcagcccctgcaggcaaggagagaacaggtcaggttcccgagagctcag
>
> results in query name of
> U20499_EXON_1A 2848-2960 of U20499
>
> parsed from the header of this blast result (saved from the
> remote blast)
>
> BLASTN 2.2.4 [Aug-26-2002]
>
>
> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
> "Gapped BLAST and PSI-BLAST: a new generation of protein database search
> programs", Nucleic Acids Res. 25:3389-3402.
> RID: 1033569396-029169-20578
> Query= U20499_EXON_1A 2848-2960 of U20499
> (113 letters)
>
> Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,
> or phase 0, 1 or 2 HTGS sequences)
> 1,406,693 sequences; 6,799,009,920 total letters
>
> Check the actual blast results and make sure they have the
> query name in them. If they don't, then we have a problem...
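
One quick way to do that check on a saved report is to look for the "Query=" line directly. The following is a rough sketch in plain Perl, assuming the report is available as text; find_query_line is a hypothetical helper name, and it only reads the first "Query=" line without handling wrapped continuation lines.

```perl
use strict;
use warnings;

# Return the text following "Query=" in a BLAST text report.
# Returns undef if the report has no such line, and an empty string
# if the line is present but carries no name.
sub find_query_line {
    my ($report) = @_;
    for my $line (split /\n/, $report) {
        if ($line =~ /^\s*Query=\s*(.*?)\s*$/) {
            return $1;
        }
    }
    return;
}

# Excerpt of the report header quoted in this thread.
my $report = <<'END';
BLASTN 2.2.4 [Aug-26-2002]

RID: 1033569396-029169-20578
Query= U20499_EXON_1A 2848-2960 of U20499
         (113 letters)
END

my $query = find_query_line($report);
if (defined($query) && length($query)) {
    print "query line: $query\n";
} else {
    print "no query name in report\n";
}
```

If this finds a name but the parsed result still has an empty query name, the problem would be on the parsing side rather than in the submitted query.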
>
> Here is the more current documentation
> http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html
>
> -Mat
>
>
> > -----Original Message-----
> > From: Lewis Lukens [mailto:llukens@uoguelph.ca]
> > Sent: Wednesday, November 20, 2002 2:49 PM
> > To: bioperl-l@bioperl.org
> > Subject: [Bioperl-l] recovering blast query_name
> >
> >
> > Hello,
> >
> > Sorry for a basic question... I have been trying to use the
> > Bio::Tools::Run::RemoteBlast module to blast a single file with many
> > fasta formatted sequences against ncbi nt and parse the blast reports.
> > Almost everything is working well. I get all the hit and hsp
> > features for all the hits. I can recover the query sequence, but I
> > can't seem to recover the query sequence names. How does one do this?
> >
> > I used almost exactly the code from the RemoteBlast Synopsis
> > http://doc.bioperl.org/releases/bioperl-1.0.2/Bio/Tools/Run/RemoteBlast.html
> >
> > In this code, this expression works:
> > print "db is ", $result->database_name(), "\n";
> >
> > but these expressions return empty fields:
> > my $name = $result->query_name();
> > my $desc = $result->query_description();
> > my $acc  = $result->query_accession();
> >
> > I have been using SearchIO to parse blast output files and never had
> > this problem before. Any ideas?
> >
> > Thanks much,
> > Lewis
> > --
> > Lewis Lukens
> > Assistant Professor
> > Department of Plant Agriculture
> > Univ. of Guelph, Guelph, Ontario. N1G 2W1
> >
> > Tel: (519) 824- 4120 ext 2304