[Bioperl-l] Taxa Id from blast report

Smithies, Russell Russell.Smithies at agresearch.co.nz
Tue Apr 23 22:32:25 UTC 2013


It works OK if I do it with NCBI's pre-formatted databases, e.g.

illustrious$ blastx -query gold_small.fa -db /bifo/infernal/active/blastdata/mirror/nr  -max_target_seqs 1 -outfmt "6 staxids sscinames sskingdoms"
411903  Collinsella aerofaciens ATCC 25986      Bacteria
411903  Collinsella aerofaciens ATCC 25986      Bacteria
39947   Oryza sativa Japonica Group     Eukaryota
39947   Oryza sativa Japonica Group     Eukaryota
39947   Oryza sativa Japonica Group     Eukaryota
498761  Heliobacterium modesticaldum Ice1       Bacteria
391296  Streptococcus suis 98HAH33      Bacteria
391296  Streptococcus suis 98HAH33      Bacteria
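
For anyone post-processing output like the above, here is a minimal Python sketch, assuming the three columns requested with -outfmt "6 staxids sscinames sskingdoms" arrive tab-separated (the archive has rendered the tabs as spaces). The function name parse_taxonomy_hits and the sample rows are illustrative only; the rows are copied from the output above.

```python
from collections import Counter


def parse_taxonomy_hits(lines):
    """Yield (taxid, scientific_name, super_kingdom) from tab-separated rows."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
        # The taxid is kept as a string rather than converted to int.
        yield tuple(fields)


# Sample rows taken from the output above, with tabs restored.
sample = [
    "411903\tCollinsella aerofaciens ATCC 25986\tBacteria",
    "39947\tOryza sativa Japonica Group\tEukaryota",
    "498761\tHeliobacterium modesticaldum Ice1\tBacteria",
]

# Tally hits per super kingdom.
kingdom_counts = Counter(kingdom for _, _, kingdom in parse_taxonomy_hits(sample))
print(kingdom_counts)
```

In practice the lines would come from the BLAST output file rather than a hard-coded list.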

Perhaps it's something to do with your database formatting or sequence IDs?

--Russell


From: shalu sharma [mailto:sharmashalu.bio at gmail.com]
Sent: Wednesday, 24 April 2013 5:14 a.m.
To: Jason Stajich
Cc: Smithies, Russell; Fields, Christopher J; Peter Cock; shalabh sharma; bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Taxa Id from blast report

Hi Jason,
Thanks a lot for your suggestion. I tried that too, but I am still not getting the super kingdom. Actually, I don't know how to put the super kingdom in the database.
For example, this is how I formatted my RefSeq microbial database:
makeblastdb -dbtype prot -in microbial_protein_mask.fasta -out refMicro -taxid_map GItaxa.txt -parse_seqids
(where GItaxa.txt is a file with lines of the form <GI> <TaxonomyId><newline>; there is no super kingdom in it).
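
Since the question of database formatting came up, a quick sanity check of the map file may be worth doing before running makeblastdb. This is a minimal Python sketch, assuming each line holds exactly two whitespace-separated fields as described above (<GI> <TaxonomyId>). The function name check_taxid_map and the sample IDs are hypothetical and not taken from the real GItaxa.txt.

```python
def check_taxid_map(lines):
    """Return (line_number, problem) pairs for rows that are not '<id> <taxid>'."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) != 2:
            problems.append((number, "expected 2 whitespace-separated fields"))
        elif not fields[1].isdigit():
            problems.append((number, "taxonomy ID is not numeric"))
    return problems


# Hypothetical sample: one good row, one short row, one non-numeric taxid.
sample = ["12345 246200\n", "67890\n", "13579 abc\n"]
problems = check_taxid_map(sample)
print(problems)
```

An empty list means every row has the expected two-column shape. It does not confirm that the IDs match the sequence identifiers in the FASTA file.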

So when I run this BLAST command, I get:
blastx -query test.fas -db refMicro -max_target_seqs 1 -outfmt "6 staxids sscinames sskingdoms"
246200           N/A     N/A
246200           N/A     N/A
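
In the output above the taxonomy ID column is filled in while the name and kingdom columns are N/A, which suggests the ID mapping worked but the name lookup did not. One commonly cited cause, not confirmed in this thread, is that BLAST+ cannot find NCBI's taxdb files (taxdb.btd and taxdb.bti) alongside the database or on the BLASTDB path. The Python sketch below only lists which taxids came back without names, assuming tab-separated columns; the function name taxids_missing_names is illustrative.

```python
def taxids_missing_names(lines, placeholder="N/A"):
    """Return taxids (in first-seen order) whose other columns contain the placeholder."""
    missing = []
    for line in lines:
        fields = [field.strip() for field in line.rstrip("\n").split("\t")]
        if len(fields) < 2:
            continue  # nothing to check on rows with a single column
        taxid, rest = fields[0], fields[1:]
        if any(value == placeholder for value in rest) and taxid not in missing:
            missing.append(taxid)
    return missing


# Rows modelled on the output above, plus one fully resolved row for contrast.
sample = [
    "246200\tN/A\tN/A",
    "246200\tN/A\tN/A",
    "39947\tOryza sativa Japonica Group\tEukaryota",
]
unresolved = taxids_missing_names(sample)
print(unresolved)
```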

I would really appreciate your help.

Thanks
Shalu

On Fri, Apr 19, 2013 at 3:38 PM, Jason Stajich <jason.stajich at gmail.com> wrote:
Did you provide -parse_seqids in the header?

Peter dealt with related things here:
http://blastedbio.blogspot.com/2012/10/my-ids-not-good-enough-for-ncbi-blast.html

Jason

On Apr 19, 2013, at 1:05 PM, shalu sharma <sharmashalu.bio at gmail.com> wrote:


Hi,
   Thanks everyone for your input.
@Peter:
I got really excited when I saw that you can even get the super kingdom, but
when I tried to test it I just got taxa IDs and not the super kingdom. Do
you have any idea what's going wrong?
My command:
blastx -query test.fas -db /db/ncbiblast/refseq/latest/refseq_protein -max_target_seqs 1 -outfmt "6 staxids sskingdoms"

output:
246200    N/A
246200    N/A

Thanks
Shalu


On Thu, Apr 18, 2013 at 3:52 PM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:


I agree they have finally listened and added features requested by users,
but I've been suggesting for years that they offer a compressed output format
from eutils or GenBank, and have made no headway ;-(
What's so hard about gzipping the output? I'm sure it would go a long way
toward solving all the problems we get with truncated replies from queries!

--Russell

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Fields, Christopher J
Sent: Friday, 19 April 2013 6:26 a.m.
To: Peter Cock
Cc: bioperl-l at lists.open-bio.org; shalu sharma; shalabh sharma
Subject: Re: [Bioperl-l] Taxa Id from blast report

On Apr 18, 2013, at 11:48 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:


On Thu, Apr 18, 2013 at 5:32 PM, shalabh sharma <shalabh.sharma7 at gmail.com> wrote:

Hey Peter,
     Thanks a lot, I really appreciate it. I have wanted these things
implemented in BLAST for a long time.

Thanks
Shalabh

Me too. You can get the descriptions from the plain text BLAST or XML
output already of course, but they're not so nice to work with.

Peter

NCBI has been much more receptive of user input over the last several
years, much more so than in the past.  I understand the reasoning for
dropping BLAST support (though there were definitely needless bumps in that
process).

chris


_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================

Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


