[Bioperl-l] BLAST output parsing

Barry Moore barry.moore at genetics.utah.edu
Thu Nov 1 04:03:01 UTC 2007


Swapna-

If you are using NCBI FASTA files, you can use files from NCBI's Gene
database to map your gene IDs to names and organisms. Look in
particular at the files gene2accession, gene2refseq, and gene_info.
For example, if you had RefSeq protein IDs like NP_123456, you could
use gene2refseq to map those RefSeq accessions to gene IDs, and then
gene_info to map the gene IDs to organisms and gene names.
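
A rough sketch of that two-step lookup follows, in Python only to keep it
short; the same logic ports directly to Perl hashes. Everything in it is an
assumption to check against your own files: the function names are made up
for illustration, the subject ID is taken from the second column of the
-m 9 table, and the column positions (protein accession in column 6 of
gene2refseq; tax ID, symbol, and description in columns 1, 3, and 9 of
gene_info) should be confirmed against the "#" header line of each file.
Note that gene_info, as far as I know, carries a taxonomy ID rather than an
organism name, so one more lookup is needed if you want the name itself.

```python
def read_table(lines):
    """Yield tab-split fields, skipping blank lines and '#' comment/header lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield line.rstrip("\n").split("\t")


def refseq_accession(subject_id):
    """Pull the accession out of an ID such as 'gi|999|ref|NP_123456.2|'.

    Bare accessions pass through unchanged; the version suffix is dropped.
    """
    parts = subject_id.split("|")
    if "ref" in parts[:-1]:
        acc = parts[parts.index("ref") + 1]
    else:
        acc = subject_id
    return acc.split(".")[0]


def blast_subject_ids(blast_lines):
    """Collect the distinct subject accessions from tabular BLAST output."""
    return {refseq_accession(f[1]) for f in read_table(blast_lines) if len(f) > 1}


def annotate_hits(blast_lines, gene2refseq_lines, gene_info_lines):
    """Return {accession: (gene_id, tax_id, symbol, description)}."""
    wanted = blast_subject_ids(blast_lines)

    # Step 1: RefSeq protein accession -> gene ID (assumed columns 6 and 2).
    acc_to_gene = {}
    for f in read_table(gene2refseq_lines):
        if len(f) > 5:
            acc = f[5].split(".")[0]
            if acc in wanted:
                acc_to_gene[acc] = f[1]

    # Step 2: gene ID -> tax ID, symbol, description (assumed columns 1, 3, 9).
    gene_ids = set(acc_to_gene.values())
    info = {}
    for f in read_table(gene_info_lines):
        if len(f) > 8 and f[1] in gene_ids:
            info[f[1]] = (f[0], f[2], f[8])

    return {acc: (gid,) + info[gid]
            for acc, gid in acc_to_gene.items() if gid in info}
```

Each argument can be an open file handle, for example
annotate_hits(open("hits.tab"), open("gene2refseq"), open("gene_info")),
where hits.tab is a placeholder name for your -m 9 output. The files are
read line by line and only the rows matching your hits are kept, which
matters because the gene files are large.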

B

On Oct 31, 2007, at 7:27 PM, Torsten Seemann wrote:

> Swapna,
>
>> I am new to bioperl.  I did BLAST search of ~4000 genes and I need
>> to parse it.  I did use -m 9 option to get a tabular information of
>> the blast data.  But it does not include the gene names or the names
>> of the organisms of each hit.  Are there any parsers that can do this
>> job ??
>
> The -m 9 tabular output does not include gene descriptions and
> organisms. It only includes the "gene id" that was present immediately
> after the ">" sign in the FASTA file that was used to create the BLAST
> database you specified with the -d option when you ran BLAST.
>
> Hence, no parser will help you. You either have to re-do the BLAST
> with a different -m value that includes the information you desire, or
> write code to convert your gene IDs into what you want.
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University



