[Bioperl-l] blast output -> blast -m8 output
Amir Karger
akarger at CGR.Harvard.edu
Wed Jan 11 16:13:41 EST 2006
> From: Jason Stajich [mailto:jason.stajich at duke.edu]
>
> The existing search2table script in scripts/searchio does this for
> you - I don't think there is a writer plugin but there could be.
Ah, nice. But:
-------------------
>perl bioperl-1.5.0-RC1/scripts/searchio/search2table.PLS seqs.blp > zzz
>more zzz
Bacteriophage_1[M19348] ref|NP_037061.1| 40.32 62 27 4 28 89 1050 1107 6e-05 46.6
Bacteriophage_1[M19348] ref|XP_193814.5| 48.89 45 16 6 57 95 320 364 0.001 42.7
Bacteriophage_1[M19348] ref|XP_912463.1| 48.89 45 16 6 57 95 866 910 0.001 42.7
Bacteriophage_1[M19348] ref|XP_619329.2| 48.89 45 16 6 57 95 676 720 0.001 42.7
C.elegans_1_[Z49071] ref|XP_917828.1| 29.61 412 183 48 40 410 52 456 6e-43 173
C.elegans_1_[Z49071] gb|AAI10184.1| 31.99 347 147 23 40 373 53 389 6e-42 169
>more seqs.m8
Bacteriophage_1[M19348] gi|6978677|ref|NP_037061.1| 40.32 62 33 1 28 89 1050 1107 6e-05 46.6
Bacteriophage_1[M19348] gi|82958039|ref|XP_193814.5| 48.89 45 17 1 57 95 320 364 0.001 42.7
Bacteriophage_1[M19348] gi|82958037|ref|XP_912463.1| 48.89 45 17 1 57 95 866 910 0.001 42.7
Bacteriophage_1[M19348] gi|82957449|ref|XP_619329.2| 48.89 45 17 1 57 95 676 720 0.001 42.7
C.elegans_1_[Z49071] gi|82802536|ref|XP_917828.1| 29.61 412 242 9 40 410 52 456 6e-43 173
C.elegans_1_[Z49071] gi|82571607|gb|AAI10184.1| 31.99 347 213 11 40 373 53 389 6e-42 169
-----------------
I know we can't get around the problem of the IDs, since the default blast
report and blast -m8 give different IDs. But columns 5 and 6 (mismatches, gap
openings) are consistently different: for the first hit, search2table reports
27 and 4 where -m8 reports 33 and 1. Is search2table not trying to mimic -m8
exactly, or is this a bug?
Apologies if this is due to using bioperl 1.4 with the PLS script from
1.5.0-RC1. That's what I have on hand.
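For anyone who wants to check this on their own files, here is a rough Python sketch that names the twelve tabular columns and reports which ones differ between a search2table row and the matching -m8 row. The helper names (parse_m8_line, short_id, differing_columns) are made up for this illustration and are not part of bioperl or BLAST, and the ID normalisation is only an assumption that happens to fit the IDs shown above.

```python
# The 12 columns of BLAST tabular (-m8) output, in order.
M8_COLUMNS = [
    "query", "subject", "pct_identity", "aln_length", "mismatches",
    "gap_openings", "q_start", "q_end", "s_start", "s_end",
    "evalue", "bit_score",
]


def parse_m8_line(line):
    """Split one whitespace-separated tabular row into a dict keyed by column name."""
    fields = line.split()
    if len(fields) != len(M8_COLUMNS):
        raise ValueError(f"expected {len(M8_COLUMNS)} fields, got {len(fields)}")
    return dict(zip(M8_COLUMNS, fields))


def short_id(subject):
    """Assumed normalisation: reduce 'gi|N|db|acc|' and 'db|acc|' to 'db|acc'."""
    parts = [p for p in subject.split("|") if p]
    return "|".join(parts[-2:])


def differing_columns(row_a, row_b):
    """Return the names of columns whose text differs, ignoring the ID prefix."""
    a, b = parse_m8_line(row_a), parse_m8_line(row_b)
    a["subject"], b["subject"] = short_id(a["subject"]), short_id(b["subject"])
    return [name for name in M8_COLUMNS if a[name] != b[name]]


# First hit from each listing above.
table_row = ("Bacteriophage_1[M19348] ref|NP_037061.1| 40.32 62 27 4 "
             "28 89 1050 1107 6e-05 46.6")
m8_row = ("Bacteriophage_1[M19348] gi|6978677|ref|NP_037061.1| 40.32 62 33 1 "
          "28 89 1050 1107 6e-05 46.6")

print(differing_columns(table_row, m8_row))
```

The comparison is on the raw text of each field, so it would also flag purely cosmetic differences (say, 0.001 versus 1e-03); that seemed acceptable for a quick check.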
>
> Note that if you are just using BLAST you will find that the blast2table
> script that is included in the BLAST book (see the O'Reilly website
> for the book and download the code examples) will also generate this
> sort of thing for you and will be many times faster than SearchIO
> code.
I could steal that. But I was thinking that if NCBI changes the BLAST
format, bioperl may upgrade while the dead trees code won't.
- Amir Karger
Computational Biology Group
Bauer Center for Genomics Research
Harvard University
617-496-0626
> There is also an equivalent hmmer_to_table and
> fastam9_to_table which are very fast re-formatters that don't
> actually use SearchIO since one is just trying to get the
> very simple data out.
>
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12/