[Bioperl-l] Converting blast+ output to gff (with gaps)

Cook, Malcolm MEC at stowers.org
Fri Jan 4 20:20:17 UTC 2013


Jim,

Getting to your original question:

> I'm looking for a script that will take one of the blast+ outformats that includes the positions of gaps and mismatches, and create gff with appropriate subfeatures.

Exactly how do you want or expect the blast output to be encoded, and in which GFF version (GFF{1,2,2.5,3})?

If GFF3 per http://www.sequenceontology.org/gff3.shtml, then are you hoping to get GFF3 marked up as described in section 'THE GAP ATTRIBUTE' or as in 'ALIGNMENTS'?

I would guess not, because neither of them has 'subfeatures'.

If you could explain more fully with examples (hand cobbled or borrowed from someone else) of what you expect then I might have a better idea of what options might suit your needs.
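
To make the 'Gap attribute' option concrete, here is a rough sketch (in Python rather than BioPerl, purely for illustration) of how one hit might be turned into a single GFF3 line with no subfeatures. It assumes you ran blast+ with a tabular outfmt that includes the btop column, that the alignment is nucleotide vs. nucleotide on the plus strand, and that the subject is the GFF3 reference with the query as the Target. The function names, the 'blastn' source and the 'match' type are my own placeholders, not anything from bp_search2gff.pl. Mismatches are folded into M runs, since Gap only records gaps.

```python
import re

def btop_to_gap(btop):
    """Turn a BLAST+ BTOP string into a GFF3 Gap attribute value.

    Assumption: subject = GFF3 reference (column 1), query = Target.
    BTOP tokens are either a run length of identities (digits) or a
    two-character pair: query character followed by subject character.
    """
    ops = []  # list of [code, length], adjacent equal codes are merged

    def push(code, n):
        if n <= 0:
            return
        if ops and ops[-1][0] == code:
            ops[-1][1] += n
        else:
            ops.append([code, n])

    for number, pair in re.findall(r"(\d+)|([A-Za-z*-]{2})", btop):
        if number:
            push("M", int(number))   # run of identical positions
        elif pair[0] == "-":
            push("D", 1)             # gap in the query (the Target)
        elif pair[1] == "-":
            push("I", 1)             # gap in the subject (the reference)
        else:
            push("M", 1)             # mismatch: still an aligned column
    return " ".join(f"{code}{n}" for code, n in ops)

def hit_to_gff3(sseqid, sstart, send, qseqid, qstart, qend, bitscore, btop):
    """Format one plus-strand hit as a GFF3 line (hypothetical helper)."""
    attrs = f"Target={qseqid} {qstart} {qend};Gap={btop_to_gap(btop)}"
    cols = [sseqid, "blastn", "match", str(sstart), str(send),
            str(bitscore), "+", ".", attrs]
    return "\t".join(cols)

if __name__ == "__main__":
    # Made-up hit, only to show the shape of the output line.
    print(hit_to_gff3("chr1", 100, 110, "est1", 1, 11, 20.3, "5-A3T-2"))
```

If what you want is instead one subfeature per ungapped block or per mismatch, that is a different encoding and the sketch above would not produce it.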


~Malcolm


 .-----Original Message-----
 .From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jim Hu
 .Sent: Friday, January 04, 2013 1:50 PM
 .To: Brian Osborne
 .Cc: Fields, Christopher J; Scott Cain; bioperl-l at bioperl.org
 .Subject: Re: [Bioperl-l] Converting blast+ output to gff (with gaps)
 .
 .Thanks for the replies, but...
 .
 .I can't tell what input formats for the blast results file are supported.  Format 11 and format 6 give no output and no feedback. Putting
 .some diagnostic print statements in the code suggests that I'm not getting any result objects from Bio::SearchIO.
 .
 .The script uses Bio::SearchIO, but does not seem to call the submodules for blast.  Documentation links on the wiki seem to be
 .broken, at least on this page:
 .
 .	http://www.bioperl.org/wiki/Module:Bio::SearchIO
 .
 .Jim
 .
 .
 .On Jan 2, 2013, at 4:53 PM, Brian Osborne wrote:
 .
 .> Scott and Chris,
 .>
 .> I'll test it and see...
 .>
 .> Brian O.
 .>
 .>
 .> On Jan 2, 2013, at 5:26 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
 .>
 .>> It should (I recall using it at one point).  If it doesn't we should fix it so it does.
 .>>
 .>> How does MAKER deal with this?  IIRC it uses (a modified) SearchIO-based method...
 .>>
 .>> chris
 .>>
 .>> On Jan 2, 2013, at 3:32 PM, Scott Cain <scott at scottcain.net> wrote:
 .>>
 .>>> Hi Brian,
 .>>>
 .>>> I was going to suggest the same thing--though that script is fairly
 .>>> old, it's not as old as the blast2gff script in the GBrowse
 .>>> distribution (which probably should be retired).  I believe it
 .>>> supports GFF3, though I don't have any sample data with which to test
 .>>> it to be sure.  I also don't know if it supports BLAST+ input--I
 .>>> haven't kept up with SearchIO (on which search2gff.pl depends); will
 .>>> it accept it?
 .>>>
 .>>> Scott
 .>>>
 .>>>
 .>>> On Wed, Jan 2, 2013 at 3:26 PM, Brian Osborne <bosborne11 at verizon.net> wrote:
 .>>>> Here's one:
 .>>>>
 .>>>> https://github.com/GMOD/GBrowse/blob/master/contrib/blast2gff.pl
 .>>>>
 .>>>> Another one:
 .>>>>
 .>>>> ~/git/bioperl-live>head scripts/utilities/bp_search2gff.pl
 .>>>> #!perl
 .>>>>
 .>>>> # Author:      Jason Stajich <jason-at-bioperl-dot-org>
 .>>>> # Description: Turn SearchIO parseable report(s) into a GFF report
 .>>>> #
 .>>>> =head1 NAME
 .>>>>
 .>>>> bp_search2gff - Turn SearchIO parseable reports(s) into a GFF report
 .>>>>
 .>>>>
 .>>>>
 .>>>> Brian O.
 .>>>>
 .>>>> On Jan 2, 2013, at 2:44 PM, Jim Hu <jimhu at tamu.edu> wrote:
 .>>>>
 .>>>>> I assume this has already been done many times, but I can't seem to find it on bioperl.org or via google.
 .>>>>>
 .>>>>> I'm looking for a script that will take one of the blast+ outformats that includes the positions of gaps and mismatches, and create gff with appropriate subfeatures.
 .>>>>>
 .>>>>> Thanks,
 .>>>>>
 .>>>>> Jim
 .>>>>> =====================================
 .>>>>> Jim Hu
 .>>>>> Professor
 .>>>>> Dept. of Biochemistry and Biophysics
 .>>>>> 2128 TAMU
 .>>>>> Texas A&M Univ.
 .>>>>> College Station, TX 77843-2128
 .>>>>> 979-862-4054
 .>>>>>
 .>>>>>
 .>>>>>
 .>>>>> _______________________________________________
 .>>>>> Bioperl-l mailing list
 .>>>>> Bioperl-l at lists.open-bio.org
 .>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
 .>>>>
 .>>>>
 .>>>
 .>>>
 .>>>
 .>>> --
 .>>> ------------------------------------------------------------------------
 .>>> Scott Cain, Ph. D.                                   scott at scottcain dot net
 .>>> GMOD Coordinator (http://gmod.org/)                     216-392-3087
 .>>> Ontario Institute for Cancer Research
 .>>
 .>
 .
 .=====================================
 .Jim Hu
 .Professor
 .Dept. of Biochemistry and Biophysics
 .2128 TAMU
 .Texas A&M Univ.
 .College Station, TX 77843-2128
 .979-862-4054
 .
 .
 .



