[Bioperl-l] bp_search2gff.pl

Chris Fields cjfields at uiuc.edu
Fri Oct 5 19:51:20 UTC 2007


We might want to file this as a bug so we can track it.

The core devs have been mulling over the state of GFF/GFF3 in
BioPerl, and proper handling of SearchIO data is certainly part of
that.  I believe a way forward will be planned soon (after
Genome Informatics).

chris

On Oct 5, 2007, at 2:35 PM, Eric Just wrote:

> Hello,
>
> I have been playing with the bp_search2gff.pl script (on HEAD of
> bioperl-live).  There are a couple of issues I was wondering about.
>
> One is the ID that gets generated for a match feature when the --match
> option is set.  The ID is set to the ID of the query sequence.  This
> can be problematic if you are representing the query sequence and the
> blast hit in the same GFF file.  When the resulting GFF file is
> loaded into Chado, it also creates a problem if you have more than
> one hit for a given query sequence, for example if you ran two
> different analyses that each had a hit for a given query.  Would it be
> possible to have an option to create a unique ID for match features?
> One suggestion would be to create an ID based on the ID of the query +
> the ID of the hit + the source.
>
> As long as two different analyses were loaded as different sources,
> this would ensure unique IDs for the match features.
>
>
> Also, is there a reason for writing the Target string as
>
> Target=Sequence:SOME_ID
>
> as opposed to
>
> Target=SOME_ID
>
>
> The latter seems a little more in line with the GFF3 spec and plays a
> little nicer with the GMOD tools.
>
> Thanks for looking into this.
>
> Eric

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
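
For illustration, the two suggestions in Eric's message can be sketched
as follows.  This is a language-neutral sketch in Python, not code from
bp_search2gff.pl: the function names (gff3_escape, match_id,
target_attribute) and the "__" separator are hypothetical.  It assumes
the GFF3 convention that Target takes the form "target_id start end
[strand]", with reserved characters percent-escaped.

```python
def gff3_escape(value, extra=""):
    """Percent-escape characters reserved in GFF3 attribute values.

    `extra` lets the caller escape additional characters, e.g. a space
    inside a Target ID.
    """
    reserved = set(";=&,%\t\n\r" + extra)
    out = []
    for ch in value:
        if ch in reserved or ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


def match_id(query_id, hit_id, source, sep="__"):
    """Build a match-feature ID from query ID + hit ID + source.

    The separator is an arbitrary choice made for this sketch.
    """
    return sep.join(gff3_escape(part) for part in (query_id, hit_id, source))


def target_attribute(target_id, start, end, strand=None):
    """Format Target=ID start end [strand], with no 'Sequence:' prefix."""
    fields = [gff3_escape(target_id, extra=" "), str(start), str(end)]
    if strand is not None:
        fields.append(strand)
    return "Target=" + " ".join(fields)


# Two analyses hitting the same query/hit pair get distinct IDs,
# provided they are loaded under different sources.
print(match_id("query1", "hitA", "blastn"))
print(match_id("query1", "hitA", "tblastx"))
print(target_attribute("SOME_ID", 1, 100, "+"))
```

An ID built this way is only as unique as the (query, hit, source)
triple, so two runs loaded under the same source would still collide.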





