[Bioperl-l] GFF file output missing semicolon
Wes Barris
wes.barris at csiro.au
Sun Nov 23 19:35:46 EST 2003
Jason Stajich wrote:
> I think that the gff2 dumping was not particularly good - I think I made
> some fixes to clean it up on the main trunk in the last few months. I can
> certainly dump with Tools::GFF and load into Gbrowse just fine with
> the current code. Wes, you might try the bioperl 1.3.x series
> Bio::Tools::GFF instead.
I could try, but this is running on a production server, and I had a heck
of a time finding a combination of bioperl and gbrowse that would work
together. At the time, the only combo I could get to work was
bioperl-1.2.2 and gbrowse-1.50. If I installed bioperl-1.2.3, what
version of gbrowse is guaranteed to work with it?
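
In the meantime, one workaround that leaves the installed bioperl alone
would be to post-process the generated file. Below is a minimal sketch in
plain Perl (no bioperl calls). It reads the attribute column using the GFF2
rules Lincoln describes in the quoted message (semicolon-separated pairs,
any amount of whitespace around the semicolon, last pair not necessarily
terminated) and writes it back with tidy spacing and a trailing semicolon.
The sub names parse_gff2_attributes, format_gff2_attributes and
normalize_gff2_line are made up for illustration. The trailing semicolon
and the Accession-first ordering are assumptions about what the browser
wants, based only on the symptoms described in the original report.

```perl
use strict;
use warnings;

# Split a GFF2 attribute column into [tag, value] pairs.
# Semicolons inside double-quoted values are not treated as separators.
sub parse_gff2_attributes {
    my ($column) = @_;
    my @pairs;
    # Each chunk is a maximal run of quoted strings and non-semicolon text.
    my @chunks = $column =~ /((?:"[^"]*"|[^;"])+)/g;
    for my $chunk (@chunks) {
        $chunk =~ s/^\s+|\s+$//g;        # whitespace around ';' is allowed
        next if $chunk eq '';            # e.g. blanks after a final ';'
        my ($tag, $value) = split /\s+/, $chunk, 2;
        push @pairs, [ $tag, defined $value ? $value : '' ];
    }
    return @pairs;
}

# Join the pairs with ' ; ' and terminate the last pair with ' ;'.
# (Assumption: the consumer wants a trailing semicolon.)
sub format_gff2_attributes {
    my (@pairs) = @_;
    return '' unless @pairs;
    my @items = map { $_->[1] eq '' ? $_->[0] : "$_->[0] $_->[1]" } @pairs;
    return join(' ; ', @items) . ' ;';
}

# Rewrite column 9 of one tab-separated GFF2 line; other lines pass through.
sub normalize_gff2_line {
    my ($line) = @_;
    chomp $line;
    return $line if $line =~ /^#/;
    my @cols = split /\t/, $line, 9;
    return $line if @cols < 9;
    my @pairs = parse_gff2_attributes($cols[8]);
    # Assumption: put Accession first, keeping the other tags in order.
    @pairs = ( (grep { $_->[0] eq 'Accession' } @pairs),
               (grep { $_->[0] ne 'Accession' } @pairs) );
    $cols[8] = format_gff2_attributes(@pairs);
    return join "\t", @cols;
}

# Example: a line shaped like the one in the original report.
my $sample = join "\t",
    qw(AF354168 mirseeker pred_miRNA 188152 188251 198 - .),
    'Note "mirseeker score 17.58"   ;   Accession "s-h_19_r_99330000-99363000"   ';
print normalize_gff2_line($sample), "\n";
```

Something like this could be run as a filter over the file before loading
it, feeding each input line through normalize_gff2_line. It only rewrites
the ninth column, so the coordinates and scores are passed through
untouched.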
>
> -jason
>
> On Fri, 21 Nov 2003, Lincoln Stein wrote:
>
>
>>Hi,
>>
>>The GFF2 spec specifies that the semicolon separates tag/value pairs. It does
>>not say that the last tag/value should be terminated by a semicolon. It also
>>specifies that any amount of whitespace can occur around the semicolon.
>>
>>Lincoln
>>
>>On Thursday 20 November 2003 11:19 pm, Wes Barris wrote:
>>
>>>Hi,
>>>
>>>I have written a bioperl program that parses blast files and generates
>>>a gff file. I have everything working except there is one small detail
>>>that I have not been able to figure out. When generating each line
>>>of gff output, the semicolon is left off at the end of the Accession
>>>name. Here is a sample line from a gff file that I generated:
>>>
>>>AF354168 mirseeker pred_miRNA 188152 188251 198 - . Note "mirseeker score 17.58" ; Accession "s-h_19_r_99330000-99363000"
>>>
>>>Notice that:
>>>
>>>1) There are three space characters after the note and the semicolon
>>> that occurs before "Accession".
>>>
>>>2) At the end of the line, after the Accession, there are three space
>>> characters and no semicolon. Without that semicolon, the genome
>>> browser doesn't display the "rollover" information properly.
>>>
>>>3) The "Note" field is written before the "Accession" field. I thought
>>> that the Accession should come first.
>>>
>>>Here is the relevant portion of my code:
>>>
>>>    while( my $hsp = $hit->next_hsp ) {
>>>       my $strand = 1;
>>>       $strand = -1 if ($hsp->strand('query') == -1 ||
>>>                        $hsp->strand('hit') == -1);
>>>       my $feature = new Bio::SeqFeature::Generic(
>>>          -source_tag=>$source,
>>>          -primary_tag=>$feature_type,
>>>          -start=>$hsp->start('hit'),
>>>          -end=>$hsp->end('hit'),
>>>          -score=>$hit->raw_score,
>>>          -strand=>$strand,
>>>          -tag=>{
>>>             Accession=>$result->query_name,
>>>             Note=>$result->query_description,
>>>          }
>>>       );
>>>       $feature->seq_id($hit->accession);
>>>       $gffio->write_feature($feature); #Bio::SeqFeatureI
>>>    }
>>>
>>>Perhaps I am not adding the "Accession" and "Note" fields properly???
>>
>>
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
--
Wes Barris
E-Mail: Wes.Barris at csiro.au