[Bioperl-l] Re: [Bioperl-guts-l] bioperl-live/Bio/FeatureIO gff.pm, 1.16, 1.17

Lincoln Stein lstein at cshl.edu
Tue Nov 23 13:57:01 EST 2004


A group is absolutely not required to end with a ### directive.  It is 
just a hint to the GFF parser that it no longer has to keep track of 
previously-loaded features in case a child appears somewhere toward 
the end of the file.
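
To illustrate that hint, here is a minimal sketch, not the actual Bio::FeatureIO code. The name read_groups and the hash layout are hypothetical, and it assumes one Parent per feature with parents listed before their children. The reader keeps features indexed by ID until it sees ### or reaches EOF, then hands back the top-level ones as a batch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): read GFF3 text and return a
# list of groups, each an array ref of top-level feature hashes.
sub read_groups {
    my ($text) = @_;
    my (%by_id, @top, @groups);

    # Emit the pending top-level features as one group and forget them.
    my $flush = sub {
        push @groups, [@top] if @top;
        %by_id = ();
        @top   = ();
    };

    for my $line (split /\n/, $text) {
        if ($line =~ /^###\s*$/) {    # hint: no later line refers back
            $flush->();
            next;
        }
        next if $line =~ /^#/ or $line !~ /\S/;    # other directives, blanks

        my @col = split /\t/, $line;
        next unless @col >= 9;

        # Parse column 9 (key=value pairs separated by ';').
        my %attr;
        for my $pair (split /;/, $col[8]) {
            my ($key, $value) = split /=/, $pair, 2;
            $attr{$key} = $value if defined $value;
        }

        my $feat = { type => $col[2], id => $attr{ID}, children => [] };
        $by_id{ $attr{ID} } = $feat if defined $attr{ID};

        # Attach to a known parent, otherwise treat as top-level.
        my $parent = defined $attr{Parent} ? $by_id{ $attr{Parent} } : undef;
        if ($parent) { push @{ $parent->{children} }, $feat }
        else         { push @top, $feat }
    }

    $flush->();    # EOF closes whatever is still pending
    return @groups;
}
```

With no ### in the input, everything is simply held until EOF and comes back as one batch, so the directive only affects how long features stay in memory, not whether the hierarchy can be rebuilt.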

Lincoln

On Tuesday 23 November 2004 01:16 pm, Chris Mungall wrote:
> Is a group defined as a set of connected features?
>
> Is the group required to end with a ### directive? This could be
> checked for automatically by testing whether each feature is
> connected to the current feature graph. Or do we want to allow the
> data producer to define their own concept of grouping (if so this
> probably wouldn't round trip).
>
> What about singleton features such as SNPs - is a SNP in an
> intergenic area a group unto itself? (if so, we shouldn't require
> the ### directive after each one)
>
> Note that there's already code for reconstituting the SeqFeature
> hierarchy from the ID/Parent tags in Bio::SeqFeature::Tools
>
> Cheers
> Chris
>
> On Tue, 23 Nov 2004, Steffen Grossmann wrote:
> > Dear Allen, dear Scott,
> >
> > before we write a next_sequence method, we should have something
> > which is able to reconstruct a set of hierarchically nested
> > features. Any suggestions for method names? How about next_group?
> > next_group gives back an array of features (which represent the
> > top-level features, the lower features appear as subfeatures). A
> > group is ended by a ### directive (or by the EOF). A
> > next_sequence method could then also use this nesting...
> >
> > I have ideas how to realize the implementation. Tell me what you
> > think about it and I can start doing it.
> >
> > Steffen
> >
> > Allen Day wrote:
> > >there should be a next_sequence method.  i wrote this into
> > >Bio::Tools::GFF, we should pretty much be able to just
> > > copy/paste it over.
> > >
> > >-allen
> > >
> > >On Tue, 16 Nov 2004, Scott Cain wrote:
> > >>Update of /home/repository/bioperl/bioperl-live/Bio/FeatureIO
> > >>In directory pub.open-bio.org:/tmp/cvs-serv5204
> > >>
> > >>Modified Files:
> > >>	gff.pm
> > >>Log Message:
> > >>added stuff to support fasta and target processing.  The
> > >>question remains what to do with this data once you have
> > >>it--particularly the fasta data.  Should there be (or is
> > >>there) a next_sequence() method?
> > >>
> > >>
> > >>Index: gff.pm
> > >>===================================================================
> > >>RCS file: /home/repository/bioperl/bioperl-live/Bio/FeatureIO/gff.pm,v
> > >>retrieving revision 1.16
> > >>retrieving revision 1.17
> > >>diff -C2 -d -r1.16 -r1.17
> > >>*** gff.pm	16 Nov 2004 16:22:53 -0000	1.16
> > >>--- gff.pm	16 Nov 2004 19:35:09 -0000	1.17
> > >>***************
> > >>*** 211,215 ****
> > >>    return undef unless $gff_string;
> > >>
> > >>!   if($gff_string =~ /^##/){
> > >>      $self->_handle_directive($gff_string);
> > >>      return $self->next_feature();
> > >>--- 211,215 ----
> > >>    return undef unless $gff_string;
> > >>
> > >>!   if($gff_string =~ /^##/ or $gff_string =~ /^>/){
> > >>      $self->_handle_directive($gff_string);
> > >>      return $self->next_feature();
> > >>***************
> > >>*** 248,255 ****
> > >>    }
> > >>
> > >>!   elsif($directive eq 'FASTA'){
> > >>      $self->warn("'##$directive' directive handling not yet implemented");
> > >>!     while($self->_readline()){
> > >>!       #suck up the rest of the file
> > >>      }
> > >>    }
> > >>--- 248,266 ----
> > >>    }
> > >>
> > >>!   elsif($directive eq 'FASTA' or $directive =~ /^>(.+)/){
> > >>!     my $fasta_directive_id = $1 if $1;
> > >>      $self->warn("'##$directive' directive handling not yet implemented");
> > >>!     local $/ = '>';
> > >>!     while(my $read = $self->_readline()){
> > >>!        chomp $read;
> > >>!        my $fasta_id;
> > >>!        my @seqarray = split /\n/, $read;
> > >>!        if ($fasta_directive_id) {
> > >>!          $fasta_id = $fasta_directive_id;
> > >>!          $fasta_directive_id = '';
> > >>!        } else {
> > >>!          $fasta_id = shift @seqarray;
> > >>!        }
> > >>!        my $seq = join '', @seqarray;
> > >>      }
> > >>    }
> > >>***************
> > >>*** 357,363 ****
> > >>        );
> > >>
> > >>!       if ($strand eq '+') {
> > >>          $strand = 1;
> > >>!       } elsif ($strand eq '-') {
> > >>          $strand = -1;
> > >>        }
> > >>--- 368,374 ----
> > >>        );
> > >>
> > >>!       if ($strand && $strand eq '+') {
> > >>          $strand = 1;
> > >>!       } elsif ($strand && $strand eq '-') {
> > >>          $strand = -1;
> > >>        }
> > >>
> > >>_______________________________________________
> > >>Bioperl-guts-l mailing list
> > >>Bioperl-guts-l at portal.open-bio.org
> > >>http://portal.open-bio.org/mailman/listinfo/bioperl-guts-l
> > >
> > >_______________________________________________
> > >Bioperl-l mailing list
> > >Bioperl-l at portal.open-bio.org
> > >http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724

NOTE: Please copy Sandra Michelsen <michelse at cshl.edu> on
all emails regarding scheduling and other time-critical topics.