[Bioperl-l] The Feature changes which have broken compatibility
Chris Mungall
cjm at fruitfly.org
Wed Feb 2 17:52:24 EST 2005
On Wed, 2 Feb 2005, Allen Day wrote:
> > I'm not sure how many people are aware how this works - here is what
> > happens under the hood:
> >
> > 1) Bio::FeatureIO::gff is initialized. This entails establishing a
> > connection to sourceforge to download the sequence ontology. This of
> > course will not work if you are offline. Even if you are online, it
> > doesn't seem to work for me and is of course dependent on the vagaries of
> > whether sourceforge and its mirrors are working. Even if this
> > all works, there is an initial start-up lag which may be unacceptable to
> > some applications. Also, not everyone using bioperl is in a country with
> > fast internet access and a local-ish sourceforge mirror.
> >
> > In addition, it hardcodes metadata about the ontology in bioperl (see
> > Bio::Ontology::DocumentRegistry) which is asking for trouble.
> >
> > In addition, it downloads the ontology in a legacy deprecated format,
> > because that's all bioperl currently supports. Also asking for trouble
> > further down the line.
> >
> > Why is it doing all this? Purely in order to check that the feature types
> > provided in the GFF file are valid SOFA terms. Look, I already know all
> > the GFF3 files I want to parse have valid SOFA types. If I want to check,
> > I'll do this myself thanks, I don't want bioperl to secretly do it for me
> > in a hokey way that requires me being online and in the USA, every single
> > time I parse a file.
> >
> > In fact, there is already a script for validating a GFF3 file, in the SO
> > software repository (which uses Bio::Tools::GFF) which does a much more
> > thorough job, checking feature parentage too.
> >
> > What happened to modularity? You know, parsing in a parser, verification
> > in a verifier.
> >
> > 2) it starts parsing features, assigning Bio::Ontology::Term objects to
> > each feature (the type). This entails having Graph::Directed, which is
> > what Jason is alluding to. Not that bad in itself, but unnecessary for
> > the majority of apps that just want to parse GFF.
> >
> > Is it just me that thinks this is madness? Can someone please make it
> > stop?
>
> Correct, but this behavior is disabled by default. From the
> FeatureIO/gff.pm POD:
>
>   my $featureOut = Bio::FeatureIO->new(
>       -format         => 'gff',
>       -version        => 3,
>       -fh             => \*STDOUT,
>       -validate_terms => 1,  # boolean: validate ontology terms online?
>                              # default 0 (false)
>   );
>
> If you don't turn this on, it merely creates a
> Bio::Annotation::OntologyTerm object with the identifier or term name from
> the GFF file -- no validation attempted.
This doesn't seem to be the case - are you sure you have this code checked
in?
> Furthermore, if you do want to validate against the SO/SOFA ontologies,
> but you don't want to rely on the live ontologies on the web, you can
> parse SO/SOFA from local files (in deprecated format, admittedly, but this
> isn't my doing) first. That fills the Bio::Ontology cache so network
> queries don't happen.
Even once you check in your changes so this is no longer the default
behaviour, I still strongly believe that this should be moved out of the
parser altogether.
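
As a rough illustration of "parsing in a parser, verification in a verifier", here is a minimal sketch in plain Perl. It is not BioPerl code: parse_gff3_line and invalid_types are hypothetical helper names, and the allowed-type list is a hand-written stand-in for a locally loaded copy of SOFA. The parsing step never touches an ontology or the network; the check is a separate pass the caller runs only if wanted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parser (hypothetical helper): turns one GFF3 line into a plain hash.
# No ontology lookups, no network access.
sub parse_gff3_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^#/ or $line !~ /\S/;   # skip directives, comments, blank lines
    my @col = split /\t/, $line, -1;
    return unless @col == 9;                    # GFF3 has nine tab-separated columns
    my %feature;
    @feature{qw(seqid source type start end score strand phase attributes)} = @col;
    return \%feature;
}

# Verifier (hypothetical helper): an optional, separate pass.
# $allowed is a hash ref of type names the caller trusts; returns each
# type that is not in it, once.
sub invalid_types {
    my ($features, $allowed) = @_;
    my %seen;
    return grep { !$allowed->{$_} && !$seen{$_}++ } map { $_->{type} } @$features;
}

my @lines = (
    "##gff-version 3\n",
    "chr1\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene1\n",
    "chr1\t.\tmRNA\t1050\t9000\t.\t+\t.\tID=mrna1;Parent=gene1\n",
    "chr1\t.\twidget\t1200\t1500\t.\t+\t.\tID=w1\n",
);

# Step 1: parse. This works offline and has no start-up lag.
my @features = map { parse_gff3_line($_) } @lines;

# Step 2 (optional): verify against a locally supplied term list.
# Assumption: a stand-in list, not the real SOFA contents.
my %allowed = map { $_ => 1 } qw(gene mRNA exon CDS);
my @bad = invalid_types(\@features, \%allowed);

print "parsed ", scalar(@features), " features\n";
print "unrecognized types: @bad\n" if @bad;
```

Anyone who already knows their files are valid simply skips the second step; a stricter check (feature parentage, say) can replace invalid_types without touching the parser.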
> -Allen
>