[Bioperl-l] Question about embl format
Lincoln Stein
lstein at cshl.org
Thu Apr 17 18:56:13 EDT 2003
OK, so what should I do about primary_tags longer than 15 characters, given
that BioPerl doesn't enforce a size limit on primary_tags? If I implement
truncation at the write_seq level, then we'll lose round-tripping.
Oh well. I'll just have to do it unless anyone sees a way around it.
Lincoln
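
To make the round-tripping concern concrete, here is a minimal sketch of what plain truncation does to an over-long feature key. It is standalone Python rather than BioPerl code: `truncate_key` is a hypothetical helper, not part of Bio::SeqIO::embl, and "Transposon_inserted" is an invented tag used only to show a collision.

```python
EMBL_KEY_MAX = 15  # feature key limit from the EMBL feature table definition


def truncate_key(primary_tag, limit=EMBL_KEY_MAX):
    """Cut a feature key down to the limit (hypothetical helper, not BioPerl API)."""
    return primary_tag[:limit]


original = "Transposon_insertion"   # 20 characters
written = truncate_key(original)    # what a truncating writer would emit

print(written)                      # Transposon_inse
print(written == original)          # False: the full name is not in the file any more

# Two distinct tags can collapse to the same key once truncated
# ("Transposon_inserted" is an invented example).
print(truncate_key("Transposon_insertion") == truncate_key("Transposon_inserted"))  # True
```

Keys at or under the limit pass through unchanged, so only the over-long tags lose information when the file is read back.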
On Thursday 17 April 2003 11:59 am, Ewan Birney wrote:
> On Thu, 17 Apr 2003, Lincoln Stein wrote:
> > Hello,
> >
> > The "sequence dumper" plugin for the Generic Genome Browser has been
> > crashing when making an EMBL dump of a particular region of the worm
> > genome. The issue is a "Transposon_insertion" feature, which exceeds the
> > 15 character limit for EMBL feature tags. If I remove the
> > Bio::SeqIO::embl check for this limit, I get an output that looks like
> > this:
> >
> > ...
> > FT   Transposon_insertion complement(13204595..13204596)
> > FT                   /score=""
> > FT                   /group="cxP4108"
> > FT                   /id=7726466
> > FT                   /method="Transposon_insertion"
> > FT                   /source="Allele"
> > FT                   /phase=""
> > FT   repeat          13204572..13204602
> > FT                   /score=80
> > FT                   /group=""
> > FT                   /notes="loop 283"
> > FT                   /id=7775180
> > FT                   /method="repeat"
> > FT                   /source="inverted"
> > FT                   /phase=""
> > FT                   /note="score=80"
> > ...
> >
> > My question: is this acceptable EMBL format? If not, I will have to
> > truncate feature type names at 15 characters, but this is going to
> > lose information.
>
> Looks like the definition says no more than 15 characters for feature keys:
>
> Feature table components, including feature keys, qualifiers, accession
> numbers, database name abbreviations, feature labels, and location
> operators, are all named following the same conventions. Component names
> may be no more than 20 characters long (Feature keys 15, Feature
> qualifiers 20) and must contain at least one letter. While case should
> not be regarded as significant in comparing feature labels ('Prot1' and
> 'pROT1' are the same), the databanks will preserve the case of labels as
> originally annotated. The following characters are permitted to occur in
> feature table component names:
>
>
>
> From: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
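
The two rules stated in the quoted passage (a length limit of 15 for feature keys and 20 for qualifiers, and at least one letter) can be checked with a few lines of code. This is a minimal standalone Python sketch, not the check that Bio::SeqIO::embl performs. `check_name` and `MAX_LEN` are hypothetical names, and the permitted-character rule is left out because the character list did not survive in the quoted text.

```python
import re

# Limits taken from the quoted passage: feature keys 15, feature qualifiers 20.
MAX_LEN = {"feature_key": 15, "qualifier": 20}


def check_name(name, kind="feature_key"):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    limit = MAX_LEN[kind]
    if len(name) > limit:
        problems.append(
            "%r is %d characters; the limit for a %s is %d"
            % (name, len(name), kind, limit)
        )
    if not re.search(r"[A-Za-z]", name):
        problems.append("%r contains no letter" % name)
    return problems


print(check_name("repeat"))                                   # []
print(check_name("Transposon_insertion"))                     # one length problem
print(check_name("Transposon_insertion", kind="qualifier"))   # [] (exactly 20 is allowed)
```

Returning a list of problems instead of raising lets a caller decide whether to truncate, skip the feature, or stop, which is the choice under discussion in this thread.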
>
> > Lincoln
>
> -----------------------------------------------------------------
> Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
> <birney at ebi.ac.uk>.
> -----------------------------------------------------------------
--
========================================================================
Lincoln D. Stein Cold Spring Harbor Laboratory
lstein at cshl.org Cold Spring Harbor, NY
========================================================================