[Bioperl-l] Bio::SeqIO Genbank + EMBL unquoted values

Nadeem Faruque faruque at ebi.ac.uk
Thu Nov 27 06:04:13 EST 2003


Prompted by a genome submitter who had used BioPerl, I wondered why he couldn't 
get BioPerl to write out unquoted evidence qualifier values.

Maybe I've got this wrong, but I think that the feature table writing functions 
are oversimplified on this point:-

In sub _print_EMBL_FTHelper (and sub _print_GenBank_FTHelper)
it appears to assume that only purely numeric qualifier values are 
unquoted:-
...
            elsif( $always_quote == 1 || $value !~ /^\d+$/ ) {
                my $pat = $value =~ /\s/ ? '\s|$' : '.|$';
                $self->_write_line_EMBL_regex("FT                   ",
                                              "FT                   ",
                                              "/$tag=\"$value\"",$pat,80);
            }
            else {
                $self->_write_line_EMBL_regex("FT                   ",
                                              "FT                   ",
                                              "/$tag=$value",'.|$',80); #'
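
To illustrate the effect of that test in isolation, here is a minimal sketch that copies only the condition from the excerpt. The name needs_quotes() is hypothetical and is not a Bio::SeqIO method; the rest of the helper (line wrapping, the elided first branch) is left out.

```perl
use strict;
use warnings;

# Mirrors the condition from the excerpt: quote unless the value is all digits.
# needs_quotes() is a hypothetical name used for illustration only.
sub needs_quotes {
    my ($value, $always_quote) = @_;
    return ($always_quote == 1 || $value !~ /^\d+$/) ? 1 : 0;
}

# Only "12" comes out unquoted; the three single tokens are all quoted.
for my $value (qw(12 experimental m5c 1e)) {
    printf "%-14s %s\n", $value, needs_quotes($value, 0) ? 'quoted' : 'unquoted';
}
```

Under this condition a value such as experimental is always written with quotes, which matches what the submitter saw.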



Each of the following qualifiers accepts a non-numeric single token that should 
be unquoted:-
  /direction=left, right, or both
  /estimated_length=unknown though an actual number will be accepted next year
  /evidence=experimental or not_experimental
  /label=*** single token used to permanently tag a feature
         (for use within EMBL, eg for joins that span entries.
          External use not advised)
  /mod_base=m5c for example, the abbreviation for a modified nucleotide base
  /number=1e for example, a single token used as an exon/intron number
         (should be a number but exon numbering is more chaotic than that)
  /rpt_type=tandem, inverted, flanking, terminal, direct, dispersed, and other
  /rpt_unit can either accept quoted text (/rpt_unit="aagggc" )
          or a location value (/rpt_unit=202..245 )
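
One way the quoting decision could take the list above into account is a lookup table keyed on the qualifier name. This is only a sketch under stated assumptions: format_qualifier() and %UNQUOTED_TOKEN are hypothetical names, not existing BioPerl code; the $always_quote flag and line wrapping are ignored; and the /rpt_unit test only recognises a simple start..end location.

```perl
use strict;
use warnings;

# Qualifiers from the list above whose values are single unquoted tokens.
# Illustrative table only; it is not part of Bio::SeqIO.
my %UNQUOTED_TOKEN = map { $_ => 1 } qw(
    direction estimated_length evidence label mod_base number rpt_type
);

# Hypothetical helper: returns one qualifier as feature-table text.
sub format_qualifier {
    my ($tag, $value) = @_;

    # Purely numeric values stay unquoted, as in the existing check.
    return "/$tag=$value" if $value =~ /^\d+$/;

    # Listed qualifiers take a bare token when the value has no whitespace.
    return "/$tag=$value" if $UNQUOTED_TOKEN{$tag} && $value =~ /^\S+$/;

    # Assumption: /rpt_unit is unquoted only for a simple start..end location.
    return "/$tag=$value" if $tag eq 'rpt_unit' && $value =~ /^\d+\.\.\d+$/;

    # Everything else is written as quoted text.
    return qq{/$tag="$value"};
}

print format_qualifier('evidence', 'experimental'), "\n";  # /evidence=experimental
print format_qualifier('rpt_unit', 'aagggc'), "\n";        # /rpt_unit="aagggc"
print format_qualifier('rpt_unit', '202..245'), "\n";      # /rpt_unit=202..245
```

A table keeps the exceptions in one place, so it could be shared by the EMBL and GenBank writers if the two helpers were changed in the same way.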

NB The other qualifiers that are unusual are:-
  /anticodon=(pos:***,aa:***)
  /citation=[***] - the number of the citation
  /codon=(seq:"***", aa:***)
  /cons_splice=(5'site:***, 3'site:***)
  /transl_except=(pos:***,aa:***)
  /usedin=***:*** - like /label, this shouldn't really be created externally.


Further details are available in the feature table document or at 
<http://www.ebi.ac.uk/embl/WebFeat/index.html>

Nadeem

-- 
S.M. Nadeem N. Faruque
EMBL Nucleotide Database Curation Team
EMBL Outstation
Tel: +44 1799 494611                     Fax: +44 1799 494472
The European Bioinformatics Institute    URL: http://www.ebi.ac.uk/
Email for data submissions: datasubs at ebi.ac.uk
Email for updates: update at ebi.ac.uk
=============================================================================