ACD file and emboss.default file syntax

Guy Bottu gbottu at ben.vub.ac.be
Fri Feb 21 16:18:14 UTC 2003


from : BEN

> 
> I am cleaning up the parsing of both ACD files and the emboss.default 
> files. This includes adding diagnostic messages to say what problems 
> were found and to report the line number (and filename).
> 
>
> 
> There are also differences in the definitions of comments. In ACD files 
> any text after a "#" is ignored. In emboss.default comments must start 
> at the beginning of the line. This seems preferable as occasionally a 
> "#" character could be useful in a definition.

I agree with that.
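
A minimal sketch of the two comment rules described above, to show why the column-one rule is safer. The function names and the sample line are hypothetical and are not taken from the EMBOSS sources.

```python
def strip_acd_comment(line: str) -> str:
    """ACD-style rule as described above: drop everything from the first '#'."""
    return line.split("#", 1)[0].rstrip()


def strip_default_comment(line: str) -> str:
    """emboss.default-style rule: only a '#' in column one marks a comment."""
    return "" if line.startswith("#") else line.rstrip()


# Hypothetical definition whose value happens to contain a '#'.
sample = "SET example_path /tmp/run#1"

print(strip_acd_comment(sample))      # the value is cut at the '#'
print(strip_default_comment(sample))  # the value survives intact
```

With the first rule a "#" inside a value silently truncates the definition; with the second it is kept, at the cost of not allowing trailing comments.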

By the way : the file eprimer3.acd contains : 
help: "The maximum allowed melting temperature of the amplicon. Product Tm
is calculated using the formula from Bolton and McCarthy, PNAS 84:1390 (1962)
as presented in Sambrook, Fritsch and Maniatis, Molecular Cloning, p 11.46
(1989, CSHL Press). \ Tm = 81.5 + 16.6(log10[Na+]) + .41*(%GC) - 600/length \
...
The [Na+] turned out to be "toxic": I had to replace it with {Na+}. Maybe the
parser could be made to distinguish [] characters that are part of the syntax
from those that are part of a definition.
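
One way this could work, as a rough sketch only: track whether the scanner is inside a quoted string and count brackets only outside of it. The function name and the simplified one-line definition below are assumptions for illustration and do not reflect how the actual ACD parser is written.

```python
def find_closing_bracket(text: str, start: int) -> int:
    """Return the index of the ']' matching the '[' at text[start].

    Brackets inside single- or double-quoted strings are treated as
    plain text. Returns -1 if no matching bracket is found.
    """
    if text[start] != "[":
        raise ValueError("start must point at '['")
    depth = 0
    quote = None  # the quote character currently open, if any
    for i in range(start, len(text)):
        ch = text[i]
        if quote:
            if ch == quote:
                quote = None      # end of the quoted string
        elif ch in "\"'":
            quote = ch            # start of a quoted string
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return i
    return -1


# Simplified, hypothetical definition with brackets inside the help text.
definition = 'float: maxtm [ help: "Tm = 81.5 + 16.6(log10[Na+])" default: 85.0 ]'
end = find_closing_bracket(definition, definition.index("["))
print(definition[end:])
```

This sketch does not handle escaped quotes or apostrophes inside a double-quoted help text; a real parser would need a rule for those.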
> 
> Extra questions are:
> 
>
> 7. Should the database (and any other emboss.default) attribute names be 
> abbreviated (see question 5)?
> 
To allow both "swissprot" and "sw" as alternative names, I currently duplicate
the definition. Allowing abbreviated database names could be a solution, but
perhaps not a good one: it might be confusing. And what about "imgtmhc" / "mhc"?
A suggestion is to add an "altname" attribute, so that you could have:

DB unannotated [ type: N  comment: 'EMBL unannotated/unclassified'
    altname: unc,un,unclassified  .....
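
As a rough sketch of how such an attribute could be resolved (the "altname" attribute is only a suggestion here, and the function name and the dictionary layout are assumptions for illustration):

```python
def build_name_index(databases: dict[str, dict]) -> dict[str, str]:
    """Map every database name and every altname to the primary name."""
    index: dict[str, str] = {}
    for name, attrs in databases.items():
        # altname is taken to be a comma-separated list, as in the example above.
        aliases = [a.strip() for a in attrs.get("altname", "").split(",") if a.strip()]
        for key in [name, *aliases]:
            key = key.lower()
            if key in index and index[key] != name:
                raise ValueError(
                    f"name '{key}' is claimed by both {index[key]} and {name}"
                )
            index[key] = name
    return index


# Hypothetical in-memory form of two DB definitions.
databases = {
    "unannotated": {"type": "N", "altname": "unc,un,unclassified"},
    "swissprot": {"type": "P", "altname": "sw"},
}
index = build_name_index(databases)
print(index["un"])
print(index["sw"])
```

Unlike automatic abbreviation, an explicit list lets a name such as "mhc" point to "imgtmhc" even though it is not a prefix, and a clash between two databases is reported instead of being resolved silently.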
    
Other things on "wish list" :

- to allow the "nullok" attribute for all objects rather than just some.
  e.g. whether a program needs an input sequence can depend on the setting
  of other parameters. A seq object, however, always needs input, so the only
  "hack" for now is to write a silly default value in the ACD file
  
- extend input/output in several formats, as available for sequence/feature/
  alignment/structure, to other types of data: symbol comparison tables (GCG,
  BLAST, SIM, ...), codon usage tables (CUTG, GCG, ...)
  
	Sincerely,
	Guy Bottu



    
    


