pmr at ebi.ac.uk
Fri Jan 10 14:36:53 UTC 2003
> I have a single file containing 164 sequences in embl format
> retrieved by SRS. I want to split all the entries in distinct files. I
> have tried to use "seqretsplit", but it does not report FT lines. For example:
> My entry retrieved by means of SRS is:
> ID AY045754_4; parent: AY045754
> AC AY045754;
> FT rRNA 610. .772
> FT /product="5.8S ribosomal RNA"
This must be a GCG format database, indexed in SRS ... the ".." in your
feature locations has a gap in it (". .").
Ideally SRS would fix that ... but EMBOSS could cope with some small
changes. Anyway, an SRS fix is non-trivial because it is reporting
exactly the text that GCG stores.
> Does anyone know of an application or a script that allows me to split
> sequences (embl format) into different files without losing FT lines?
Simplest in your case would be a script that changes ". ." to ".."
before passing the data to EMBOSS.
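A minimal sketch of such a script, in Python. It assumes the gap only
ever appears between the two dots of a range, with a digit before it and
a digit (optionally preceded by "<" or ">") after it, and it only touches
lines that start with "FT". The function name fix_ft_line and the file
arguments are made up for illustration; adjust the pattern if your
entries show the gap in other forms.

```python
import re
import sys

# Matches a range operator broken by whitespace, e.g. "610. .772".
# Requiring a digit before the gap and a digit (optionally preceded by
# "<" or ">") after it is an assumption, meant to leave free text in
# qualifier values alone.
BROKEN_RANGE = re.compile(r"(\d)\.[ \t]+\.([<>]?\d)")


def fix_ft_line(line):
    """Rejoin '. .' into '..' on FT lines; return other lines unchanged."""
    if line.startswith("FT"):
        return BROKEN_RANGE.sub(r"\1..\2", line)
    return line


def main(in_path, out_path):
    # Copy the input file to the output file, repairing FT lines on the way.
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(fix_ft_line(line))


# Only runs when called with an input and an output file name.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Run it on the SRS output first, then give the repaired file to
seqretsplit as before.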
Meanwhile, I will take a look at fixing this for a future release.