[EMBOSS] Does seqret have limitations ?

Barretto,Caroline,LAUSANNE,NRC/BAS Caroline.Barretto at rdls.nestle.com
Tue Dec 16 17:26:53 UTC 2003


Dear Simon,

I don't receive any error from seqret; it simply stops, as if it had
finished correctly.
The file is not bigger than 2 GB:
$ du -sk file.gbk
74680   file.gbk

Even with the cat command you sent me, I only get 17143 sequences written
in GCG format (the result is the same if I convert to FASTA format):

$ grep -c "Check" file.gcg
17143
$ grep -c "LOCUS" file.gbk
26045

$ seqret file.gbk -osformat fasta -outseq test
Reads and writes (returns) sequences
$ grep -c ">" test
17143
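
One way to narrow this down might be to find the first GenBank entry that
never reaches the output. The following is only a sketch: the function name
first_missing_locus is made up for illustration, and it assumes that the
FASTA ID written by seqret (the first word after ">") is the same as the
name on the LOCUS line, which may not hold for every entry.

```shell
# Hypothetical helper: print the first LOCUS name found in a GenBank file
# that has no matching ">" header in a FASTA file.
# Usage: first_missing_locus input.gbk output.fasta
first_missing_locus() {
    gbk=$1
    fasta=$2
    # The FASTA file is read first to collect the IDs, then the GenBank file.
    # "mode=..." arguments are awk variable assignments applied between files.
    # Assumption: the FASTA ID equals the second field of the LOCUS line.
    awk '
        mode == "fasta" && /^>/ { seen[substr($1, 2)] = 1; next }
        mode == "gbk" && /^LOCUS/ && !($2 in seen) { print $2; exit }
    ' mode=fasta "$fasta" mode=gbk "$gbk"
}
```

With the files above this would be called as "first_missing_locus file.gbk
test". If it prints a name, that entry and the one just before it are the
places to inspect by hand. Anchoring the patterns at the start of the line
(^LOCUS, ^>) also avoids counting lines that merely contain those strings.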

If anybody has an idea...

Thanks a lot,

Caroline.



-----Original Message-----
From: simon andrews (BI) [mailto:simon.andrews at bbsrc.ac.uk]
Sent: Tuesday, 16 December 2003 17:47
To: 'emboss at embnet.org'
Subject: RE: [EMBOSS] Does seqret have limitations ?



-----Original Message-----
From: Barretto,Caroline,LAUSANNE,NRC/BAS
[mailto:Caroline.Barretto at rdls.nestle.com] 
Sent: 16 December 2003 16:12
To: emboss at embnet.org
Subject: [EMBOSS] Does seqret have limitations ?


> Dear all,
> 
> Did anybody notice that the seqret program seems to 
> be limited in the number of sequences it can convert? I 
> am trying to convert one file containing 23000 GenBank 
> sequences into GCG format.
>
> Do you have a suggestion for that ?

Seqret is not limited by the number of sequences.  I routinely pass the whole of
EMBL through seqret and it works fine.  What error do you get when seqret
stops?  Could it just be that there is a malformed entry part way through
your file?
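
A quick structural check for one kind of malformed entry could look like
the sketch below. It only tests whether every LOCUS line is followed by a
"//" terminator before the next LOCUS line or the end of the file; other
problems inside an entry would not be detected. The function name
unterminated_records is invented for this example.

```shell
# Hypothetical helper: report GenBank records that are not closed by a
# "//" line before the next LOCUS line or the end of the file.
# Usage: unterminated_records input.gbk
unterminated_records() {
    awk '
        /^LOCUS/ {
            # A new record starts while the previous one is still open.
            if (in_record)
                print name " (line " start "): no // before next LOCUS"
            in_record = 1; name = $2; start = NR
            next
        }
        /^\/\// { in_record = 0 }   # record terminator
        END {
            if (in_record)
                print name " (line " start "): no // before end of file"
        }
    ' "$1"
}
```

No output means every record is terminated; any line printed names the
entry and the line number where it starts.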

Is the file you are trying to convert >2Gb in size?  If so this could be the
reason for the failure rather than seqret being limited by the number of
sequences.  In this case though I thought that the failure would happen when
the file was first opened and not after a certain number of sequences had
passed through.

If the problem is a large file you might be able to get round this by using
a pipe to get information into seqret.  Try

cat your_genbank_file.gb | seqret -filter -osf gcg > your_gcg_file.gcg

This should work as long as your OS version of cat and your shell can handle
large files.

Hope this helps

Simon.