[EMBOSS] Does seqret have limitations ?

Barretto,Caroline,LAUSANNE,NRC/BAS Caroline.Barretto at rdls.nestle.com
Wed Dec 17 10:27:39 UTC 2003


Thank you very much, everybody. The problem was, as some of you suspected,
in the GenBank file...

Best wishes

Caroline.

From:    "Stefanie Lager" <stefanielager at fastmail.ca>
Sent by: owner-emboss at hgmp.mrc.ac.uk
Date:    17.12.03 07:05
To:      emboss at embnet.org
Cc:
Subject: RE: [EMBOSS] Does seqret have limitations ?

It sounds as if the problem is with a single sequence in the file. Try
removing the sequence it hangs on, or try splitting the original file
into parts to see whether there is a single sequence it hangs on. Other
programs can have problems with end-of-line characters, but this
doesn't sound like that.
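
If splitting by hand is tedious, something along these lines might help
narrow it down. This is only a rough Python sketch, not anything that
ships with EMBOSS: it assumes a GenBank flat file in which every record
starts with a LOCUS line and ends with a "//" line, and the function
names (iter_records, check_record, find_suspects) are made up for this
example. The checks are heuristics; for instance, a valid record without
an ORIGIN section would also be flagged.

```python
import sys


def iter_records(lines):
    """Yield (record_lines, terminated) for each '//'-delimited record."""
    record = []
    for line in lines:
        record.append(line)
        if line.startswith("//"):
            yield record, True
            record = []
    # Leftover non-blank text after the last '//' is an unterminated record.
    if any(l.strip() for l in record):
        yield record, False


def check_record(record, terminated):
    """Return a list of structural problems found in one record (heuristic)."""
    problems = []
    first = next((l for l in record if l.strip()), "")
    if not first.startswith("LOCUS"):
        problems.append("does not start with LOCUS")
    if sum(l.startswith("LOCUS") for l in record) > 1:
        problems.append("more than one LOCUS line (missing '//'?)")
    if not any(l.startswith("ORIGIN") for l in record):
        problems.append("no ORIGIN line")
    if not terminated:
        problems.append("no '//' terminator")
    return problems


def find_suspects(lines):
    """Return (record_number, first_line, problems) for each suspicious record."""
    suspects = []
    for number, (record, terminated) in enumerate(iter_records(lines), start=1):
        problems = check_record(record, terminated)
        if problems:
            first = next((l for l in record if l.strip()), "").rstrip()
            suspects.append((number, first, problems))
    return suspects


# Usage (hypothetical script name): python check_gbk.py file.gbk
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as handle:
        for number, first, problems in find_suspects(handle):
            print(f"record {number}: {first[:40]!r}: {', '.join(problems)}")
```

The record numbers it prints count from 1, so they can be compared with
the number of sequences seqret managed to write before it stopped.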

Stefanie

> Dear Simon,
>
> I don't receive any error from seqret; it simply stops, just as if
> it had finished correctly.
> The file is not bigger than 2 GB:
> $ du -sk file.gbk
> 74680   file.gbk
>
> Even with the cat command you sent me, I only get 17143 sequences
> written in GCG format (it is the same if I try to convert to
> FASTA format).
>
> $ grep -c "Check" file.gcg
> 17143
> $ grep -c "LOCUS" file.gbk
> 26045
>
> $ seqret file.gbk -osformat fasta -outseq test
> Reads and writes (returns) sequences
> $ grep -c ">" test
> 17143
>
> If anybody has an idea...
>
> Thanks a lot,
>
> Caroline.
>
>
>
> -----Original Message-----
> From: simon andrews (BI) [mailto:simon.andrews at bbsrc.ac.uk]
> Sent: Tuesday, 16 December 2003 17:47
> To: 'emboss at embnet.org'
> Subject: RE: [EMBOSS] Does seqret have limitations ?
>
>
>
> -----Original Message-----
> From: Barretto,Caroline,LAUSANNE,NRC/BAS
> [mailto:Caroline.Barretto at rdls.nestle.com]
> Sent: 16 December 2003 16:12
> To: emboss at embnet.org
> Subject: [EMBOSS] Does seqret have limitations ?
>
>
>> Dear all,
>>
>> Did anybody notice that the seqret program seems to
>> be limited by the number of sequences to convert ? I
>> am trying to convert 1 file containing 23000 genbank
>> sequences into GCG format.
>>
>> Do you have a suggestion for that ?
>
> Seqret is not limited by the number of sequences.  I routinely pass
> the whole of EMBL through seqret and it works fine.  What error do you
> get when seqret stops?  Could it just be that there is a malformed
> entry part way through your file?
>
> Is the file you are trying to convert larger than 2 GB?  If so, this
> could be the reason for the failure, rather than seqret being
> limited by the number of sequences.  In that case, though, I would
> have expected the failure to happen when the file was first opened
> and not after a certain number of sequences had passed through.
>
> If the problem is a large file you might be able to get round this
> by using a pipe to get information into seqret.  Try
>
> cat your_genbank_file.gb | seqret -filter -osf gcg > your_gcg_file.gcg
>
> This should work as long as your OS version of cat and your shell
> can handle large files.
>
> Hope this helps
