[EMBOSS] Does seqret have limitations ?
David.Bauer at SCHERING.DE
Wed Dec 17 07:58:36 UTC 2003
Another explanation could be that the GenBank file contains Contig type
entries.
Contig GenBank entries do not contain any sequence. They contain only
references to other sequence entries.
Seqret cannot handle this type of virtual sequence entry.
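
A quick way to check for this is to count the entry keywords in the file.
The following is only a sketch: it assumes that LOCUS, ORIGIN and CONTIG
start in column 1, as in standard GenBank flat files, and
count_gbk_entries is a made-up helper name, not part of EMBOSS.

```shell
# Count entries (LOCUS), entries with sequence data (ORIGIN) and
# entries that only reference other entries (CONTIG).
# Assumption: all three keywords start in column 1.
count_gbk_entries() {
    gbk="$1"   # path to a GenBank flat file, e.g. file.gbk
    printf 'LOCUS:  %s\n' "$(grep -c '^LOCUS' "$gbk")"
    printf 'ORIGIN: %s\n' "$(grep -c '^ORIGIN' "$gbk")"
    printf 'CONTIG: %s\n' "$(grep -c '^CONTIG' "$gbk")"
}
```

Called as "count_gbk_entries file.gbk". If the ORIGIN count is close to the
number of sequences seqret wrote and the CONTIG count accounts for the
rest, Contig entries are a likely explanation.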
David.
From: "Stefanie Lager" <stefanielager at fastmail.ca>
Sent by: owner-emboss at hgmp.mrc.ac.uk
Date: 17.12.03 07:05
To: emboss at embnet.org
Cc:
Subject: RE: [EMBOSS] Does seqret have limitations ?
It sounds as if the problem is with a single sequence in the file. Try
removing the sequence it hangs on, or try splitting the original file
into parts to see whether there is a single sequence it hangs on. Other
programs can have problems with end-of-line characters, but this
doesn't sound like that.
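
One possible way to do the splitting is with awk. This is only a sketch:
it assumes every entry ends with a line containing just "//", and
split_gbk, the chunk size and the "part" prefix are made-up names for
illustration.

```shell
# Split a GenBank flat file into chunks of at most N entries each.
# Assumption: each entry ends with a line containing only "//".
split_gbk() {
    gbk="$1"; per_chunk="$2"; prefix="$3"
    awk -v n="$per_chunk" -v prefix="$prefix" '
        BEGIN { chunk = 1; count = 0 }
        {
            # Every line goes to the current chunk file.
            out = sprintf("%s.%04d.gbk", prefix, chunk)
            print > out
        }
        /^\/\/[ \t\r]*$/ {
            # End-of-entry marker: start a new chunk after n entries.
            count++
            if (count == n) { close(out); chunk++; count = 0 }
        }
    ' "$gbk"
}

# Example use (not run here): convert each chunk and compare the counts.
#   split_gbk file.gbk 1000 part
#   for f in part.*.gbk; do
#       seqret "$f" -osformat fasta -outseq "$f.fasta"
#       echo "$f: $(grep -c '^LOCUS' "$f") in, $(grep -c '>' "$f.fasta") out"
#   done
```

Any chunk where the two counts differ can be split again with a smaller
chunk size until the problem entry is isolated.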
Stefanie
> Dear Simon,
>
> I don't receive any error from seqret, it simply stops just as if
> it was correctly finished.
> The file is not bigger than 2Gb:
> $ du -sk file.gbk
> 74680 file.gbk
>
> Even with the cat command you sent me, I only get 17143 sequences
> written in GCG format. (It is the same if I try to convert to
> fasta format.)
>
> $ grep -c "Check" file.gcg
> 17143
> $ grep -c "LOCUS" file.gbk
> 26045
>
> $ seqret file.gbk -osformat fasta -outseq test
> Reads and writes (returns) sequences
> $ grep -c ">" test
> 17143
>
> If anybody has an idea...
>
> Thanks a lot,
>
> Caroline.
>
>
>
> -----Original Message-----
> From: simon andrews (BI) [mailto:simon.andrews at bbsrc.ac.uk]
> Sent: Tuesday, 16 December 2003 17:47
> To: 'emboss at embnet.org'
> Subject: RE: [EMBOSS] Does seqret have limitations ?
>
>
>
> -----Original Message-----
> From: Barretto,Caroline,LAUSANNE,NRC/BAS
> [mailto:Caroline.Barretto at rdls.nestle.com]
> Sent: 16 December 2003 16:12
> To: emboss at embnet.org
> Subject: [EMBOSS] Does seqret have limitations ?
>
>
>> Dear all,
>>
>> Did anybody notice that the seqret program seems to
>> be limited by the number of sequences to convert ? I
>> am trying to convert 1 file containing 23000 genbank
>> sequences into GCG format.
>>
>> Do you have a suggestion for that ?
>
> Seqret is not limited by the number of sequences. I routinely pass the
> whole of EMBL through seqret and it works fine. What error do you
> get when seqret stops? Could it just be that there is a malformed
> entry part way through your file?
>
> Is the file you are trying to convert >2Gb in size? If so this
> could be the reason for the failure rather than seqret being
> limited by the number of sequences. In this case though I thought
> that the failure would happen when the file was first opened and
> not after a certain number of sequences had passed through.
>
> If the problem is a large file you might be able to get round this
> by using a pipe to get information into seqret. Try
>
> cat your_genbank_file.gb | seqret -filter -osf gcg > your_gcg_file.gcg
>
> This should work as long as your OS version of cat and your shell
> can handle large files.
>
> Hope this helps