[EMBOSS] Reply: protein sequence format question
Tao Song
tao.song at calibrant.com
Wed Sep 6 13:14:02 UTC 2006
Hi David,
Thanks so much for your help!
Regards,
Tao
----- Original Message -----
From: <David.Bauer at SCHERING.DE>
To: "Tao Song" <tao.song at calibrant.com>
Cc: <emboss at lists.open-bio.org>; <emboss-bounces at lists.open-bio.org>
Sent: Wednesday, September 06, 2006 1:45 AM
Subject: Reply: [EMBOSS] protein sequence format question
>
> Hi,
>
> The file you are trying to use is a MySQL dump from the BioMart database,
> so it is not in a format that EMBOSS can read.
> UniProt is also available in other formats, however.
> Please have a look at the directory:
> ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/
>
> There you will find UniProt in FASTA and EMBL (.dat.gz) format, both of
> which can be used with EMBOSS.
> You can also index these files with the EMBOSS tools dbxfasta or dbxflat
> so that you can efficiently retrieve individual sequences from the database.
>
> You can find more information about the sequence formats supported by
> EMBOSS on the EMBOSS documentation pages:
> http://emboss.sourceforge.net/docs/themes/SequenceFormats.html
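As a rough sketch only, and not something suggested in the thread itself: if downloading the FASTA release is not an option, a tab-delimited dump could in principle be rewritten as FASTA, which EMBOSS does read. The column layout of the BioMart file is not described here, so the example assumes the identifier is in the first column and the sequence in the last one. The function names and the `id_col`, `seq_col` and `width` parameters are hypothetical.

```python
import gzip


def tsv_to_fasta(lines, id_col=0, seq_col=-1, width=60):
    """Yield FASTA text lines from tab-delimited records.

    ASSUMPTION: the identifier is in column `id_col` and the sequence in
    column `seq_col`. Check the real dump before relying on these defaults.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        seq_id = fields[id_col].strip()
        seq = fields[seq_col].strip()
        # MySQL tab-separated dumps write NULL as \N; skip such rows.
        if not seq_id or not seq or seq == "\\N":
            continue
        yield ">" + seq_id
        # Wrap the sequence at a fixed line width.
        for i in range(0, len(seq), width):
            yield seq[i:i + width]


def convert(in_path, out_path):
    """Read a gzip-compressed tab-delimited file and write a FASTA file."""
    with gzip.open(in_path, "rt") as src, open(out_path, "w") as dst:
        for out_line in tsv_to_fasta(src):
            dst.write(out_line + "\n")
```

The FASTA and EMBL files in the UniProt release directory remain the simpler route, since they need no conversion.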
>
>
> HTH,
> David.
>
> emboss-bounces at lists.open-bio.org wrote on 06/09/2006 01:54:24:
>
>> Hi,
>>
>> I am trying to use the DIGEST function in EMBOSS for a tryptic digest
>> of protein sequences. I downloaded the sequence file from the following
>> link:
>> ftp://ftp.ebi.ac.uk/pub/databases/biomart/current/uniprot_mart_17/uniprot_sequence__sequence__main.txt.gz
>>
>> It is a tab-delimited flat file that includes all protein sequences. It
>> seems that it is not in any of the formats EMBOSS supports. Is it still
>> possible to use the DIGEST function?
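To illustrate what a tryptic digest means here, the following is a minimal sketch of the textbook trypsin rule: cleave after lysine (K) or arginine (R), except when the next residue is proline (P). It is not the EMBOSS DIGEST implementation, which offers further options. The function name and the `min_length` parameter are hypothetical.

```python
import re


def tryptic_digest(sequence, min_length=1):
    """Split a protein sequence into peptides using a simplified trypsin rule.

    Cleaves after K or R unless the following residue is P.
    Requires Python 3.7+, where re.split accepts zero-width matches.
    """
    # Remove whitespace and normalise to upper case.
    seq = "".join(sequence.split()).upper()
    # Zero-width split point: preceded by K or R, not followed by P.
    peptides = re.split(r"(?<=[KR])(?!P)", seq)
    # Drop the empty trailing piece and any peptides shorter than min_length.
    return [p for p in peptides if len(p) >= min_length]


print(tryptic_digest("AKRPGKT"))  # ['AK', 'RPGK', 'T']
```

The R in "RPGK" is not cleaved because it is followed by P, which is why the second peptide spans four residues.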
>
>
>