[EMBOSS] Antwort: protein sequence format question

Tao Song tao.song at calibrant.com
Wed Sep 6 13:14:02 UTC 2006


Hi David,

     Thanks so much for your help!

     Regards,

     Tao

----- Original Message ----- 
From: <David.Bauer at SCHERING.DE>
To: "Tao Song" <tao.song at calibrant.com>
Cc: <emboss at lists.open-bio.org>; <emboss-bounces at lists.open-bio.org>
Sent: Wednesday, September 06, 2006 1:45 AM
Subject: Antwort: [EMBOSS] protein sequence format question


>
> Hi,
>
> The file you are trying to use is a MySQL dump from the BioMart database,
> so it is not in a format that EMBOSS can read.
> UniProt is, however, also available in other formats.
> Please have a look at the directory:
> ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/
>
> There you will find UniProt in FASTA and EMBL (.dat.gz) format, both of
> which can be used with EMBOSS.
> You can also index these files with the EMBOSS tools dbxfasta or dbxflat
> so that you can efficiently retrieve individual sequences from the database.
>
> More information about the sequence formats supported by EMBOSS can be
> found on the EMBOSS documentation pages:
> http://emboss.sourceforge.net/docs/themes/SequenceFormats.html
>
>
> HTH,
> David.
>
> emboss-bounces at lists.open-bio.org wrote on 06/09/2006 01:54:24:
>
>> Hi,
>>
>>       I am trying to use the DIGEST program in EMBOSS for a tryptic digest
>> of protein sequences. I downloaded the sequence file from the following
>> link:
>> ftp://ftp.ebi.ac.uk/pub/databases/biomart/current/uniprot_mart_17/uniprot_sequence__sequence__main.txt.gz
>>
>>      It is a tab-delimited flat file that includes all protein
>> sequences. It does not seem to be in any of the formats EMBOSS supports.
>> Is it still possible to use the DIGEST program with it?
>
>
> 
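
If downloading the FASTA or EMBL release is not an option, a tab-delimited file
like the one described in the thread could in principle be converted to FASTA
first. The sketch below is only an illustration: the column layout of the
BioMart dump is not given in the thread, so the column positions (id_col,
seq_col), the function names, and the file names are all assumptions that
would need to be checked against the actual file.

```python
import gzip


def tab_to_fasta(lines, id_col=0, seq_col=1, width=60):
    """Yield FASTA records (as strings) from tab-delimited text lines.

    id_col and seq_col are ASSUMED column positions; inspect the real
    file and adjust them before relying on the output.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(id_col, seq_col):
            continue  # skip blank or short lines
        seq_id = fields[id_col].strip()
        seq = fields[seq_col].strip()
        if not seq_id or not seq:
            continue  # skip rows without an identifier or a sequence
        # Wrap the sequence to fixed-width lines, as is usual in FASTA files.
        chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
        yield ">" + seq_id + "\n" + "\n".join(chunks) + "\n"


def convert_file(in_path, out_path, id_col=0, seq_col=1):
    """Convert a gzipped tab-delimited file to an uncompressed FASTA file."""
    with gzip.open(in_path, "rt") as src, open(out_path, "w") as dst:
        for record in tab_to_fasta(src, id_col, seq_col):
            dst.write(record)


# Example call (hypothetical file names):
# convert_file("uniprot_sequence__sequence__main.txt.gz", "uniprot.fasta")
```

The resulting FASTA file should then be readable by EMBOSS programs such as
digest, but downloading UniProt in a supported format, as suggested above,
remains the simpler route.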
