[EMBOSS] backtranseq

Andres Pinzon andrespinzon at gmail.com
Fri Dec 19 21:45:56 UTC 2008


Hi Magdy,
This is Andres. Please don't blame Peter for the script; I wrote it :-)

On Fri, Dec 19, 2008 at 3:58 PM, Magdy Alabady <malabady at gmail.com> wrote:
> Thanks Peter, it works well except that it produces so many files. The number of
> files is 4 or 5 times more than the number of sequences x2. For example, if
> I have 10 sequences, I would expect 21 files to be produced: 10 protein
> sequences, 10 backtranslation files, and one concatenated file. Is this
> correct?

Let's say you have one multiple FASTA file with 3 entries.
It will create 7 files: 3 single FASTA files (one for every entry in the
multiple FASTA file), 3 ".bt" files (one for each single FASTA file), and
1 file that concatenates these ".bt" files. (This is the way it works on
my machine.) Please take into account that if you run it several times in
the same place it will multiply your files, because the loop processes
every file in the directory, including the output of earlier runs.
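
To check the arithmetic for your own file, here is a minimal sketch that
counts the entries (lines starting with ">") and prints how many files a
single run should create: n single FASTA files, n ".bt" files and 1
concatenated file. The function names count_entries and expected_files
are made up for this example; they are not part of EMBOSS.

```shell
#!/bin/bash

# Count the entries in a multiple FASTA file.
# Every entry starts with a header line beginning with ">".
count_entries() {
    grep -c '^>' "$1"
}

# Files created by ONE run of script.sh:
# n single FASTA files + n ".bt" files + 1 concatenated file.
expected_files() {
    local n
    n=$(count_entries "$1")
    echo $(( 2 * n + 1 ))
}

# Usage: pass the multiple FASTA file as the first argument.
if [ -n "${1:-}" ]; then
    echo "entries:        $(count_entries "$1")"
    echo "expected files: $(expected_files "$1")"
fi
```

If you see many more files than this number, the directory most likely
still holds files from earlier runs.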


> One other thing, forgive me if it is a bad question: in the first line of the
> script, shouldn't "-outseq 1.fasta" be "-outseq $1.fasta"?

No, this option tells seqret how to name the output files. So the
first entry in the multiple FASTA file will correspond to the 1.fasta
file, the second one will correspond to 2.fasta, and so on.
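
Since the split files land in whatever directory you run from, one way to
keep repeated runs from interfering with each other is to do the work in
a fresh scratch directory. The following is only a sketch under some
assumptions: backtranslate_all is a made-up helper name, the seqret and
backtranseq calls are the same ones used in script.sh, and the loop makes
no assumption about how seqret names the split files (it simply takes
every file in the scratch directory, which starts out empty).

```shell
#!/bin/bash

# Sketch: split and back-translate inside a fresh scratch directory so that
# files from earlier runs are never picked up again.
# backtranslate_all is a hypothetical helper name, not part of EMBOSS.
backtranslate_all() {
    local input output workdir status
    # Absolute paths are needed because we change directory below.
    input="$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
    output="$(cd "$(dirname "$2")" && pwd)/$(basename "$2")"
    workdir="$(mktemp -d)" || return 1

    (
        cd "$workdir" || exit 1
        # Same seqret call as in script.sh: one file per entry.
        seqret -ossingle -sequence "$input" -outseq 1.fasta || exit 1
        # The directory started empty, so every file here is a split file.
        for f in *; do
            [ -f "$f" ] || continue
            echo "Processing: $f"
            backtranseq -sequence "$f" -outfile "$f.bt" || exit 1
        done
        # Concatenate all back-translations into the requested output file.
        cat ./*.bt > "$output"
    )
    status=$?

    # Remove the scratch directory whether or not the run succeeded.
    rm -rf "$workdir"
    return "$status"
}

# Run only when both arguments are given: input file, then output file.
if [ "$#" -eq 2 ]; then
    backtranslate_all "$1" "$2"
fi
```

Saved as, say, bt_all.sh, it would be run as
./bt_all.sh yourMultipleFastaFile yourMultipleFastaFile.backtranslated.fasta
and only the concatenated file is left behind.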
I hope this helps.

Please don't hesitate to contact me.

Best,
> thanks
>
> On Fri, Dec 19, 2008 at 10:56 AM, Andres Pinzon <andrespinzon at gmail.com>
> wrote:
>>
>> This is a not-too-elegant solution to your problem.
>> Paste the following code into a file called "script.sh":
>> ======================================
>>
>>
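>> #!/bin/bash
>>
>>        # Split the multiple FASTA file ($1) into one file per entry.
>>        seqret -ossingle -sequence "$1" -outseq 1.fasta
>>
>>        # Back-translate every file in the current directory except this script.
>>        for i in *; do
>>                if [ "$i" != "script.sh" ]; then
>>                        echo "Processing: $i"
>>                        backtranseq -sequence "$i" -outfile "$i.bt"
>>                fi
>>        done
>>
>>        # Concatenate all back-translations next to the input file.
>>        cat *.bt > "$1.backtranslated.fasta"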
>>
>> =================================================
>>
>> and run the script like this:
>> ==================================================
>> ./script.sh    ../yourMultipleFastaFile
>> ==================================================
>>
>>
>> This script will take your multiple FASTA file and create single
>> files from it. It will then run backtranseq on each of them and create
>> an output file with your results, named after the input file:
>> yourMultipleFastaFile.backtranslated.fasta
>>
>>
>> Hope it helps.
>>
>> regards,
>>
>>
>>
>> On Thu, Dec 18, 2008 at 6:20 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>> > Magdy Alabady wrote:
>> >>
>> >> Hello all,
>> >>
>> >> Can backtranseq run on an input file that contains several hundred
>> >> protein sequences in FASTA format? Can it do bulk back-translation?
>> >> Please tell me how to do so if it is possible.
>> >
>> > Backtranseq only processes a single sequence. You have several choices:
>> >
>> > 1. make a new version of backtranseq that can process multiple sequences
>> > (easy to do ... but you need to learn a little EMBOSS programming first
>> > :-)
>> >
>> > 2. put your sequence file in a directory and name it with anything other
>> > than a .fasta extension.
>> >
>> > use seqret -ossingle to convert your file into 100s of single sequence
>> > files
>> >
>> > run backtranseq on each of the individual files.
>> >
>> >
>> > I wonder ... do you really want 100s of backtranslations in a single
>> > file?
>> >
>> > regards,
>> >
>> > Peter Rice
>> >
>>
>
>
>
> --
> Magdy S. Alabady, PhD
> ------------------------------------------------------
> If A is a success in life, then A equals x plus y plus z. Work is x; y is
> play; and z is keeping your mouth shut. .....Albert Einstein
> -------------------------------------------------------------
>
>



-- 
Andrés Pinzón
http://bioinf.ibun.unal.edu.co/~apinzon/
Bioinformatics Center, Colombia EMBnet node
http://bioinf.ibun.unal.edu.co
Tel +57 3165000 ext 16961 Fax +571 3165415
Micology and Phytopathology Laboratory - Los Andes University.
http://bioinf.uniandes.edu.co
Tel +571 3394949 ext. 2768



