[Bioperl-l] extract sequences and save into files by genes

yang liu yang.liu0508 at gmail.com
Sat Feb 25 06:52:05 UTC 2012


Dear colleagues,

I have multiple files, each named after a species. Each file contains ca. 100
different genes. I want to extract the sequences and save them by gene, one
output file per gene. In each output file, the header of each sequence would
be the species name instead of the gene name. How should I do this?

The input files look like this (file names: Acidosasa.txt,
Acorus.txt, ...):

>rps12
ATGCCAACGGTTAAACAACTTATTAGAAACGCAAGACAGCCAATACGAAATGCTAGAAAATCGCCCGCGC
TTAAGGGATGTCCTCAGCGTCGAGGAACATGTGCTAGGGTGTATACTATCAACCCCAAAAAACCCAACTC
>psbA
TTATCCATTAAGAGATGGAACTTCAAGAACAGCTAGGTCTAGAGGGAAGTTGTGAGCATTACGTTCGTGC
ATTACCTCCATACCAAGATTAGCACGGTTGATGATATCAGCCCAAGTATTAATAACGCGACCTTGGCTAT
.....

I would like the output files to be named after the genes: rps12.txt, psbA.txt, ...

Within rps12.txt, the sequences would look like this:

>Acidosasa
ATGCCAACGGTTAAACAACTTATTAGAAACGCAAGACAGCCAATACGAAATGCTAGAAAATCGCCCGCGC
TTAAGGGATGTCCTCAGCGTCGAGGAACATGTGCTAGGGTGTATACTATCAACCCCAAAAAACCCAACTC
>Acorus
ATGCCAACTATTAAACAACTTATTAGAAACACAAGACAGCCAATCCGAAATGTC
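
To make the request more concrete, here is a rough sketch of the regrouping described above. It is written in plain Python rather than BioPerl, and it is only an illustration under some assumptions: the function names (parse_fasta, regroup_by_gene, write_gene_files), the directory arguments, and the 70-character line width are all made up for this example, every input file is assumed to end in .txt, and gene names are assumed to be usable as file names.

```python
from collections import defaultdict
from pathlib import Path


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def regroup_by_gene(species_to_text):
    """Turn {species: FASTA text} into {gene: [(species, sequence), ...]}."""
    by_gene = defaultdict(list)
    # Sort species so the order inside each gene file is reproducible.
    for species, text in sorted(species_to_text.items()):
        for gene, seq in parse_fasta(text):
            by_gene[gene].append((species, seq))
    return dict(by_gene)


def write_gene_files(in_dir, out_dir, width=70):
    """Read <species>.txt files from in_dir, write <gene>.txt files to out_dir.

    Hypothetical helper: width=70 is an assumed line length for the output.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # The species name is taken from the file name without its extension.
    species_to_text = {p.stem: p.read_text() for p in in_dir.glob("*.txt")}
    for gene, records in regroup_by_gene(species_to_text).items():
        with open(out_dir / f"{gene}.txt", "w") as fh:
            for species, seq in records:
                fh.write(f">{species}\n")
                for i in range(0, len(seq), width):
                    fh.write(seq[i:i + width] + "\n")
```

With the species files in one directory, a call such as write_gene_files("species_files", "gene_files") (both directory names are placeholders) would produce rps12.txt, psbA.txt, and so on. Using a separate output directory keeps a second run from reading the gene files back in as if they were species files.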

I hope I have explained this clearly.

Thanks.


