[Bioperl-l] how to rename genbank header in fasta file?

yang liu yang.liu0508 at gmail.com
Sat Oct 20 05:23:15 UTC 2012


Hello,

I am new to BioPerl and would appreciate some help. I have multiple
sequences in a FASTA file, like the following:

>lcl|NC_014487.1_cdsid_YP_003875479.1 [gene=cox1] [protein=cytochrome c oxidase subunit 1] [protein_id=YP_003875479.1] [location=1..1575]
ATGACAAATCTGATTCGATGGCTCTTCTCTACTAATCACAAGGATATAGGGACTCTCTATTTCATCTTCG
GCGCCATTGCTGGAGTGATGGGCACATGCTTTTCAGTACTGATTCGTATGGAATTAGCACGCCCCGGCGA
>lcl|NC_014487.1_cdsid_YP_003875480.1 [gene=cox3] [protein=cytochrome c oxidase subunit 3] [protein_id=YP_003875480.1] [location=complement(13218..14015)]
ATGATTGAATCTCAACGGCATTCTTTTCATTTGGTAGATCCAAGTCCATGGCCTATTTCGGGTTCACTCG
GAGCTTTGGCAACCACCGTAGGAGGTGTGATGTACATGCACTCATTTCAAGGGGGTGCAACACTTCTCAG

>lcl|NC_014487.1_cdsid_YP_003875481.1 [gene=atp8] [protein=ATPase subunit 8] [protein_id=YP_003875481.1] [location=complement(15042..15548)]
ATGCCTCAACTGGATAAATTTACTTATTTCACACAATTCTTCTGGTCATGCCTTTTTTTCTTTACTTTCT
ATATTCTAATATGCAATGATAGAGATGGAGTACTTGGGATCAGCAGAATTCTAAAACTACGAAATCAACT

I would like to rename the sequences by gene name, such as:

>cox1
ATGACAAATCTGATTCGATGGCTCTTCTCTACTAATCACAAGGATATAGGGACTCTCTATTTCATCTTCG
GCGCCATTGCTGGAGTGATGGGCACATGCTTTTCAGTACTGATTCGTATGGAATTAGCACGCCCCGGCGA
>cox3
ATGATTGAATCTCAACGGCATTCTTTTCATTTGGTAGATCCAAGTCCATGGCCTATTTCGGGTTCACTCG
GAGCTTTGGCAACCACCGTAGGAGGTGTGATGTACATGCACTCATTTCAAGGGGGTGCAACACTTCTCAG

Can anyone help? Thanks.

Yang.
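
For illustration, the renaming described above could be sketched roughly as
follows. This is a minimal sketch in plain Python rather than BioPerl, and it
rests on a few assumptions: each header sits on a single line in the actual
file, the gene name always appears as a [gene=...] tag, and headers without
that tag should be left unchanged. The names rename_headers and GENE_TAG are
made up for this example.

```python
import re
import sys

# Matches the [gene=...] qualifier inside a FASTA header line.
GENE_TAG = re.compile(r"\[gene=([^\]]+)\]")


def rename_headers(lines):
    """Yield FASTA lines with each header replaced by '>' plus its gene name."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            match = GENE_TAG.search(line)
            # Assumption: a header without a [gene=...] tag is kept as it is.
            yield (">" + match.group(1)) if match else line
        else:
            # Sequence lines (and blank lines) pass through untouched.
            yield line


if __name__ == "__main__" and len(sys.argv) > 1:
    # The input file name is taken from the command line.
    with open(sys.argv[1]) as handle:
        for out_line in rename_headers(handle):
            print(out_line)
```

Saved under a hypothetical name such as rename_fasta.py, it would be run as
"python rename_fasta.py input.fasta > renamed.fasta". Note that two records
sharing the same gene name would end up with identical headers.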


