[Bioperl-l] how to rename genbank header in fasta file?
Jason Stajich
jason.stajich at gmail.com
Sat Oct 20 05:43:29 UTC 2012
Are you parsing exactly this file? It is in FASTA format, not GenBank.
You don't need BioPerl for this:
perl -i -p -e 's/>.+\[gene=([^\]]+)\].+/>$1/' file.fa
I'd read up on regular expressions in Perl to learn how string replacement works; that will let you adapt this to other header formats.
On Oct 19, 2012, at 11:23 PM, yang liu <yang.liu0508 at gmail.com> wrote:
> Hello,
>
> I am a new user of BioPerl, can anyone help with this? I have multiple
> sequences in a fasta file like the following,
>
>> lcl|NC_014487.1_cdsid_YP_003875479.1 [gene=cox1] [protein=cytochrome c
> oxidase subunit 1] [protein_id=YP_003875479.1] [location=1..1575]
> ATGACAAATCTGATTCGATGGCTCTTCTCTACTAATCACAAGGATATAGGGACTCTCTATTTCATCTTCG
> GCGCCATTGCTGGAGTGATGGGCACATGCTTTTCAGTACTGATTCGTATGGAATTAGCACGCCCCGGCGA
>> lcl|NC_014487.1_cdsid_YP_003875480.1 [gene=cox3] [protein=cytochrome c
> oxidase subunit 3] [protein_id=YP_003875480.1]
> [location=complement(13218..14015)]
> ATGATTGAATCTCAACGGCATTCTTTTCATTTGGTAGATCCAAGTCCATGGCCTATTTCGGGTTCACTCG
> GAGCTTTGGCAACCACCGTAGGAGGTGTGATGTACATGCACTCATTTCAAGGGGGTGCAACACTTCTCAG
>
>> lcl|NC_014487.1_cdsid_YP_003875481.1 [gene=atp8] [protein=ATPase subunit
> 8] [protein_id=YP_003875481.1] [location=complement(15042..15548)]
> ATGCCTCAACTGGATAAATTTACTTATTTCACACAATTCTTCTGGTCATGCCTTTTTTTCTTTACTTTCT
> ATATTCTAATATGCAATGATAGAGATGGAGTACTTGGGATCAGCAGAATTCTAAAACTACGAAATCAACT
>
> I hope to rename the sequences by gene name, such as:
>
>> cox1
> ATGACAAATCTGATTCGATGGCTCTTCTCTACTAATCACAAGGATATAGGGACTCTCTATTTCATCTTCG
> GCGCCATTGCTGGAGTGATGGGCACATGCTTTTCAGTACTGATTCGTATGGAATTAGCACGCCCCGGCGA
>> cox3
> ATGATTGAATCTCAACGGCATTCTTTTCATTTGGTAGATCCAAGTCCATGGCCTATTTCGGGTTCACTCG
> GAGCTTTGGCAACCACCGTAGGAGGTGTGATGTACATGCACTCATTTCAAGGGGGTGCAACACTTCTCAG
>
> Can anyone help? Thanks.
>
> Yang.
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org