[Bioperl-l] modify sequence names
Smithies, Russell
Russell.Smithies at agresearch.co.nz
Sun May 20 20:57:24 UTC 2012
Or a Perl in-place replace, which saves on temp files (put the name of the file to edit at the end of the command):
perl -pi -e 's/^>.*\[gene=([^]]+).*$/>$1/'
--Russell
-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Adam Sjøgren
Sent: Sunday, 20 May 2012 3:13 a.m.
To: bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] modify sequence names
On Sat, 19 May 2012 10:34:04 -0400, yang wrote:
> Would anyone please help me to modify sequence names with bioperl? I
> am editing them manually now, is there an easier way?
You don't need BioPerl specifically to do simple text manipulation.
>> lcl|NC_017840.1_cdsid_YP_006280919.1 [gene=cox1] [protein=cytochrome
>> coxidase subunit 1] [protein_id=YP_006280919.1] [location=1..1584]
[... to ...]
>> cox1
Maybe you can use something like:
$ sed 's/^>.*\[gene=\([^]]*\)\].*$/>\1/'
>lcl|NC_017840.1_cdsid_YP_006280919.1 [gene=cox1] [protein=cytochrome coxidase subunit 1] [protein_id=YP_006280919.1] [location=1..1584]
ATGACAAATCCGGTCCGATGGCTGTTCTCCACTAACCACAAGGATATAGGTACTCTATATTTCATCTTCG
GTGCCATTGCTGGAGTGATGGGCACATGCTTCTCAGTACTGATTCGTATGGAATTAGCACGACCCGGCGA
TCAAATTCTTGGTGGGAATCATCAACTTTATAATGTTTTAATAACGGCTCACGCTTTTTTAATGATCTTT
>cox1
ATGACAAATCCGGTCCGATGGCTGTTCTCCACTAACCACAAGGATATAGGTACTCTATATTTCATCTTCG
GTGCCATTGCTGGAGTGATGGGCACATGCTTCTCAGTACTGATTCGTATGGAATTAGCACGACCCGGCGA
TCAAATTCTTGGTGGGAATCATCAACTTTATAATGTTTTAATAACGGCTCACGCTTTTTTAATGATCTTT
$
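A sed substitution of this kind only touches lines it matches, so sequence lines and headers without a [gene=...] tag pass through unchanged. A small sketch of that behaviour, keeping the leading ">" so the result is still valid FASTA; `gene_headers` is a made-up helper name and the second record is invented for the demo:

```shell
# gene_headers (hypothetical helper): reads FASTA on stdin and rewrites each
# header that carries a [gene=...] tag to ">" followed by the gene name.
gene_headers() {
  sed 's/^>.*\[gene=\([^]]*\)\].*$/>\1/'
}

# Demo input: one tagged record and one invented record without a gene tag.
printf '%s\n' \
  '>lcl|NC_017840.1_cdsid_YP_006280919.1 [gene=cox1] [protein=cytochrome coxidase subunit 1]' \
  'ATGACAAATCCG' \
  '>seq_without_gene_tag' \
  'GTGCCATTGCTG' | gene_headers
```

The first header comes out as ">cox1"; the other three lines are printed as they went in. Untagged headers are therefore easy to spot afterwards and fix by hand.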
If you need to use Perl rather than sed, you can use:
$ perl -pe 's/^>.*\[gene=([^]]+).*$/>$1/'
instead.
The easiest way is probably to learn a little programming and/or regular expressions.
Learning Perl by Randal L. Schwartz, brian d foy, and Tom Phoenix could be a starting point, as could many online tutorials.
Best regards,
Adam
--
"Hur långt man än har kommit Adam Sjøgren
är det alltid längre kvar" asjo at koldfront.dk