[Bioperl-l] extract sequences and save into files by genes
Cook, Malcolm
MEC at stowers.org
Tue Feb 28 16:55:13 UTC 2012
Yang,
I'm replying back on-list.
You wrote in your other email that my one-liner worked once you learned to run perl from the command line under cygwin.
Great. Glad to help. Good luck. Welcome to the fray!
~Malcolm
From: yang liu [mailto:yang.liu0508 at gmail.com]
Sent: Monday, February 27, 2012 10:04 PM
To: Cook, Malcolm
Subject: Re: [Bioperl-l] extract sequences and save into files by genes
Hello Malcolm,
Thanks for your help. But when I ran it, it returned the following line.
'\.txt' is not recognized as an internal or external command, operable program or batch file.
I am using Windows 7; is that the problem? I have perl installed.
In the Windows command prompt, I first changed to the folder containing the target files, and then pasted your script line.
I am a beginner with perl.
Thanks again for your help.
Yang.
On Mon, Feb 27, 2012 at 10:47 AM, Cook, Malcolm <MEC at stowers.org> wrote:
You don't need bioperl for this one.....
The following perl one liner will do it for you.
perl -p -e 'if (1==$.) {($species = $ARGV) =~ s|\.txt||}; if (s/^>(.*)/">${species}"/e) {$gene=$1; open($O{$gene},qq{>> ${gene}.txt}); select($O{$gene})} ; close ARGV if eof' *.txt
~Malcolm
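
For anyone who would rather keep this in a script file, here is a minimal sketch of the same idea. It is not the one-liner above, just one possible rewrite of it. Running from a file also avoids shell quoting differences: cmd.exe does not treat single quotes as quoting characters, which is most likely why the `|` in `s|\.txt||` was read as a pipe and '\.txt' as a command. The script name split_by_gene.pl and the sub name split_by_gene are made up for illustration, and the header match assumes the gene name is the first whitespace-free token after ">".

```perl
#!/usr/bin/perl
# split_by_gene.pl (hypothetical name): regroup per-species FASTA files by gene.
use strict;
use warnings;

# Read each species file and append its records to <gene>.txt,
# replacing each ">gene" header with ">species".
sub split_by_gene {
    my @files = @_;
    for my $file (@files) {
        (my $species = $file) =~ s/\.txt$//;    # Acidosasa.txt -> Acidosasa
        open(my $in, '<', $file) or die "Cannot read $file: $!";
        my $out;                                 # handle for the current gene
        while (my $line = <$in>) {
            if ($line =~ /^>(\S+)/) {            # header line names the gene
                close($out) if $out;
                open($out, '>>', "$1.txt")
                    or die "Cannot append to $1.txt: $!";
                print {$out} ">$species\n";
            }
            elsif ($out) {
                print {$out} $line;              # sequence line, copied as-is
            }
        }
        close($out) if $out;
        close($in);
    }
}

# cmd.exe passes *.txt through unexpanded, so expand wildcards here.
split_by_gene(map { glob($_) } @ARGV) unless caller;
```

It would be run as `perl split_by_gene.pl *.txt` from the folder holding the species files. Like the one-liner, it appends to the gene files and writes them next to the inputs with the same .txt extension, so it is safest to start in a folder that contains only the species files and to remove old output before running it again.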
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of yang liu
> Sent: Saturday, February 25, 2012 12:52 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] extract sequences and save into files by genes
>
> Dear colleagues,
>
> I have multiple files, each named after a species. Each file has ca. 100
> different genes. I want to extract the sequences and save them by gene.
> In each output file, the gene name in the header would be replaced by the
> species name. How should I do this?
>
> The input file would be like this (with the file name, Acidosasa.txt,
> Acorus.txt....)
>
> >rps12
> ATGCCAACGGTTAAACAACTTATTAGAAACGCAAGACAGCCAATACGAAATGCT
> AGAAAATCGCCCGCGC
> TTAAGGGATGTCCTCAGCGTCGAGGAACATGTGCTAGGGTGTATACTATCAACCC
> CAAAAAACCCAACTC
> >psbA
> TTATCCATTAAGAGATGGAACTTCAAGAACAGCTAGGTCTAGAGGGAAGTTGTG
> AGCATTACGTTCGTGC
> ATTACCTCCATACCAAGATTAGCACGGTTGATGATATCAGCCCAAGTATTAATAAC
> GCGACCTTGGCTAT
> .....
>
> I would like the output files to look like this, with file names rps12.txt, psbA.txt....
>
> within rps12.txt, the sequence is like,
>
> >Acidosasa
>
> ATGCCAACGGTTAAACAACTTATTAGAAACGCAAGACAGCCAATACGAAATGCT
> AGAAAATCGCCCGCGC
> TTAAGGGATGTCCTCAGCGTCGAGGAACATGTGCTAGGGTGTATACTATCAACCC
> CAAAAAACCCAACTC
>
> >Acorus
> ATGCCAACTATTAAACAACTTATTAGAAACACAAGACAGCCAATCCGAAATGTC
>
> I am not sure whether I have explained this clearly.
>
> Thanks.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l