[Bioperl-l] Re: help extracting CDS
Pedro Antonio Reche
reche@research.dfci.harvard.edu
Wed, 18 Dec 2002 09:55:04 -0600
Hi, the ways of programming with bioperl remain mysterious to me.
Despite the always excellent help of Jason Stajich, I could not find a
simple way to get the CDS from a genbank record using bioperl, at least
for now.
Anyway, in case someone is interested in my original post, this code in
straight perl does just that: it reads a genbank file and prints the
/translation of every CDS feature in fasta format.
Regards.
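#!/usr/bin/perl -w
use strict;

# Read the input one CDS feature at a time. Feature keys in a genbank
# flat file are indented by five spaces.
$/ = "\n     CDS";
<>;    # skip the header (everything before the first CDS)
while ( <> ) {
    # sometimes /product= is replaced by /name=
    my ($gname) = /product="([^"]+)"/;
    ($gname) = /name="([^"]+)"/ unless defined $gname;
    my ($ref) = /protein_id="([\w.]+)"/;
    my ($gid) = /db_xref="(GI:\w+)"/;
    my ($seq) = /translation="([A-Z\s]+)"/;
    next unless defined $seq;    # CDS without a /translation qualifier
    for ( $gname, $ref, $gid ) { $_ = '' unless defined $_ }
    $gname =~ s/\s+//g;
    $seq =~ s/\s+//g;
    print ">$gid|$gname|$ref\n$seq\n";
}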
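
For anyone who wants to try the idea without a full genome file, here is a minimal sketch of the same record-separator approach wrapped in a sub and run on a tiny inline fragment. The sub name cds_to_fasta and the two-gene sample record are made up for illustration, and the sketch assumes the standard genbank layout in which feature keys such as CDS are indented by five spaces. It also needs perl 5.8 or later for the in-memory filehandle.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of bioperl): returns one fasta record per
# CDS feature that carries a /translation qualifier.
sub cds_to_fasta {
    my ($genbank) = @_;
    my @fasta;

    # Treat "newline + five spaces + CDS" as the record separator, so each
    # read returns the text of one CDS feature.
    local $/ = "\n     CDS";
    open my $fh, '<', \$genbank or die "cannot open in-memory file: $!";
    <$fh>;    # discard everything before the first CDS feature
    while ( my $chunk = <$fh> ) {
        my ($seq) = $chunk =~ /translation="([A-Z\s]+)"/;
        next unless defined $seq;
        my ($gname) = $chunk =~ /product="([^"]+)"/;
        my ($ref)   = $chunk =~ /protein_id="([\w.]+)"/;
        my ($gid)   = $chunk =~ /db_xref="(GI:\w+)"/;
        for ( $gname, $ref, $gid, $seq ) {
            $_ = '' unless defined $_;    # missing qualifier becomes empty
            s/\s+//g;                     # drop line breaks and spaces
        }
        push @fasta, ">$gid|$gname|$ref\n$seq\n";
    }
    close $fh;
    return @fasta;
}

# Made-up fragment, for illustration only (not a real genbank entry).
my $sample = <<'END';
LOCUS       TEST0001
FEATURES             Location/Qualifiers
     source          1..300
     CDS             10..60
                     /product="hypothetical protein A"
                     /protein_id="AAA00001.1"
                     /db_xref="GI:1000001"
                     /translation="MKTAYIAKQR
                     QISFVKSHFS"
     CDS             100..150
                     /product="hypothetical protein B"
                     /protein_id="AAA00002.1"
                     /db_xref="GI:1000002"
                     /translation="MSDNELLK"
ORIGIN
//
END

print cds_to_fasta($sample);
```

Note that the regular expressions only see the text of one feature at a time, so a qualifier missing from one CDS cannot be picked up from the next one by mistake.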
>
> Hi, I need to extract the CDS from a genbank genome record, saving them
> into a file in fasta format, and I wonder if someone can let me know how
> to do this using bioperl.
> Thanks in advance for any positive consideration.
>
> pedro
>
> *******************************************************************
> PEDRO A. RECHE , pHD TL: 617 632 3824
> Dana-Farber Cancer Institute, FX: 617 632 4569
> Harvard Medical School, EM: reche@research.dfci.harvard.edu
> 44 Binney Street, D1510A, EM: reche@mifoundation.org
> Boston, MA 02115 URL: http://www.reche.org
> *******************************************************************