[Bioperl-l] Re: help extracting CDS

Wed, 18 Dec 2002 09:55:04 -0600

Hi, the ways of programming with bioperl remain mysterious to me.
Despite the always excellent help of Jason Stajich, I could not find
a simple way to get the CDS from a GenBank record using bioperl, at
least for now.
Anyway, in case someone is interested in my original post, this code in
straight Perl does just that: it reads a GenBank record and prints the
translation of each CDS in FASTA format.
Regards.
#!/usr/bin/perl -w
use strict;
$/ = "\n     CDS";
<>;	# skip the header (everything before the first CDS feature)
while ( <> ) {
    # sometimes /product= is replaced by /name=
    my ($gname) = /product="([^"]+)"/;
    $gname      =~ s/\s+//g;
    my ($ref)   = /protein_id="([\w.]+)"/;
    my ($gid)   = /db_xref="(GI:\w+)"/;
    my ($seq)   = /translation="([A-Z\s]+)"/;
    $seq        =~ s/\s+//g;

    print ">$gid|$gname|$ref\n$seq\n";
}
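
A slightly more defensive variant, as a sketch only: the script above
assumes every CDS block carries /product, /protein_id, /db_xref and
/translation, and warns on undefined values when one is missing. The
version below falls back to /name= when /product= is absent and skips
blocks without a translation. The sub name cds_to_fasta, the "unknown"
and "NA" placeholders, and the toy record are assumptions made for
illustration; they are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn the text of one CDS feature block into a FASTA entry.
# Returns undef (in scalar context) when the block has no /translation.
sub cds_to_fasta {
    my ($block) = @_;

    my ($seq) = $block =~ m{/translation="([A-Z\s]+)"};
    return unless defined $seq;
    $seq =~ s/\s+//g;

    # Prefer /product=, fall back to /name=, then to a placeholder.
    my ($gname) = $block =~ m{/product="([^"]+)"};
    ($gname) = $block =~ m{/name="([^"]+)"} unless defined $gname;
    $gname = 'unknown' unless defined $gname;
    $gname =~ s/\s+//g;

    # Hypothetical placeholder "NA" for missing identifiers.
    my ($ref) = $block =~ m{/protein_id="([\w.]+)"};
    my ($gid) = $block =~ m{/db_xref="(GI:\w+)"};
    $ref = 'NA' unless defined $ref;
    $gid = 'NA' unless defined $gid;

    return ">$gid|$gname|$ref\n$seq\n";
}

# Toy record (made up) read through an in-memory filehandle.
my $sample = join "\n",
    '     gene            1..9',
    '     CDS             1..9',
    '                     /product="toy protein"',
    '                     /protein_id="AAA00001.1"',
    '                     /db_xref="GI:12345"',
    '                     /translation="MKV',
    '                     LAA"',
    '';

{
    local $/ = "\n     CDS";    # each read ends at the next CDS feature
    open my $fh, '<', \$sample or die "cannot open sample: $!";
    <$fh>;                      # discard everything before the first CDS
    while ( my $block = <$fh> ) {
        my $fasta = cds_to_fasta($block);
        print $fasta if defined $fasta;
    }
    close $fh;
}
```

To run it on a real file, the in-memory handle can be swapped for the
<> operator used in the script above.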
> 
> Hi, I need to extract the CDS from a GenBank genome record, saving them
> into a file in FASTA format, and I wonder if someone can let me know how
> to do this using bioperl.
> Thanks in advance for any positive consideration.
> 
> pedro
> 
> *******************************************************************
> PEDRO A. RECHE , pHD            TL: 617 632 3824
> Dana-Farber Cancer Institute,   FX: 617 632 4569
> Harvard Medical School,         EM: reche@research.dfci.harvard.edu
> 44 Binney Street, D1510A,       EM: reche@mifoundation.org
> Boston, MA 02115                URL: http://www.reche.org
> *******************************************************************
