[Bioperl-l] generate ptt file from Genbank file
Torsten Seemann
torsten.seemann at infotech.monash.edu.au
Mon Sep 18 00:36:48 UTC 2006
Rafi,
> I am trying to generate a .ptt file like the NCBI ptt file, which basically contains the gene coordinate information, its strand, and its name. I have a GenBank file from which I want to generate this ptt file.
> Is there any BioPerl module which can do this, or any sample script which I could modify and use?
> Thanks in advance for your reply.
I don't think there is any BioPerl script to do it.
And Bio::FeatureIO doesn't support PTT - I will try and add it soon.
Until then, below is a sample script to work with!
Hope it helps,
--Torsten
#!/usr/bin/perl -w
use strict;
use Bio::SeqIO;
# This script takes a GenBank file as input (on STDIN), and produces an
# NCBI PTT file (protein table) as output (on STDOUT). A PTT file is
# a line-based, tab-separated format with fixed column types.
#
# Written by Torsten Seemann
# 18 September 2006
my $gbk = Bio::SeqIO->new(-fh=>\*STDIN, -format=>'genbank');
my $seq = $gbk->next_seq;
my @cds = grep { $_->primary_tag eq 'CDS' } $seq->get_SeqFeatures;
print $seq->description, " - 0..",$seq->length,"\n";
print scalar(@cds)," proteins\n";
print join("\t", qw(Location Strand Length PID Gene Synonym Code COG Product)),"\n";
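for my $f (@cds) {
    my $gi = '-';
    $gi = $1 if tag($f, 'db_xref') =~ m/\bGI:(\d+)\b/;
    my $cog = '-';
    $cog = $1 if tag($f, 'product') =~ m/^(COG\S+)/;
    my @col = (
        $f->start.'..'.$f->end,       # Location
        $f->strand >= 0 ? '+' : '-',  # Strand
        ($f->length/3)-1,             # Length in amino acids, minus the stop codon
        $gi,                          # PID
        tag($f, 'gene'),              # Gene
        tag($f, 'locus_tag'),         # Synonym
        '-',                          # Code (not derived here; keeps the 9 columns aligned with the header)
        $cog,                         # COG
        tag($f, 'product'),           # Product
    );
    print join("\t", @col), "\n";
}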
# Return the value(s) of a feature tag joined by spaces, or '-' if absent.
sub tag {
    my($f, $tag) = @_;
    return '-' unless $f->has_tag($tag);
    return join(' ', $f->get_tag_values($tag));
}
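
For anyone who wants to check the row layout without BioPerl installed, here is a minimal sketch of the same column logic on plain values. The helpers protein_length and ptt_row are hypothetical names made up for this illustration (they are not BioPerl functions), the feature values are example data only, and it assumes a complete, unspliced CDS whose nucleotide length includes the stop codon.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;    # for the // (defined-or) operator

# Hypothetical helper: residues encoded by a complete CDS of $nt
# nucleotides, not counting the stop codon.
sub protein_length {
    my ($nt) = @_;
    return int($nt / 3) - 1;
}

# Hypothetical helper: build one tab-separated PTT row from plain values.
# Any column without a value becomes '-', so a row always has 9 fields.
sub ptt_row {
    my (%f) = @_;
    my $nt = $f{end} - $f{start} + 1;
    my @col = (
        "$f{start}..$f{end}",
        ($f{strand} // 1) >= 0 ? '+' : '-',
        protein_length($nt),
        map { $f{$_} // '-' } qw(pid gene synonym code cog product),
    );
    return join("\t", @col);
}

# Example values for illustration only.
print ptt_row(
    start   => 190,
    end     => 255,
    strand  => 1,
    gene    => 'thrL',
    synonym => 'b0001',
    product => 'thr operon leader peptide',
), "\n";
```

The point of filling every missing column with '-' is that the data rows then line up with the nine names in the header line (Location through Product), which is easy to get wrong when a column such as Code has no obvious GenBank qualifier behind it.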