[Bioperl-l] Help with Bio::SeqIO
download on demand
downloadondemand at gmail.com
Sun Nov 4 18:39:42 UTC 2007
Hi to all.
I have a problem with a simplest script:
use Bio::SeqIO;
# get command-line arguments, or die with a usage statement
my $usage = "x2y.pl infile infileformat outfile outfileformat\n";
my $infile = shift or die $usage;
my $infileformat = shift or die $usage;
# my $outfile = shift or die $usage;
my $outfileformat = shift or die $usage;
# create one SeqIO object to read in,and another to write out
my $seq_in = Bio::SeqIO->new('-file' => "<$infile",
'-format' => $infileformat);
my $seq_out = Bio::SeqIO->new('-fh' => \*STDOUT,
'-format' => $outfileformat);
# write each entry in the input file to the output file
while (my $inseq = $seq_in->next_seq) {
# $seq_out->write_seq($inseq); # Whole sequence not needed
for my $feat_object ($inseq->get_SeqFeatures)
{
if ($feat_object->primary_tag eq "CDS")
{
print $feat_object->get_tag_values('product'),"\n";
print
$feat_object->location->start,"..",$feat_object->location->end,"\n";
print $feat_object->spliced_seq->seq,"\n\n";
}
}
The result seems OK to me, but in case of first CDS of NC_005213.gbk from
here <ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Nanoarchaeum_equitans/> the
output is wrong:
It is:
hypothetical protein
1..490885
TAAATGCGATTGCTATTAGAA..................................Truncated
sequence...................................
Should be:
hypothetical protein
879..490883
ATGCGATTGCTATTAGAA...................................Truncated
sequence....................................TAA
This CDS have an unnatural location string:
CDS complement(join(490883..490885,1..879)), but spliced_seq
should handle these things?
Please help me!
Best regards, N.
More information about the Bioperl-l
mailing list