[Bioperl-l] Problem in parsing GenBank flatfile

Yoshida Yuichi yuichiyoshi at gmail.com
Thu Jun 9 05:39:08 EDT 2005


Dear all,

I am trying to parse a GenBank flatfile (accession number NT_015926)
with the Bio::SeqIO module, but parsing fails with the error shown below.

- - - - - - - - - - - - - Perl program code - - - - - - - - - - - - -
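#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

# The GenBank flatfile path is taken from the command line.
my $gbk_filename = shift @ARGV or die "Usage: $0 <genbank_file>\n";
my $seqin = Bio::SeqIO->new(-file => $gbk_filename, -format => 'Genbank');

while (my $seqobj = $seqin->next_seq) {
    my $accession = $seqobj->accession_number;
    foreach my $feat ($seqobj->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'mRNA';
        # get_tag_values throws if the tag is absent, so check has_tag first.
        my $db_gene_name = $feat->has_tag('gene')
            ? join(' ', $feat->get_tag_values('gene')) : '';
        my $db_transcript_id = $feat->has_tag('transcript_id')
            ? join(' ', $feat->get_tag_values('transcript_id')) : '';
        # One tab-separated row: transcript ID, gene, accession, start, end.
        print join("\t", $db_transcript_id, $db_gene_name, $accession,
                   $feat->start, $feat->end), "\n";
    }
}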
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The following error message is shown.

- - - - - - - - - - - - - error message - - - - - - - - - - - - -
-------------------- WARNING ---------------------
MSG: cannot see new qualifier in feature CDS: aa:OTHER)
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: cannot see new qualifier in feature CDS: aa:OTHER)
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: cannot see new qualifier in feature CDS: aa:OTHER)
---------------------------------------------------
out of memory
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The part of the flatfile that I suspect causes the error is shown below.

- - - - - GenBank partial flatfile (NT_015926) - - - - -
     CDS             complement(join(4528741..4528932,4543408..4543490,
                     4581809..4582043,4616648..4616817,4632093..4632236,
                     4643148..4643301))
                     /gene="FLJ21820"
                     /note="go_function: catalytic activity [goid 0003824]
                     [evidence IEA];
                     go_process: lipid metabolism [goid 0006629] [evidence
                     IEA]"
                     /codon_start=1
                     /product="hypothetical protein FLJ21820"
                     /protein_id="NP_068744.1"
                     /db_xref="GI:11345458"
                     /db_xref="GeneID:60526"
                     /db_xref="LocusID:60526"
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
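
The excerpt above does not itself contain the string "aa:OTHER)" that
appears in the warning, so the offending entry may be elsewhere in the file.
As a rough sketch for narrowing it down, the script below prints every line
of the flatfile that contains that string, together with a few preceding
lines of context. It uses plain Perl only; find_with_context is a
hypothetical helper written for this illustration, not part of BioPerl, and
the context size of 3 is an arbitrary choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return one array ref per line
# containing $needle. Each array ref holds up to $context preceding lines
# followed by the matching line, all prefixed with their line numbers.
sub find_with_context {
    my ($fh, $needle, $context) = @_;
    my (@window, @hits);
    while (my $line = <$fh>) {
        chomp $line;
        push @window, "$.: $line";
        # Keep only the current line plus $context lines before it.
        shift @window while @window > $context + 1;
        push @hits, [@window] if index($line, $needle) >= 0;
    }
    return @hits;
}

# Run as a script only when a filename is given on the command line.
if (@ARGV) {
    my $gbk_filename = shift @ARGV;
    open my $fh, '<', $gbk_filename
        or die "Cannot open $gbk_filename: $!";
    for my $hit (find_with_context($fh, 'aa:OTHER)', 3)) {
        print "$_\n" for @$hit;
        print "--\n";
    }
    close $fh;
}
```

Running it with the flatfile name as the only argument lists each matching
line with its line number, which should show which qualifier the fragment
belongs to.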

Could you please tell me how to solve this problem?

--
Yuichi Yoshida <yuichiyoshi at gmail.com>


