[Bioperl-l] genbank2gff.pl choking on CONTIG sections
Jason Stajich
jason at bioperl.org
Wed Sep 24 23:05:18 UTC 2008
It should already if it is using Bio::DB::GenBank -- do you have an
example of a failure? There seems to be some defaulting to EMBL for the
source in the biofetch code, so it might be worth twiddling.
From the Bio::DB::GenBank documentation:
Note that when querying for GenBank accessions starting with 'NT_' you
will need to call $gb->request_format('fasta') beforehand, because
in GenBank format (the default) the sequence part will be left out
(the reason is that NT contigs are rather annotation with references
to clones).
Some work has been done to automatically detect and retrieve whole NT_
clones when the data is in that format (NCBI RefSeq clones). The former
behavior prior to bioperl 1.6 was to retrieve these from EBI, but now
these are retrieved directly from NCBI. The older behavior can be
regained by setting the 'redirect_refseq' flag to a value evaluating to
TRUE.
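
A minimal sketch of what the documentation above describes, in case it
helps with testing. The helper names needs_fasta and fetch_seq are made
up for illustration, and the accession is taken from the command line
rather than hard-coded; treat this as a starting point under those
assumptions, not as what bp_genbank2gff.pl actually does.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: true for accessions that, per the documentation
# quoted above, need FASTA format to get the sequence part.
sub needs_fasta {
    my ($acc) = @_;
    return $acc =~ /^NT_/ ? 1 : 0;
}

# Hypothetical helper: fetch one record from NCBI.
# Requires BioPerl and network access.
sub fetch_seq {
    my ($acc) = @_;
    require Bio::DB::GenBank;

    # Assumption: the 'redirect_refseq' flag mentioned above would be
    # passed as a constructor option, e.g.
    #   Bio::DB::GenBank->new(-redirect_refseq => 1)
    my $gb = Bio::DB::GenBank->new();

    # In GenBank format (the default) the sequence is left out for NT_
    # contigs, so switch to FASTA before querying.
    $gb->request_format('fasta') if needs_fasta($acc);

    return $gb->get_Seq_by_acc($acc);
}

# Run only when executed as a script with an accession argument.
if (!caller && @ARGV) {
    my $seq = fetch_seq($ARGV[0]);
    printf "%s: %d bp\n", $seq->display_id, $seq->length;
}
```

Run it with the accession as the only argument. Note that switching to
FASTA gets the sequence but drops the feature annotation, so on its own
it does not solve the GFF conversion case.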
On Sep 24, 2008, at 3:00 PM, Scott Cain wrote:
> Hi all,
>
> The BioPerl script bp_genbank2gff.pl, which will either convert a
> Genbank record to GFF or load it directly to a Bio::DB::GFF database,
> is choking on GenBank records with CONTIG sections. Since I don't
> think these would ever be useful for generating GFF or loading into a
> database (ie, the user will want to get all of the features on the
> parts, not know what the parts are), is there a way to force a
> Bio::DB::WebDBSeqI/Bio::DB::BioFetch to get the full record (like
> specifying view=gbwithparts in the url at ncbi)?
>
> Thanks,
> Scott
>
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D. cain.cshl at gmail.com
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Cold Spring Harbor Laboratory
Jason Stajich
jason at bioperl.org