[Bioperl-l] acquiring a local refseq + index
Hilmar Lapp
hlapp at gmx.net
Sun Dec 31 01:48:33 UTC 2006
Can you send examples and the resulting error messages? Also, I'm
assuming you are running the 1.5.2 release of Bioperl; if not, that's
what I would try first.
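
To collect such examples, something along these lines might help. It is only a rough sketch in plain Perl (no Bioperl needed) that prints the LOCUS name plus the raw SOURCE and ORGANISM lines of every record in a GenBank flat file, so the odd ones can be spotted by eye. The name scan_source_lines is made up for illustration, and the script assumes the usual flat file layout (records start with LOCUS and end with "//").

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return one hashref per GenBank record holding the LOCUS name and the
# raw text of the SOURCE and ORGANISM lines (keys are absent if the
# corresponding line was not found in the record).
sub scan_source_lines {
    my ($fh) = @_;
    my (@records, %rec);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^LOCUS\s+(\S+)/) {
            %rec = (locus => $1);
        }
        elsif ($line =~ /^SOURCE\b\s*(.*)/) {
            $rec{source} = $1;
        }
        elsif ($line =~ /^\s+ORGANISM\s+(.*)/) {
            $rec{organism} = $1;
        }
        elsif ($line =~ m{^//}) {
            # End of record: keep a copy and start afresh.
            push @records, {%rec} if %rec;
            %rec = ();
        }
    }
    return @records;
}

# Command-line use: one tab-separated line per record.
if (@ARGV) {
    for my $file (@ARGV) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        for my $r (scan_source_lines($fh)) {
            print join("\t",
                $r->{locus},
                $r->{source}   // '(no SOURCE)',
                $r->{organism} // '(no ORGANISM)'), "\n";
        }
        close $fh;
    }
}
```

Run it with one or more .gbff files as arguments; the output can be grepped or sorted to find records whose SOURCE or ORGANISM text looks wrong, and those records are the ones worth posting together with the error message.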
-hilmar
On Dec 30, 2006, at 7:05 PM, Erik wrote:
> Hi all,
>
> I downloaded the refseq files (.gbff) and want to index the lot with
> Bio::DB::Flat.
>
> It turns out that there are many cases where the SOURCE and ORGANISM
> lines are messed up, sometimes to a degree where the indexing fails
> on a Bio::SeqIO::genbank error.
>
> I'd like to change Bio::SeqIO::genbank to let this parsing go at least
> so far as to make the indexing of the refseq files possible, and
> hopefully improve the taxonomic output ($seq->species->binomial is
> often mutilated at the moment).
>
> Is it still worthwhile to change parsing modules like
> Bio::SeqIO::genbank? Is anyone already working on a rewrite? Because
> if this is the case I may be better off writing my own indexing scheme.
>
> Below is (an outline of) my indexing program, which uses
> Bio::DB::Flat::BDB. If anyone knows of a better way to get a locally
> searchable refseq flat file index, I would be very interested.
>
> Thanks for your help,
>
> Erikjan
>
>
> -------------
> use strict;
> use warnings;
> use Bio::DB::Flat;
>
> my $refseq_dir = '/data/ftp.ncbi.nih.gov/refseq/release/complete';
> my $db = Bio::DB::Flat->new(
>     -directory  => $refseq_dir,
>     -dbname     => 'refseq',
>     -format     => 'genbank',
>     -index      => 'bdb',
>     -write_flag => 1,
> );
> my @files = getfiles($refseq_dir);   # helper not shown in this outline
> for my $f (@files) {
>     $db->build_index($f);
> }
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================