[Bioperl-l] BerkeleyDB
Peter Cock
p.j.a.cock at googlemail.com
Fri Jan 25 16:13:52 UTC 2019
That's a good question - I don't know the BioPerl answer,
but am interested from the Biopython side of things.
When I created Biopython's SeqIO (first included in
Biopython 1.43 from 2007) it was heavily influenced by
BioPerl's SeqIO:
https://bioperl.org/howtos/SeqIO_HOWTO
https://biopython.org/wiki/SeqIO
The older Biopython framework it replaced (using a regular
expression based system called Martel/Mindy) had indexing,
e.g. see the Biopython 1.30 release notes from 2004.
It took a bit longer to add indexing to Biopython's SeqIO.
I added in-memory indexing (using a dict, or a hash in Perl
terminology) in Biopython 1.52 (2009), and then SQLite
support was added in Biopython 1.57 (2011). And yes, a
key point of this was to build an index once, and reuse it.
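To illustrate the in-memory idea, here is a minimal sketch using only the Python standard library. It is not Biopython's actual implementation: the function names build_offset_index and fetch_record are made up for this example, and it assumes a FASTA-like file where each record starts with a ">" line. The dict stores only each record's byte offset, so the sequences themselves stay on disk.

```python
import io


def build_offset_index(handle):
    """Map each record ID in a FASTA-like binary handle to its byte offset."""
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            # The ID is taken to be the first word after the ">" marker.
            parts = line[1:].split(None, 1)
            if parts:
                index[parts[0].decode()] = offset
    return index


def fetch_record(handle, index, record_id):
    """Seek to a record and return its raw text up to the next header."""
    handle.seek(index[record_id])
    lines = [handle.readline()]
    while True:
        line = handle.readline()
        if not line or line.startswith(b">"):
            break
        lines.append(line)
    return b"".join(lines).decode()


# Small demonstration on an in-memory file (hypothetical data).
demo = io.BytesIO(b">a desc\nACGT\nTT\n>b\nGG\n")
demo_index = build_offset_index(demo)
print(fetch_record(demo, demo_index, "b"))
```

An index like this is lost when the script exits, so it has to be rebuilt on every run, which is what motivates storing it on disk instead.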
I did look at BerkeleyDB for this, but concluded that
SQLite was a more portable and practical choice - it
was usually included with a standard Python install.
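As a rough sketch of the "build once, reuse" pattern with SQLite, the following uses Python's standard sqlite3 module on a two-column tab-separated file like the one in your script. The function names open_lookup and get_value, and the table name lookup, are my own choices for illustration. This is not how Biopython's SeqIO stores its index.

```python
import os
import sqlite3


def open_lookup(db_path, tsv_path):
    """Open a key/value SQLite index, building it only if db_path is missing."""
    is_new = not os.path.exists(db_path)
    conn = sqlite3.connect(db_path)
    if is_new:
        conn.execute("CREATE TABLE lookup (id TEXT PRIMARY KEY, val TEXT)")
        with open(tsv_path) as handle:
            # Keep the first two tab-separated fields, trimmed of whitespace.
            rows = (
                tuple(field.strip() for field in line.rstrip("\n").split("\t")[:2])
                for line in handle
                if "\t" in line
            )
            # OR REPLACE mimics hash assignment: a later duplicate ID wins.
            conn.executemany("INSERT OR REPLACE INTO lookup VALUES (?, ?)", rows)
        conn.commit()
    return conn


def get_value(conn, key):
    """Return the stored value for key, or None if the key is absent."""
    row = conn.execute("SELECT val FROM lookup WHERE id = ?", (key,)).fetchone()
    return row[0] if row else None
```

The first call pays the cost of building the file; later calls with the same db_path skip the build and just open it. One caveat with this simple existence check: if the build is interrupted, the partial file should be deleted before retrying.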
Regards,
Peter
On Fri, Jan 25, 2019 at 3:18 PM shalu sharma <sharmashalu.bio at gmail.com> wrote:
>
> Hey everyone,
> So I am using BerkeleyDB to make a huge database (tree method).
> I use it to pull out matching IDs (it's working fine) from multiple datasets.
> Here are a few lines of the code:
>
> use strict ;
> use BerkeleyDB ;
> use Bio::SeqIO;
>
> my $filename = "tree" ;
> unlink $filename ;
>
> my %h ;
> tie %h, 'BerkeleyDB::Btree',
>     -Filename => $filename,
>     -Flags => DB_CREATE,
>     or die "Cannot open $filename: $!\n" ;
>
> # Add a key/value pair to the file
> open(IN,"$ARGV[0]"); # adding values
> while(<IN>){
>     my $line = $_;
>     chomp($line);
>     my @f = split('\t',$line);
>     my $id = $f[0];
>     my $val = $f[1];
>     $id =~ s/^\s+//;
>     $id =~ s/\s+$//;
>     $val =~ s/^\s+//;
>     $val =~ s/\s+$//;
>     $h{$id} = $val;
>
> ----
> ----
> My question is: it makes a huge tree file. Is it possible to re-use that tree file instead of rebuilding it again and again? My query datasets change, but the database does not.
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/bioperl-l