[Bioperl-l] Sequence retrieval from BLAST indexes

Brian Osborne b_i_osborne@hotmail.com
Wed, 20 Mar 2002 10:21:55 -0500


Martin,

> Does Bio::Index::Fasta create the same indexes as the BLAST utility
> "formatdb"?

No, it creates its own index, separate from the *.phr, *.pin, etc. files.
But, if I understand what you want to do, the result is similar, except that
you'll have an additional index in the directory, which is no big deal. The
critical difference is not that fastacmd uses the existing Blast indices, but
that fastacmd returns entries as strings, whereas the Bioperl modules return
Seq objects or arrays of Seq objects. This might be a good thing, but it
depends on what you want to do with the sequences that you retrieve.
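
To make the string-versus-object point concrete, here is a minimal sketch in
core Perl of the parsing you would end up doing yourself on fastacmd's
output. The sub name parse_fasta_record and the sample record are made up
for illustration, and real fastacmd headers usually carry longer,
pipe-delimited ids.

```perl
use strict;
use warnings;

# Split one FASTA record (the kind of text fastacmd prints to STDOUT)
# into its id, description and sequence. Hypothetical helper, not part
# of Bioperl.
sub parse_fasta_record {
    my ($text) = @_;
    my ($header, @lines) = split /\r?\n/, $text;

    # The first line must be a FASTA header: ">id optional description"
    return unless defined $header && $header =~ /^>(\S+)\s*(.*)$/;
    my ($id, $desc) = ($1, $2);

    # Join the wrapped sequence lines and drop any stray whitespace
    my $seq = join '', @lines;
    $seq =~ s/\s+//g;

    return { id => $id, desc => $desc, seq => $seq };
}

# Made-up record standing in for real fastacmd output
my $fasta = ">MUSIGHBA1 hypothetical example entry\nACGTACGT\nTTGA\n";
my $rec   = parse_fasta_record($fasta);
print "$rec->{id}: ", length($rec->{seq}), " residues\n";
```

With Bio::Index::Fasta you'd skip this step: fetch() hands back a Bio::Seq
object, and you call display_id(), desc() and seq() on it instead.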

May I ask: why is it important that you use the Blast indices? Perhaps I'm
missing something...

Brian O.

----- Original Message -----
From: "Martin Schenker" <Martin.Schenker@ogs.co.uk>
To: "bioperl-l" <bioperl-l@bioperl.org>
Sent: Wednesday, March 20, 2002 9:39 AM
Subject: RE: [Bioperl-l] Sequence retrieval from BLAST indexes


> Hi Brian & Ewan,
>
> now I'm a bit confused.
>
> Does Bio::Index::Fasta create the same indexes as the BLAST utility
> "formatdb"?
> From the module description it doesn't look that way...
>
> I really want to use the BLAST db indexes (like *.phr, *.pin, *.psq etc)
> to get some seqs back.
> On the command line, the utility "fastacmd" does that and displays the
> result to STDOUT.
> The only reference to "fastacmd" and BioPerl was in the module from
> Bradford Powell (1999)
>
> (snip)
> =head1 NAME
>
> Bio::DB::BlastDB - Database object interface to local blast
> databases via fastacmd
>
> =head1 SYNOPSIS
>
>     $db = new Bio::DB::BlastDB;
>
>     $seq = $db->get_Seq_by_id('MUSIGHBA1'); # Unique ID
>
>     # or ...
>
>     $seq = $db->get_Seq_by_acc('J00522'); # Accession Number
>
> =head1 DESCRIPTION
>
> Permits retrieval of sequence data from a file which has been prepared for
> blast processing by the ncbi toolkit program 'formatdb'. The databases must
> be created with the option '-o T' (see the ncbi toolkit docs). This module
> requires the presence of another ncbi program, 'fastacmd', which performs
> the actual sequence access.
>
> =head1 FEEDBACK
> ...
>
> (\snip)
>
> So my initial question was whether this module was further developed in
> BioPerl 1.0.
>
> Any ideas?
>
> Best, Martin
>
>
> > On Tue, 19 Mar 2002, Brian Osborne wrote:
> >
> > > Martin,
> > >
> > > There are a couple of ways to do this in Bioperl v. 1.0. The simplest
> > > way is to use Bio::Index::Fasta, but one of the problems with this
> > > approach is that you might not be able to use the id that's most easily
> > > available to you. An alternative, Bio::DB::Fasta, has more features and
> > > gets around this problem. Take a look at section III.1.2 of
> > > bptutorial.pl as a starting point.
> >