[Bioperl-l] BLASTing BioSeq objects

Jason Stajich jason@cgt.mc.duke.edu
Tue, 10 Dec 2002 16:47:08 -0500 (EST)


This works for me:

#!/usr/bin/perl -w

use strict;
use Bio::Tools::Run::StandAloneBlast;
use IO::String;
use Bio::SeqIO;

my $strdata = <<EOF;
>TEST
MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQH
YEWRGNRWHLHGPPPPPRHHKKAPHDHHGGHGPGKHHR
EOF
my $iostring = IO::String->new($strdata);
my $seqin    = Bio::SeqIO->new(-fh => $iostring, -format => 'fasta');

my $seq = $seqin->next_seq;

my $factory = Bio::Tools::Run::StandAloneBlast->new('database' => 'swissprot',
                                                    'program'  => 'blastp');
my $report = $factory->blastall($seq);

# Walk each result (one per query) and print every hit with its significance.
while( my $r = $report->next_result ) {
    print $r->query_name(), "\n";
    while( my $h = $r->next_hit ) {
        print $h->name(), " ", $h->significance(), "\n";
    }
}

[jason@sonogno test]$ perl standaloneblast.pl
TEST
gi|6686173|sp|P75616|YAAX_ECOLI 9e-17
gi|6686246|sp|P76527|YPEC_ECOLI 2e-07


-jason

On Tue, 10 Dec 2002, James Wasmuth wrote:

> Sorry to fill every1's intrays, but this has bugged me for a while and
> has caused me to miss my bus  :o(
>
> Righto, I managed to get this far thru a few other methods, but Shawn's
> was the most concise.
>
> So my very rough code looks like...
>
> sub CREATEBIOSEQ    {
>     my $query_str = $_[0];          #query sequence in fasta
>     my $stringfh = new IO::String($query_str);
>     my $sio = new Bio::SeqIO(-fh => $stringfh, -format => 'fasta');
>     my $bioSeq_obj = $sio->next_seq;  # create Bio::Seq object
>     return $bioSeq_obj;
> }
>
> $bioSeq_obj has size (checked it using length).  However when I place it
> into Blast later I get...
>
> MSG:  Bio::Seq=HASH(0x8620500) (0) not Bio::Seq object or array of
> Bio::Seq objects or file name!
>
> Have tried all the checks I can and the object is the same one all the
> way thru the process, but something isn't recognising it!
> If it's nothing obvious then I'll have a hunt round the rest of my code
> and see if it comes from there...
>
> Thanx
> J
>
>
>
> Shawn wrote:
>
> >actually, $sio->next_seq is a method call that returns a Bio::Seq object,
> >so you can pass what it returns directly to blast.
> >
> >my $fact = Bio::Tools::Run::StandAloneBlast->new('program'  => 'blastn',
> >                                                 'database' => "mydb",
> >                                                 'outfile'  => 'tmp/xxx');
> >
> >
> >my $stringfh = new IO::String($query_str);
> >my $sio = new Bio::SeqIO(-fh => $stringfh, -format => 'fasta');
> >$fact->blastall($sio->next_seq);
> >
> >
> >
> >On Wed, 2002-12-11 at 01:08, James Wasmuth wrote:
> >
> >
> >>That would give me a SeqIO object, Blast requires a Bio::Seq object.  So
> >>I would then need to convert $seq to a Bio::Seq object?  If so, this is
> >>fine, and thanks.
> >>James
> >>
> >>Shawn wrote:
> >>
> >>
> >>
> >>>oh ok, misread you. Your original solution was almost there.
> >>>this works for me:
> >>>
> >>>use Bio::SeqIO;
> >>>use IO::String;
> >>>my $query_str = ">A\nACCCCCCC";
> >>>
> >>>my $stringfh = new IO::String($query_str);
> >>>my $sio = new Bio::SeqIO(-fh => $stringfh, -format => 'fasta');
> >>>my $seq = $sio->next_seq;
> >>>
> >>>print $seq->seq;
> >>>
> >>>
> >>>shawn
> >>>
> >>>On Wed, 2002-12-11 at 00:38, James Wasmuth wrote:
> >>>
> >>>>>my $seq = new Bio::Seq(-seq=>$query_str,-id=>"blast_seq");
> >>>>>my $fact = Bio::Tools::Run::StandAloneBlast->new('program'  => 'blastn',
> >>>>>                                                 'database' => "mydb",
> >>>>>                                                 'outfile'  => 'tmp/xxx');
> >>>>>my $blast_report = $fact->blastall($seq);
> >>>>>
> >>>>Unfortunately I've tried something along these lines.  The problem seems
> >>>>to be caused by the query sequence being in fasta format.  In the error log
> >>>>I get a message...
> >>>>
> >>>>"MSG: Attempting to set the sequence to [>GLBH_CAEEL
> >>>><--some aa-->] which does not look healthy"
> >>>>
> >>>>I know that this is because a sequence entered using '-seq' has to be
> >>>>raw, with no fasta header or "\n" characters, so I am trying to get
> >>>>round this.  I've done it by storing the query sequence as a file
> >>>>and using this to blast with but had been hoping to avoid this...
> >>>>
> >>>>James
> >>>>
> >>>>
> >>>>
> >>>>Shawn wrote:
> >>>>
> >>>>>You probably want to do:
> >>>>>
> >>>>>
> >>>>>cheers,
> >>>>>
> >>>>>shawn
> >>>>>
> >>>>>On Tue, 2002-12-10 at 23:40, James Wasmuth wrote:
> >>>>>
> >>>>>>Hi all,
> >>>>>>
> >>>>>>I'm trying to blast a sequence which is passed to the program as a
> >>>>>>string, all implemented as a CGI user interface.  The sequence is in fasta
> >>>>>>format, though I hope to extend this to any format.
> >>>>>>
> >>>>>>I thought I had been able to create a Bio::Seq object from the string,
> >>>>>>which StandAloneBlast requires,  by doing...
> >>>>>>
> >>>>>>my $stringfh = new IO::String($query_str);
> >>>>>>my $bioSeq_obj = new Bio::Seq(-fh => $stringfh, -format => 'fasta');
> >>>>>>
> >>>>>>however a check on the length of the sequence in the object reveals it
> >>>>>>to be zero in length.
> >>>>>>
> >>>>>>Anyone any ideas?  Should I first create a SeqIO object, convert this
> >>>>>>to a Bio::Seq object and then BLAST?  If so, how is this conversion done?
> >>>>>>
> >>>>>>I'm sure the answers are in the archive but have been unable to locate
> >>>>>>them...
> >>>>>>
> >>>>>>Many Thanks
> >>>>>>James
> >>>>>>
> >>>>>>_______________________________________________
> >>>>>>Bioperl-l mailing list
> >>>>>>Bioperl-l@bioperl.org
> >>>>>>http://bioperl.org/mailman/listinfo/bioperl-l
> >>>>>>
> >>>>--
> >>>>
> >>>>Nematode Bioinformatics
> >>>>Blaxter Nematode Genomics Group
> >>>>Institute of Cell, Animal and Population Biology
> >>>>Ashworth Labs
> >>>>University of Edinburgh
> >>>>King's Buildings
> >>>>Edinburgh
> >>>>EH9 3JT
> >>>>
> >>>>0131 650 7403
> >>>
> >>>
> >>>
> >>>
> >
> >
> >
> >
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu