[Bioperl-l] BLASTing BioSeq objects
Jason Stajich
jason@cgt.mc.duke.edu
Tue, 10 Dec 2002 16:47:08 -0500 (EST)
This works for me:
#!/usr/bin/perl -w
use Bio::Tools::Run::StandAloneBlast;
use IO::String;
use Bio::SeqIO;

# FASTA-formatted query held in a string
my $strdata = <<EOF;
>TEST
MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQH
YEWRGNRWHLHGPPPPPRHHKKAPHDHHGGHGPGKHHR
EOF

# Wrap the string in a filehandle so SeqIO can parse it
my $iostring = IO::String->new($strdata);
my $seqin = Bio::SeqIO->new(-fh => $iostring, -format => 'fasta');
my $seq = $seqin->next_seq;    # a Bio::Seq object

my $factory = Bio::Tools::Run::StandAloneBlast->new('database' => 'swissprot',
                                                    'program'  => 'blastp');
my $report = $factory->blastall($seq);

# Print the query name, then each hit with its significance
while( my $r = $report->next_result ) {
    print $r->query_name(), "\n";
    while( my $h = $r->next_hit ) {
        print $h->name(), " ", $h->significance(), "\n";
    }
}
[jason@sonogno test]$ perl standaloneblast.pl
TEST
gi|6686173|sp|P75616|YAAX_ECOLI 9e-17
gi|6686246|sp|P76527|YPEC_ECOLI 2e-07
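
For the case raised further down the thread, where '-seq' rejects a string
that still carries its fasta header ("does not look healthy"), here is a
minimal sketch of splitting such a string by hand using core Perl only.
parse_fasta_string is a hypothetical helper, not part of BioPerl, and it
assumes a single record whose first line is the '>' header.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): split a single-record
# FASTA string into its id, description and raw residues.
sub parse_fasta_string {
    my ($fasta) = @_;
    my ($header, @lines) = split /\r?\n/, $fasta;
    die "not FASTA: missing '>' header\n"
        unless defined $header && $header =~ /^>(\S*)\s*(.*)$/;
    my ($id, $desc) = ($1, $2);
    my $residues = join '', @lines;   # rejoin the wrapped sequence lines
    $residues =~ s/\s+//g;            # drop any stray whitespace
    return ($id, $desc, $residues);
}

my ($id, $desc, $residues) = parse_fasta_string(">A test sequence\nACCC\nCCCC\n");
print "$id\t", length($residues), "\n";
```

The id and raw residues could then be handed to Bio::Seq->new(-id => $id,
-seq => $residues) as in the constructor call quoted below, although letting
Bio::SeqIO do the parsing, as above, is less work and handles multiple records.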
-jason
On Tue, 10 Dec 2002, James Wasmuth wrote:
> Sorry to fill everyone's in-trays, but this has bugged me for a while and
> has caused me to miss my bus :o(
>
> Righto, I managed to get this far through a few other methods, but Shawn's
> was the most concise.
>
> So my very rough code looks like...
>
> sub CREATEBIOSEQ {
>     my $query_str = $_[0]; #query sequence in fasta
>     my $stringfh = new IO::String($query_str);
>     my $sio = new Bio::SeqIO(-fh => $stringfh, -format => 'fasta');
>     my $bioSeq_obj = $sio->next_seq; # create Bio::Seq object
>     return $bioSeq_obj;
> }
>
> $bioSeq_obj has size (checked it using length). However when I place it
> into Blast later I get...
>
> MSG: Bio::Seq=HASH(0x8620500) (0) not Bio::Seq object or array of
> Bio::Seq objects or file name!
>
> Have tried all the checks I can and the object is the same one all the
> way through the process, but something isn't recognising it!
> If it's nothing obvious then I'll have a hunt round the rest of my code
> and see if it comes from there...
>
> Thanx
> J
>
> Shawn wrote:
>
> >actually, $sio->next_seq is a method call that returns a Bio::Seq object,
> >so you can pass what it returns directly to blast.
> >
> >my $fact = Bio::Tools::Run::StandAloneBlast->new('program'  => 'blastn',
> >                                                 'database' => "mydb",
> >                                                 'outfile'  => 'tmp/xxx');
> >
> >my $stringfh = new IO::String($query_str);
> >my $sio = new Bio::SeqIO(-fh => $stringfh, -format => 'fasta');
> >$fact->blastall($sio->next_seq);
> >
> >On Wed, 2002-12-11 at 01:08, James Wasmuth wrote:
> >
> >>That would give me a SeqIO object, Blast requires a Bio::Seq object. So
> >>I would then need to convert $seq to a Bio::Seq object? If so, this is
> >>fine, and thanks.
> >>James
> >>
> >>Shawn wrote:
> >>
> >>
> >>
> >>>oh ok, misread you. Your original solution was almost there.
> >>>this works for me:
> >>>
> >>>use Bio::SeqIO;
> >>>use IO::String;
> >>>my $query_str = ">A\nACCCCCCC";
> >>>my $stringfh = new IO::String($query_str);
> >>>my $sio = new Bio::SeqIO(-fh => $stringfh, -format => 'fasta');
> >>>my $seq = $sio->next_seq;
> >>>
> >>>print $seq->seq;
> >>>
> >>>
> >>>shawn
> >>>
> >>>On Wed, 2002-12-11 at 00:38, James Wasmuth wrote:
> >>>
> >>>>Unfortunately I've tried something along these lines. The problem seems
> >>>>to be caused by the query sequence being in fasta format. In the error log
> >>>>I get a message...
> >>>>
> >>>>"MSG: Attempting to set the sequence to [>GLBH_CAEEL
> >>>><--some aa-->] which does not look healthy"
> >>>>
> >>>>I know that this is because a sequence entered using '-seq' has to
> >>>>be raw, with no fasta header or "\n" characters, so I am trying to
> >>>>get round this. I've done it by storing the query sequence as a file
> >>>>and using this to blast with but had been hoping to avoid this...
> >>>>
> >>>>James
> >>>>
> >>>>Shawn wrote:
> >>>>
> >>>>>You probably want to do:
> >>>>>
> >>>>>my $seq = new Bio::Seq(-seq=>$query_str,-id=>"blast_seq");
> >>>>>my $fact = Bio::Tools::Run::StandAloneBlast->new('program'=>'blastn',
> >>>>>                                                 'database'=>"mydb",
> >>>>>                                                 'outfile'=>'tmp/xxx');
> >>>>>my $blast_report = $fact->blastall($seq);
> >>>>>
> >>>>>cheers,
> >>>>>
> >>>>>shawn
> >>>>>
> >>>>>On Tue, 2002-12-10 at 23:40, James Wasmuth wrote:
> >>>>>
> >>>>>>Hi all,
> >>>>>>
> >>>>>>I'm trying to blast a sequence which is passed to the program as a
> >>>>>>string, all implemented as a CGI user interface. The sequence is in fasta
> >>>>>>format, though I hope to extend this to any format.
> >>>>>>
> >>>>>>I thought I had been able to create a Bio::Seq object from the string,
> >>>>>>which StandAloneBlast requires, by doing...
> >>>>>>
> >>>>>>my $stringfh = new IO::String($query_str);
> >>>>>>my $bioSeq_obj = new Bio::Seq(-fh => $stringfh, -format => 'fasta');
> >>>>>>
> >>>>>>however a check on the length of the sequence in the object reveals it
> >>>>>>to be zero in length.
> >>>>>>
> >>>>>>Anyone any ideas? Should I first create a SeqIO object, convert this
> >>>>>>to a Bio::Seq object and then BLAST? If so, how is this conversion done?
> >>>>>>
> >>>>>>I'm sure the answers are in the archive but have been unable to locate
> >>>>>>them...
> >>>>>>
> >>>>>>Many Thanks
> >>>>>>James
> >>>>>>
> >>>>>>_______________________________________________
> >>>>>>Bioperl-l mailing list
> >>>>>>Bioperl-l@bioperl.org
> >>>>>>http://bioperl.org/mailman/listinfo/bioperl-l
> >>>>>>
> >>>>--
> >>>>
> >>>>Nematode Bioinformatics
> >>>>Blaxter Nematode Genomics Group
> >>>>Institute of Cell, Animal and Population Biology
> >>>>Ashworth Labs
> >>>>University of Edinburgh
> >>>>King's Buildings
> >>>>Edinburgh
> >>>>EH9 3JT
> >>>>
> >>>>0131 650 7403
>
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu