[Bioperl-l] StandAloneBlast->blastall array of Bio::Seq objects
Andrew Stewart
stewarta at nmrc.navy.mil
Thu Dec 14 21:23:07 UTC 2006
> It was a shot in the dark, really. The fact that the return status
> was bad could be due to a number of problems (permissions issues,
> bad data, etc). The fact that a single sequence worked indicated
> that permissions and output format likely weren't to blame. The
> only other thing left was a problem with blastall itself.
>
> BTW, the blast docs do not indicate whether there is a maximum
> number of sequences. There may be a point where available memory
> becomes the limiting issue.
>
> chris
Interesting. I ran the 738-sequence dataset through blastall
manually and the report returned only 198 of the 738 expected
results. Not only that, it cut off right in the middle of the 198th
result and a segmentation fault was reported. I removed the 198th
sequence, wondering if it might be some issue with the input, and
the segmentation fault occurred again, with the results ending on
the 210th result. I stuck the 198th sequence back in, but at the
start of the file, and sure enough the segmentation fault occurred
earlier. I think we can rule out the size of the input or the number
of sequences as the source of the error here. I'm more inclined to
think it has something to do with the blast databases being queried
against.
I found an old discussion on a problem that sounds fairly similar to
this one, for anyone interested.
http://bioinformatics.org/pipermail/bioclusters/2004-June/001742.html
I think I'll try to work around the problem for now.
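A minimal sketch of one possible workaround, assuming the queries
are split into small batches and each batch is run on its own so
that a failure points at a handful of sequences. batch_list() and
the batch size of 50 are made up for illustration and are not part
of BioPerl; plain strings stand in for the Bio::Seq objects so the
snippet runs without a BLAST install.

```perl
use strict;
use warnings;

# Split a list into consecutive batches of at most $size items.
# batch_list() is a hypothetical helper, not a BioPerl function.
sub batch_list {
    my ($size, @items) = @_;
    die "batch size must be positive\n" unless $size > 0;
    my @batches;
    while (@items) {
        # splice removes up to $size items from the front of @items
        push @batches, [ splice(@items, 0, $size) ];
    }
    return @batches;
}

# Stand-in for the real array of Bio::Seq objects.
my @query = map { "seq$_" } 1 .. 738;

my @batches = batch_list(50, @query);
printf "%d batches, last one holds %d sequences\n",
    scalar(@batches), scalar(@{ $batches[-1] });

# With a real StandAloneBlast factory, each batch would be passed as
# an array reference, with eval catching the exception thrown when
# blastall returns a bad status:
#
# for my $i (0 .. $#batches) {
#     my $report = eval { $factory->blastall($batches[$i]) };
#     warn "batch $i failed: $@" unless defined $report;
# }
```

This does not fix the segmentation fault itself; it only limits how
much output is lost when one batch fails.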
andrew
On Dec 14, 2006, at 1:36 PM, Chris Fields wrote:
>
> On Dec 14, 2006, at 11:49 AM, Andrew Stewart wrote:
>
>>> So can you look at the tempfile that is created and see if it is
>>> sane?
>>>
>>> Set -save_tempfiles => 1 when you initialize the factory object
>>> or do
>>> $factory->save_tempfiles(1)
>>> before calling the blastall.
>>>
>>> -jason
>>>
>>
>> Jason,
>> I was actually wondering how to do that. Thanks. Odd though, it
>> still doesn't seem to be saving the tempfiles. Might not matter
>
> That needs to be checked out. Can anyone verify that?
>
>>> The error pops up when the executable returns a bad status, so
>>> maybe it's choking on too many input sequences (i.e. Bioperl is
>>> doing everything correctly, but you are attempting to BLAST too
>>> many sequences in one go). How many sequences are you attempting
>>> to use as input? What happens when you use fewer input sequences?
>>>
>>> chris
>>>
>>
>> I was processing 738 sequences for input. I cut that down to 20
>> sequences and I'm getting some other exception thrown further
>> downstream, so it appears you may be correct. You wouldn't happen
>> to know the max number of sequences that blastall allows for
>> input, would ya? ;) I suppose I'll have to break @query down into
>> smaller doses or something.
>>
>> Thanks,
>> Andrew
--
Andrew Stewart
Research Assistant, Genomics Team
Navy Medical Research Center (NMRC)
Biological Defense Research Directorate (BDRD)
BDRD Annex
12300 Washington Avenue, 2nd Floor
Rockville, MD 20852
email: stewarta at nmrc.navy.mil
phone: 301-231-6700 Ext 270