[Bioperl-l] Retrieving from an indexed fasta file into a LargeSeq object
Christopher Porter
cporter at ohri.ca
Mon Sep 27 15:07:31 EDT 2004
I don't think I followed how the call to set the factory should be
used. After creating the Bio::Index::Fasta object, I called the
_get_SeqIO_object method as described, without supplying a file index
to the call. My script then exited with the exception below.
------------- EXCEPTION -------------
MSG: Can't get filename for index :
STACK Bio::Index::Abstract::_file_handle
/Library/Perl/5.8.1/Bio/Index/Abstract.pm:662
STACK Bio::Index::AbstractSeq::_get_SeqIO_object
/Library/Perl/5.8.1/Bio/Index/AbstractSeq.pm:171
STACK toplevel ./parseBLAST.pl:172
--------------------------------------
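
For anyone hitting the same exception, here is a minimal, untested sketch of the factory swap with a file index supplied. It assumes the index covers a single fasta file (so the file number is 0), that the factory class is Bio::Seq::SeqFactory, and that fetch() reuses the SeqIO object cached by _get_SeqIO_object. The command-line arguments are placeholders. Since _get_SeqIO_object is a private method, its behaviour may differ between BioPerl releases.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Index::Fasta;
use Bio::Seq::SeqFactory;

# Placeholder arguments: path to an existing index and one accession in it.
my ($index_file, $acc) = @ARGV;
die "usage: $0 <index file> <accession>\n" unless defined $acc;

my $idx = Bio::Index::Fasta->new(-filename => $index_file);

# Assumption: one indexed fasta file, so its file number is 0.
# Calling _get_SeqIO_object with no argument is what raised
# "Can't get filename for index" above.
my $seqio = $idx->_get_SeqIO_object(0);

# Ask the SeqIO stream to build LargeSeq objects instead of Bio::Seq.
$seqio->sequence_factory(
    Bio::Seq::SeqFactory->new(-type => 'Bio::Seq::LargeSeq'));

# Assumption: fetch() goes through the same cached SeqIO object.
my $seqobj = $idx->fetch($acc);
die "No entry for $acc\n" unless defined $seqobj;
print ref($seqobj), "\n";
```

If the last line prints something other than Bio::Seq::LargeSeq, one of the assumptions above does not hold for the installed version.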
This is now mostly of academic interest; I'm going to try rewriting
the script with the Bio::DB::Fasta module instead, since getting a
slice of a large sequence is exactly what I need to do.
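
A rough sketch of the slice retrieval with Bio::DB::Fasta, for reference. The four command-line arguments are placeholders, and coordinates are taken to be 1-based and inclusive.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Fasta;

# Placeholder arguments: fasta file, sequence id, slice start and stop.
my ($fasta_file, $id, $start, $stop) = @ARGV;
die "usage: $0 <fasta file> <id> <start> <stop>\n" unless defined $stop;

# Builds an index next to the fasta file on first use, then reuses it.
my $db = Bio::DB::Fasta->new($fasta_file);

# Fetch only the requested slice as a plain string.
my $slice = $db->seq($id, $start, $stop);
die "No sequence found for $id\n" unless defined $slice;
print ">$id:$start-$stop\n$slice\n";

# The object interface is also available when a sequence object is needed.
my $seqobj = $db->get_Seq_by_id($id);
print STDERR "full length of $id: ", $seqobj->length, "\n";
```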
Chris
On 27-Sep-04, at 9:35 AM, Jason Stajich wrote:
> It depends on how many games you want to play here and why you really
> want a LargeSeq, i.e. are you still going to call 'seq' on the large
> seq object to get the sequence as a string? Yes, it can be done by
> changing the factory that SeqIO uses to create the sequence; have a
> look at the Bio::Index::AbstractSeq object.
>
> you'd want to call:
> (not pretty I know)
>
> $idx->_get_SeqIO_object->sequence_factory(
>     Bio::Seq::SeqFactory->new(-type => 'Bio::Seq::LargeSeq'));
>
>
> However, I think Lincoln's Bio::DB::Fasta module is better for
> handling this sort of thing, as you can request virtual slices of the
> sequence data. I bet it will be much faster than the LargeSeq
> implementation, although the two use the same idea of using the
> filesystem instead of memory for the seq storage. Just make sure your
> Fasta file is consistently formatted (all sequence lines are the same
> length); a quick
> 'sreformat fasta fafile > newfafile; mv newfafile fafile;' can take
> care of that.
>
> -jason
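
If sreformat is not to hand, the consistent line length mentioned above can be approximated in plain Perl. This is only a sketch: rewrap_fasta is a hypothetical helper name, not part of BioPerl, and it is not claimed to reproduce sreformat's output exactly.

```perl
use strict;
use warnings;

# Rewrap every record of a fasta stream so that all sequence lines
# (except possibly the last line of each record) have the same width.
sub rewrap_fasta {
    my ($in, $out, $width) = @_;
    my $buffer = '';
    while (my $line = <$in>) {
        if ($line =~ /^>/) {
            # New record: write out what is left of the previous one.
            print {$out} "$buffer\n" if length $buffer;
            $buffer = '';
            $line =~ s/\s+\z//;          # trim trailing whitespace/newline
            print {$out} "$line\n";
            next;
        }
        $line =~ s/\s+//g;               # drop newlines and stray spaces
        $buffer .= $line;
        # Emit full-width lines as soon as they are available, so a large
        # contig is never held in memory as a whole.
        while (length($buffer) >= $width) {
            print {$out} substr($buffer, 0, $width, ''), "\n";
        }
    }
    print {$out} "$buffer\n" if length $buffer;
    return;
}
```

Called as rewrap_fasta(\*STDIN, \*STDOUT, 60), it works as a filter in the same way as the sreformat command line above.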
> On Sep 24, 2004, at 4:39 PM, Christopher Porter wrote:
>
>>
>> I have a fasta file containing large contig sequences, which I have
>> indexed using Bio::Index::Fasta. Is there a way to use the index to
>> retrieve sequences into a Bio::Seq::LargeSeq object rather than
>> Bio::Seq?
>>
>> What I'm currently doing is essentially:
>>
>> #!/usr/bin/perl
>>
>> use strict;
>> use Bio::SeqIO;
>> use Bio::Index::Fasta;
>>
>> my $idx = Bio::Index::Fasta->new('-filename'=>$hcindex);
>>
>> foreach my $acc (keys %$foo) {
>>     my $seqobj = $idx->fetch($acc);
>>     ...
>> }
>>
>> How can I force $seqobj to be a LargeSeq?
>>
>> (At another point in the script I'm using SeqIO to read short
>> sequences from a non-indexed fasta file - I don't really want to use
>> LargeSeq for that part.)
>>
>>
>> Thanks,
>>
>> Chris
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
> --
> Jason Stajich
> jason.stajich at duke.edu
> http://www.duke.edu/~jes12/
>