[Bioperl-l] Retrieving from an indexed fasta file into a LargeSeq object
Jason Stajich
jason.stajich at duke.edu
Mon Sep 27 09:35:22 EDT 2004
It depends on how many games you want to play here and why you really want
a LargeSeq, i.e. are you still going to call 'seq' on the LargeSeq
object to get the sequence as a string? If so, the whole sequence ends up
in memory anyway. Yes, it can be done by changing the factory which SeqIO
uses to create the sequence; have a look at the Bio::Index::AbstractSeq
module to see how the index holds its SeqIO object.
You'd want to call (not pretty, I know, since it goes through a private
method):
$idx->_get_SeqIO_object->sequence_factory(
    Bio::Seq::SeqFactory->new(-type => 'Bio::Seq::LargeSeq'));
However, I think Lincoln's Bio::DB::Fasta module is better suited to this
sort of thing, as you can request virtual slices of the sequence
data. I bet it will be much faster than the LargeSeq
implementation, although the two use the same idea of keeping the
sequence on the filesystem instead of in memory. Just make sure your
Fasta file is consistently formatted (within each record, all sequence
lines except the last are the same length). A quick
'sreformat fasta fafile > newfafile; mv newfafile fafile;' can take
care of that.
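
A rough sketch of the Bio::DB::Fasta route, in case it helps. It assumes BioPerl is installed; the file name contigs.fa, the ID contig1, the demo sequence and the helper uniform_line_lengths are all made up for illustration and are not part of any module. The helper is only a simplified check of the line-length requirement mentioned above, not a replacement for sreformat.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Bio::DB::Fasta;

# Hypothetical helper: return true if, within every record, all sequence
# lines except the last have the same length. Blank lines are ignored.
sub uniform_line_lengths {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    my ($width, $short_seen);
    while (my $line = <$in>) {
        $line =~ s/\r?\n\z//;
        if ($line =~ /^>/) {               # new record: reset state
            ($width, $short_seen) = (undef, 0);
            next;
        }
        next if $line eq '';
        return 0 if $short_seen;           # a short line was not the last one
        my $len = length $line;
        if    (!defined $width) { $width = $len }
        elsif ($len > $width)   { return 0 }
        elsif ($len < $width)   { $short_seen = 1 }
    }
    close $in;
    return 1;
}

# Made-up demo data: one contig, 10 bases per line, short final line.
my $dir   = tempdir(CLEANUP => 1);
my $fasta = "$dir/contigs.fa";
open my $out, '>', $fasta or die "Cannot write $fasta: $!";
print {$out} ">contig1\nACGTACGTAC\nGTACGTACGT\nACGT\n";
close $out or die "Cannot close $fasta: $!";

die "Reformat $fasta first\n" unless uniform_line_lengths($fasta);

# Builds an index next to the FASTA file on first use, reuses it afterwards.
my $db = Bio::DB::Fasta->new($fasta);

# Ask for the length and a slice without pulling the whole contig
# into memory. Coordinates are 1-based and inclusive.
my $length = $db->length('contig1');
my $slice  = $db->seq('contig1', 9, 12);
print "contig1: $length bp, bases 9-12 = $slice\n";
```

For a real file you would drop the demo-data part and point Bio::DB::Fasta->new at your own Fasta file; the seq call then replaces fetching a whole LargeSeq object whenever only a region is needed.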
-jason
On Sep 24, 2004, at 4:39 PM, Christopher Porter wrote:
>
> I have a fasta file containing large contig sequences, which I have
> indexed using Bio::Index::Fasta. Is there a way to use the index to
> retrieve sequences into a Bio::Seq::LargeSeq object rather than
> Bio::Seq?
>
> What I'm currently doing is essentially:
>
> #!/usr/bin/perl
>
> use strict;
> use Bio::SeqIO;
> use Bio::Index::Fasta;
>
> my $idx = Bio::Index::Fasta->new('-filename'=>$hcindex);
>
> foreach my $acc(keys %$foo){
> my $seqobj = $idx->fetch($acc);
> ...
> }
>
> How can I force $seqobj to be a LargeSeq?
>
> (At another point in the script I'm using SeqIO to read short
> sequences from a non-indexed fasta file - I don't really want to use
> LargeSeq for that part.)
>
>
> Thanks,
>
> Chris
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/