[Biojava-l] read fasta entry by entry

Scooter Willis HWillis at scripps.edu
Wed May 9 11:02:43 UTC 2012


Mic

You can use the following call with lazyloadsequences set to true; the file is then indexed by the accession ID of each entry in the fasta file. When you retrieve a sequence, the underlying storage proxy framework knows where its string sits in the file from the stored offset and loads it on demand. The same concept applies when the sequence and metadata are located at NCBI or UniProt, where we have different storage proxies that know how to get the data when it is needed.

LinkedHashMap<String, DNASequence> dnaSequenceList = FastaReaderHelper.readFastaDNASequence(fastaSequenceFile,lazyloadsequences);
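
To make the offset idea concrete, here is a minimal sketch using only the Java standard library. It is not BioJava code: the class LazyFastaIndex and its methods are hypothetical names made up for illustration, and it assumes the accession is the first whitespace-delimited token of each header line. It records where each record's sequence starts and only reads the residues when asked.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of offset-based lazy loading; not part of BioJava. */
final class LazyFastaIndex {
    private final String path;
    // accession -> byte offset of the first sequence line of that record
    private final Map<String, Long> offsets = new LinkedHashMap<>();

    LazyFastaIndex(String path) throws IOException {
        this.path = path;
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            String line;
            while ((line = raf.readLine()) != null) {
                if (line.startsWith(">")) {
                    // Assumption: the accession is the first token of the header.
                    String accession = line.substring(1).trim().split("\\s+")[0];
                    // The file pointer now sits just after the header line.
                    offsets.put(accession, raf.getFilePointer());
                }
            }
        }
    }

    /** Accessions in file order; no sequence data is held in memory. */
    Iterable<String> accessions() {
        return offsets.keySet();
    }

    /** Seeks to the stored offset and reads residues up to the next header. */
    String load(String accession) throws IOException {
        Long offset = offsets.get(accession);
        if (offset == null) {
            throw new IllegalArgumentException("Unknown accession: " + accession);
        }
        StringBuilder residues = new StringBuilder();
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            raf.seek(offset);
            String line;
            while ((line = raf.readLine()) != null && !line.startsWith(">")) {
                residues.append(line.trim());
            }
        }
        return residues.toString();
    }
}
```

Only the accession-to-offset map stays in memory, so iterating over accessions() and calling load() for one entry at a time keeps the footprint small. RandomAccessFile.readLine is unbuffered, so a real implementation would likely want a faster indexing pass.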

Because the objects returned from the method call are specific to the data type (DNASequence vs. ProteinSequence), you are expected to know what is in the file. DNASequence and ProteinSequence both extend the same parent class, but the typical use case is that you are writing a program specific to one data type. We should probably add a feature where you can ask isProteinSequence or isDNASequence of the file, etc.
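
Until such a feature exists, one rough way to guess the type yourself is to look at the residue alphabet. The sketch below is only a heuristic and not a BioJava API: the class SequenceTypeGuess, the method looksLikeNucleotide, and the 90% threshold are all assumptions chosen for illustration, and short or ambiguous sequences can be misclassified.

```java
/** Hypothetical heuristic for guessing the sequence type; not part of BioJava. */
final class SequenceTypeGuess {
    /**
     * Returns true when at least 90% of the letters are A, C, G, T, U or N.
     * The threshold is an arbitrary assumption, not a standard.
     */
    static boolean looksLikeNucleotide(String residues) {
        int letters = 0;
        int nucleotideLetters = 0;
        for (int i = 0; i < residues.length(); i++) {
            char c = Character.toUpperCase(residues.charAt(i));
            if (!Character.isLetter(c)) {
                continue; // skip gaps, stop symbols, whitespace
            }
            letters++;
            if ("ACGTUN".indexOf(c) >= 0) {
                nucleotideLetters++;
            }
        }
        return letters > 0 && nucleotideLetters >= 0.9 * letters;
    }
}
```

You could run this on the first record of the file and then call the matching DNA or protein reader.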

Thanks

Scooter



On 5/8/12 11:45 PM, "Mic" <mictadlo at gmail.com> wrote:

Hello,
I have found this
http://biojava.org/wiki/BioJava:CookBook:Core:FastaReadWrite example, but
it looks like the whole fasta file is stored in memory.

Is it possible to read any fasta file entry by entry, i.e. without
specifying whether it is DNA or protein?

Thank you in advance.

Mic