[Biojava-dev] Genbank read/write

Peter Cock p.j.a.cock at googlemail.com
Thu Feb 12 11:46:18 UTC 2015


On Thu, Feb 12, 2015 at 1:58 AM, stefan harjes <stefanharjes at yahoo.de> wrote:
> Hi Paolo, Andreas,
>
> I am sorry if I sounded disrespectful.
>
> I would like to point out that a new user gets confused by what has been
> published about the biojava library. There are several places where you
> strongly indicate that concatenated sequences are the intended design (see
> citations below).

I would be very surprised if BioJava's earlier contributors had
intended to *concatenate* multiple separate sequences in a file
into one long sequence.

Rather, much like the BioPerl and Biopython SeqIO libraries, surely
the design is to allow multiple separate sequence records to be
parsed from most sequence file formats? For example, FASTA, FASTQ,
and GenBank files can all hold zero or more records.

For Biopython at least, we have an iterator approach for all our
SeqIO sequence parsers, with a helper function for when you
expect the file to contain one and only one record (in which case
the for loop style used with an iterator is cumbersome).
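
To illustrate the pattern, here is a rough, self-contained sketch in
plain Python. It is not Biopython's implementation: the names Record,
parse_fasta and read_single are made up for this example, and the
FASTA handling is deliberately minimal. In Biopython itself the
corresponding functions are Bio.SeqIO.parse (the iterator) and
Bio.SeqIO.read (the single-record helper).

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Record(NamedTuple):
    """Hypothetical minimal record: an identifier plus its sequence."""
    id: str
    seq: str


def parse_fasta(handle: TextIO) -> Iterator[Record]:
    """Yield one Record per '>' header line; an empty file yields nothing."""
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                # A new header closes the previous record.
                yield Record(name, "".join(chunks))
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            # Sequence lines belong to the current record only;
            # text before the first header is ignored in this sketch.
            chunks.append(line)
    if name is not None:
        yield Record(name, "".join(chunks))


def read_single(handle: TextIO) -> Record:
    """Return the only record, or raise ValueError if there is not exactly one."""
    records = parse_fasta(handle)
    first = next(records, None)
    if first is None:
        raise ValueError("no records found")
    if next(records, None) is not None:
        raise ValueError("more than one record found")
    return first


# Many records: loop over the iterator, each record stays separate.
for rec in parse_fasta(io.StringIO(">a\nACGT\n>b\nGG\nTT\n")):
    print(rec.id, rec.seq)

# Exactly one record expected: no loop needed.
only = read_single(io.StringIO(">x some description\nACGT\n"))
```

The point is that records are never joined together: the iterator
hands them back one at a time, and the helper simply refuses a file
with zero or several records rather than guessing.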

Perhaps we mean different things by "concatenate"?

Regards,

Peter

