[Biopython-dev] Arlequin sequence files in Bio.Popgen
winda002 at student.otago.ac.nz
Sat Jul 18 08:17:02 UTC 2009
Hi again Tiago,
Sorry about falling off the grid before I could get back to you about this.
Tiago Antão wrote:
>> I've uploaded my Arlequin classes and
>> functions to a branch on github so you can see them (/Bio/PopGen/Arlequin/
>> on http://github.com/dwinter/biopython/tree/arleq-branch)
> This is great, I took your code and created a new version (nothing
> more than also an initial sketch - Feel free to disagree/propose
> changes), you can find it here:
Yeah, all the changes you talk about seem sensible to me.
> OK, somebody has to do a parser to actually read the files in ;) .
> Which is the biggest piece of work to be done. I don't mind doing it
> (like in the next month or so - I have some free time now), but you
> can do it if you want. In case you decide to do it, I have just one
> major point to note: making a parser that is able to read big files
> (i.e., some files cannot be parsed into memory in one go). I made this
> mistake with the genepop parser and some people do complain about it.
> Some things cannot be read into memory as lists but have to be read as
> iterators (issue 3 above).
> I think a parser that is able to handle lots of files is also good to
> help in building a sound model to represent an arlequin record.
> As usual we will need test code and documentation for all this ;)
This is where I have to admit to not having the time or the skills to
do this justice. I'm happy to provide what help I can (especially with
the docs and tests, which are probably closer to my skill-set), but I
just couldn't promise to do the bulk of the work.
There might also be another option, a bit of searching in github found this:
Open (MIT license) code for dealing with Arlequin in Python. I'll
contact the author and ask if he is interested in contributing (it
can't hurt to ask, right?).
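
On the point about big files needing iterators rather than lists, here
is a rough sketch of what I understand the idea to be. It is only an
illustration, not code from either branch: iter_samples is a made-up
name, and it assumes a much simplified Arlequin layout (a
SampleName="..." line followed by a SampleData= { ... } block with one
whitespace-separated record per line). A real parser would also have to
deal with [Profile], structure sections and multi-line records.

```python
import io
import re


def iter_samples(handle):
    """Yield (name, rows) for each sample block, one at a time.

    Hypothetical helper, not part of Bio.PopGen. Assumes a simplified
    layout: a SampleName="..." line, then a SampleData= { ... } block
    with one whitespace-separated record per line.
    """
    name = None
    rows = None  # None means we are outside a SampleData block
    for line in handle:  # the handle is read lazily, line by line
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = re.match(r'SampleName\s*=\s*"([^"]*)"', line)
        if match:
            name = match.group(1)
        elif line.startswith("SampleData"):
            rows = []  # start collecting records for this sample
        elif line.startswith("}") and rows is not None:
            yield name, rows  # hand back one finished sample
            name, rows = None, None
        elif rows is not None:
            rows.append(line.split())


# Made-up example data, only to show the calling pattern.
example = io.StringIO('''[Data]
[[Samples]]
SampleName="pop1"
SampleSize=2
SampleData= {
h1 1 ACGT
h2 1 ACGA
}
SampleName="pop2"
SampleSize=1
SampleData= {
h3 1 TCGT
}
''')

for sample_name, sample_rows in iter_samples(example):
    print(sample_name, len(sample_rows))
```

Because this is a generator, only one sample is held in memory at a
time. A single very large sample would still be loaded whole, so a real
parser might need to yield individual records as well.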