[Biopython-dev] [Bug 2643] Proposal: fastPhaseOutputIO for SeqIO

bugzilla-daemon at portal.open-bio.org
Thu Nov 6 15:11:38 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2643





------- Comment #7 from biopython-bugzilla at maubp.freeserve.co.uk  2008-11-06 10:11 EST -------
I've now had a quick look at the fastPHASE documentation, and I have the
impression that the sequences should always come in pairs:

"Output files for inferred haplotypes or imputed genotypes contain two lines 
per given diploid individual, with the order of individuals corresponding to 
that supplied in the input file."

Assuming the paired sequences are always the same length, this does suggest the
format should be integrated into Bio.AlignIO (giving pairwise alignments)
rather than Bio.SeqIO.

Have you tried not estimating the haplotypes (by supplying a negative integer
following -H), and does this alter the sequence output?

Finally, could you try the -Z command line argument for the simplified output
format (described as two lines per individual, without “id” lines,
subpopulation labels or summary information from the run)?  Does this have the
sequences?  If so, this may be a more parser-friendly format to support in
Bio.SeqIO and/or Bio.AlignIO.
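
To make the pairing idea concrete, here is a rough sketch of how "two lines per
individual" output could be grouped into pairs and checked for equal length
before handing them to an alignment object.  It is plain Python, not Biopython
code, and it is not based on a real fastPHASE file: the function name
read_haplotype_pairs, the example data, and the assumption that every
non-blank line is exactly one haplotype are all hypothetical.

```python
from io import StringIO


def read_haplotype_pairs(handle):
    """Yield (hap1, hap2) tuples, one per diploid individual.

    Assumption (not checked against real fastPHASE output): every
    non-blank line holds one haplotype, and consecutive lines belong
    to the same individual.
    """
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in handle if line.strip()]
    if len(lines) % 2:
        raise ValueError("expected an even number of haplotype lines")
    # Walk the lines two at a time: (0, 1), (2, 3), ...
    for first, second in zip(lines[0::2], lines[1::2]):
        # A pairwise alignment needs both rows to be the same length.
        if len(first) != len(second):
            raise ValueError("paired haplotypes differ in length")
        yield first, second


# Made-up data for two individuals, purely for illustration.
example = StringIO("ACGT\nACGA\nTTGT\nTTGT\n")
pairs = list(read_haplotype_pairs(example))
```

If the length check holds for real output, each pair could become one pairwise
alignment in Bio.AlignIO; if it can fail, yielding individual records via
Bio.SeqIO would be the safer fit.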




