[GSoC] Questions on next steps for MAF parsing for bio-maf

Wed Jul 11 09:25:17 UTC 2012

Hi Clayton and mentors,

I think it would be extremely useful to bring in someone who uses MAF
in a pipeline. I know Raoul does, but we need more users. Do you know
anyone who uses MAF daily? Otherwise we should post on the Bio* lists.

Same for GFF3 and Marjan. Anyone you know out there?

Pj.

On Tue, Jul 10, 2012 at 07:45:33PM -0400, Clayton Wheeler wrote:
> Hi all,
> 
> In the course of working out my plan for the rest of my bio-maf project, I have come up with a few questions I'm not able to answer:
> 
> https://github.com/csw/bioruby-maf/wiki/Questions
> 
> * Is it useful to build indexes on other sequences besides the reference sequence?
> 
> * Should the score field of an alignment block be zeroed or removed whenever the block is modified?
> 
> * How, precisely, should selection based on features in GTF/GFF3 files work?
> 
> * When converting a MAF Block/Sequence to bio-alignment representation, how should we handle quality metadata (from 'q' lines), which is tied to the actual sequence data and would need to be maintained in parallel if a column were deleted?
> 
> * Is supporting the bx-python index format still desirable? Performance with Kyoto Cabinet indexes seems competitive, and the indexes are neither very large nor very expensive to build.
> 
> * Blankenberg et al. mention this filtering mode: "removing blocks which have aligned species occurring between non-syntenic chromosomes or strands". Unfortunately, this description is a bit cryptic.
> 
> * Are coverage statistics useful or appropriate to provide?
> 
> Any insight that you might be able to offer would be helpful.
> 
> Thanks,
> 
> Clayton Wheeler
> cswh at umich.edu
> 
> _______________________________________________
> GSoC mailing list
> GSoC at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/gsoc