[Open-bio-l] Best practices for quality trimming?

Peter biopython at maubp.freeserve.co.uk
Thu Dec 3 14:29:29 UTC 2009


On Thu, Dec 3, 2009 at 1:12 PM, Dan Bolser <dan.bolser at gmail.com> wrote:
> Is there a Standard Operating Procedure (SOP) for quality
> trimming reads? i.e. which tool, what settings and for what purpose?
>
> It seems that, when using a window, the median quality of the window
> should be used as the threshold for deciding where to 'end clip'
> sequences.
>
> Is there a database of the assemblers, for example, that do or don't
> take quality information into account when assembling?

Hi Dan,

It was nice to say hello again in Edinburgh this week:
http://www.sbforum.org/earchive.php?e_id=79

As the group discussed, this is tricky, especially as it will depend
greatly on what you are going to do with the reads next (e.g.
assembly or mapping onto a reference) and which tools you use.
For Velvet, trimming seems to help (especially in terms of
reducing the memory demands).

If we can settle on a reasonable set of procedures, it would
be great to have implementations in EMBOSS (i.e. this could
be the "quaffle" tool Peter Rice has suggested) plus BioPerl,
Biopython etc. The latter would be especially useful as starting
points for people to modify the algorithm to try new ideas.
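
To make the windowed idea from Dan's email concrete, here is a minimal
sketch in plain Python. This is only one possible reading of "use the
median quality of the window to decide where to end clip", not an
agreed procedure: the function name window_trim_end, the window size
of 5 and the threshold of 20 are all assumptions picked for
illustration, and the qualities are assumed to be already decoded
PHRED scores.

```python
from statistics import median


def window_trim_end(quals, window=5, threshold=20):
    """Return how many bases to keep after 3' end clipping.

    Hypothetical helper: slides a fixed-size window from the 5' end
    and clips at the start of the first window whose median quality
    falls below the threshold.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if not quals:
        return 0
    # Reads shorter than the window are judged as a single window.
    last_start = max(len(quals) - window, 0)
    for start in range(last_start + 1):
        if median(quals[start:start + window]) < threshold:
            return start
    return len(quals)


# Example read with made-up quality scores.
seq = "ACGTACGTACGT"
quals = [30, 32, 31, 30, 29, 28, 12, 10, 8, 5, 4, 2]
keep = window_trim_end(quals)
trimmed_seq, trimmed_quals = seq[:keep], quals[:keep]
```

Note that clipping at the start of the failing window also discards a
couple of good bases just before the quality drop. Whether that is
acceptable, or whether the cut point should be refined within the
window, is exactly the sort of choice that will depend on the
downstream tool.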

See also:
http://lists.open-bio.org/pipermail/emboss/2009-December/003788.html

> I'm working on a software database for NGS tools here:
>
> http://seqwiki.com
>
> (It's still quite beta, and at some point it may move to http://bifx.org/wiki)

Currently it points at http://seqanswers.com/wiki/SEQanswers
which is perhaps a good idea given the good reputation of
seqanswers.

Peter C.


