[Biopython] sff into fasta and qual then trim
Kiss, Csaba
csaba.kiss at lanl.gov
Tue Oct 23 16:47:23 UTC 2012
Hi Christopher!
I am writing a Python script to analyze antibody sequences. I have been using mothur to convert the sff files to fasta and then trim the sequences for quality.
For the end users' sake, it would be easier if all they needed to install was Python, so they could bypass mothur. I had been happy with mothur until I tried it on my desktop computer, where it took 3 hours to convert 3 million reads from sff to fasta. I hoped that pure Python would be faster.
I will look at PyCogent and QIIME.
Thanks
Csaba
-----Original Message-----
From: Christopher Friedline [mailto:cfriedline at mymail.vcu.edu] On Behalf Of Chris Friedline
Sent: Tuesday, October 23, 2012 10:39 AM
To: Kiss, Csaba
Cc: biopython at lists.open-bio.org
Subject: Re: [Biopython] sff into fasta and qual then trim
Are you trying to replace an entire analysis pipeline, which mothur provides, or simply take control of the read trimming routines? Mothur has been excellent for us (though I do supplement with my own code frequently), and I have a hard time believing that Biopython (or Python, in general) would be faster for these types of things. If you are married to Python, you may want to join in with the QIIME people, though they back their stuff with PyCogent rather than Biopython. Both are excellent packages for automating some parts of the analysis in microbial community studies. We can leave the philosophy of pipelining scientific research for another thread. ;-)
I wonder whether reimplementing common trimming/filtering tasks is worth your time, given the current maturity of both mothur and QIIME.
On Oct 23, 2012, at 12:04 PM, "Kiss, Csaba" <csaba.kiss at lanl.gov> wrote:
> I am new to Biopython. I am trying to replace mothur with Biopython.
> I hope that Biopython is faster than mothur. All I want to do is this:
>
> sffinfo(sff=sd11.fasta)
> trim.seqs(fasta=sd11.fasta, qfile=sd11.qual, minlength = 50,
> maxhomop=8, qwindowsize=50, qwindowaverage =22)
>
> Can someone help me to translate the two mothur statements above to biopython, please?
> It would be greatly appreciated.
> thanks
>
>
> --
> Best Regards:
> Csaba Kiss PhD, MSc, BSc
> TA-43, HRL-1, MS888
> Los Alamos National Laboratory
> Work: 1-505-667-9898
> Cell: 1-505-920-5774
>
>
PhD Candidate, Integrative Life Sciences Virginia Commonwealth University Richmond, VA
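For reference, the two mothur commands quoted above could be approximated in Biopython along the following lines. This is only a sketch under several assumptions. The function names and file names are illustrative. The rule of cutting a read at the start of the first window whose average quality falls below the threshold is a simplification and is not taken from mothur's source, so results may not match trim.seqs base for base. Bio.SeqIO's "sff-trim" format applies the clip positions stored in the SFF file; use "sff" instead to keep the full reads.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Return the length of the longest run of a single repeated base."""
    return max((len(list(run)) for _, run in groupby(str(seq).upper())), default=0)


def window_trim_length(quals, window=50, min_average=22):
    """Return how many leading bases to keep.

    A window slides one base at a time; the read is cut at the start of the
    first window whose mean quality is below min_average (an assumed rule,
    only approximating qwindowsize/qwindowaverage in mothur).
    """
    n = len(quals)
    if n == 0:
        return 0
    size = min(window, n)  # reads shorter than the window get one window
    total = sum(quals[:size])
    for start in range(n - size + 1):
        if start > 0:
            # Slide the window: add the new base, drop the one left behind.
            total += quals[start + size - 1] - quals[start - 1]
        if total / size < min_average:
            return start
    return n


def sff_to_trimmed(sff_path, fasta_path, qual_path,
                   min_length=50, max_homop=8, window=50, min_average=22):
    """Write quality-trimmed reads to FASTA and QUAL; return the number kept."""
    # Imported here so the helpers above can be used without Biopython.
    from Bio import SeqIO

    kept = 0
    with open(sff_path, "rb") as sff, \
            open(fasta_path, "w") as fasta, \
            open(qual_path, "w") as qual:
        for record in SeqIO.parse(sff, "sff-trim"):
            scores = record.letter_annotations["phred_quality"]
            # Slicing a SeqRecord also slices its per-letter qualities.
            record = record[:window_trim_length(scores, window, min_average)]
            if len(record) < min_length:
                continue  # roughly minlength=50
            if longest_homopolymer(record.seq) > max_homop:
                continue  # roughly maxhomop=8
            SeqIO.write(record, fasta, "fasta")
            SeqIO.write(record, qual, "qual")
            kept += 1
    return kept
```

Calling sff_to_trimmed("sd11.sff", "sd11.fasta", "sd11.qual") (file names assumed) would write the surviving reads and return how many were kept. Whether this is faster than mothur on 3 million reads has not been measured here.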