[Biojava-l] DNA assembly

Khalil El Mazouari khalil.elmazouari at gmail.com
Wed Apr 24 14:29:24 UTC 2013


Hi Chris,

My application is deployed as a WAR file. I am trying to avoid, as much as possible, shelling out to non-Java programs, for maintainability reasons.

I don't think I need a 'full' genome assembly tool (e.g. Velvet); that would be overkill for my case. The cloned gene is sequenced in both directions. Normally one strand is sufficient. If the sequence quality is not good enough, the two strands are combined to get the full-length gene. There is always a large overlap between the two strand sequences.
I can QC the full-length gene.
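
To make the two-read case concrete, here is a minimal plain-Java sketch of the idea. It does not use any BioJava calls. The class name, method names, and the minimum-overlap and mismatch thresholds are hypothetical choices for illustration, and it assumes the overlap contains no insertions or deletions (a real pipeline would probably want a proper alignment of the overlap region instead).

```java
public class OverlapMerge {

    /** Reverse-complements a DNA string; any character other than A/C/G/T becomes N. */
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (Character.toUpperCase(dna.charAt(i))) {
                case 'A': sb.append('T'); break;
                case 'T': sb.append('A'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                default:  sb.append('N'); break;
            }
        }
        return sb.toString();
    }

    /**
     * Merges two reads given on the same strand, where a suffix of 'left'
     * overlaps a prefix of 'right'. Overlap lengths are tried from longest to
     * shortest, and the first one with at most maxMismatches differing bases
     * is accepted. In the overlap, the bases of 'left' are kept.
     * Returns null if no overlap of at least minOverlap bases qualifies.
     */
    static String merge(String left, String right, int minOverlap, int maxMismatches) {
        String l = left.toUpperCase();
        String r = right.toUpperCase();
        int longest = Math.min(l.length(), r.length());
        for (int len = longest; len >= minOverlap; len--) {
            int offset = l.length() - len;
            int mismatches = 0;
            for (int i = 0; i < len && mismatches <= maxMismatches; i++) {
                if (l.charAt(offset + i) != r.charAt(i)) {
                    mismatches++;
                }
            }
            if (mismatches <= maxMismatches) {
                return l + r.substring(len);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up example: a short "gene", a forward read covering its start,
        // and a reverse read covering its end on the opposite strand.
        String gene = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG";
        String forward = gene.substring(0, 25);
        String reverse = reverseComplement(gene.substring(15));

        // Bring the reverse read onto the forward strand, then merge.
        String merged = merge(forward, reverseComplement(reverse), 8, 0);
        System.out.println(merged);
        System.out.println(gene.equals(merged));
    }
}
```

The minimum overlap guards against short chance matches, and the mismatch limit could be raised to tolerate a few low-quality base calls at the read ends.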

Best

khalil







-----

Confidentiality Notice: This e-mail and any files transmitted with it are private and confidential and are solely for the use of the addressee. It may contain material which is legally privileged. If you are not the addressee or the person responsible for delivering to the addressee, please notify that you have received this e-mail in error and that any use of it is strictly prohibited. It would be helpful if you could notify the author by replying to it.



On 24 Apr 2013, at 16:04, Chris Friedline wrote:

> Khalil,
> 
> Why not just shell out to programs designed for this purpose and pull in the results?  We are in the process of publishing a paper which uses PANDAseq to assemble overlapping paired-end (PE) reads.  The latest version of mothur also does this.
> 
> www.mothur.org
> https://github.com/neufeld/pandaseq/wiki/PANDAseq-Assembler
> 
> PANDAseq is particularly nice in this case, because you could read right from stderr and stdout streams.  It's also wicked fast.
> 
> Chris
> 
> On Apr 24, 2013, at 4:08 AM, Khalil El Mazouari <khalil.elmazouari at gmail.com> wrote:
> 
>> Hi,
>> 
>> It's neither a global sequence alignment nor a genome assembly. It's just a DNA fragment sequenced from both ends with an overlapping region. I want to assemble the two reads in order to get the full-length sequence. This assembly is part of a complex analysis process that uses BioJava.
>> I agree, there are a lot of simple options for achieving this, but I need something in Java/BioJava.
>> 
>> Best
>> 
>> khalil
>> 
>> 
>> On 23 Apr 2013, at 23:38, Spencer Bliven wrote:
>> 
>>> If you just have two contiguous sequences to align, you should just use a global sequence alignment. See http://biojava.org/wiki/BioJava:CookBook3:PSA for how to do this in BioJava, or it might be easier to just use one of the online services for this such as http://www.ebi.ac.uk/Tools/psa/.
>>> 
>>> On the other hand, if you actually want to do genome assembly (i.e. from many overlapping reads), then there are much more computationally efficient methods. BioJava isn't really intended for large-scale genome assembly, so you'd want to use a sequence assembly tool (e.g. Velvet).
>>> 
>>> -Spencer
>>> 
>>> 
>>> On Tue, Apr 23, 2013 at 12:38 PM, Khalil El Mazouari <khalil.elmazouari at gmail.com> wrote:
>>> Hi,
>>> 
>>> I would like to assemble two overlapping DNA sequences. Is there something in BioJava that may help with this task?
>>> 
>>> Thanks
>>> 
>>> _______________________________________________
>>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>> 
>> 




