[Biojava-l] Biojava-l Digest, Vol 105, Issue 12
Khalil El Mazouari
khalil.elmazouari at gmail.com
Wed Oct 19 18:36:28 UTC 2011
Hi Hannes,
I just ran an MSA test with 521 sequences and it works. It must be a memory issue.
Try something like: java -Xmx1g -jar yourApp.jar args...
If you don't have enough RAM, try 500m, as suggested by Andreas.
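
To confirm that the -Xmx value actually reached your program, here is a minimal sketch that prints the heap limit the JVM reports. HeapCheck is a name made up for this example, not part of BioJava; the only library call is the standard Runtime.maxMemory().

```java
// Hypothetical helper, not part of BioJava: reports the heap ceiling of the running JVM.
final class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in MiB (roughly the -Xmx value). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Run it with the same -Xmx flag you pass to your application; the reported number is usually a little below the requested value.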
Regards,
Khalil
On 19 Oct 2011, at 18:00, biojava-l-request at lists.open-bio.org wrote:
> Today's Topics:
>
> 1. Re: Multiple Sequence Alignment - Limits? (Andreas Prlic)
> 2. Re: Multiple Sequence Alignment - Limits?
> (Hannes Brandstätter-Müller)
> 3. Re: Multiple Sequence Alignment - Limits?
> (Hannes Brandstätter-Müller)
> 4. Re: Multiple Sequence Alignment - Limits? (Spencer Bliven)
> 5. Status of org.biojava3.data.sequence.SequenceUtil ?
> (jvb at Cs.Nott.AC.UK)
> 6. Re: Status of org.biojava3.data.sequence.SequenceUtil ?
> (Peter Troshin)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 18 Oct 2011 14:01:05 -0700
> From: Andreas Prlic <andreas at sdsc.edu>
> Subject: Re: [Biojava-l] Multiple Sequence Alignment - Limits?
> To: Hannes Brandstätter-Müller <biojava at hannes.oib.com>
> Cc: biojava-l <biojava-l at lists.open-bio.org>
> Message-ID:
> <CALthepz+7+KO1jo5gfiYHtJ-27jojGaHje7D55-B0sHuaZdqYw at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi Hannes,
>
> did you try to increase memory settings for your JVM? e.g. -Xmx500M
>
> Andreas
>
> On Tue, Oct 18, 2011 at 2:46 AM, Hannes Brandstätter-Müller
> <biojava at hannes.oib.com> wrote:
>> On Tue, Oct 18, 2011 at 11:32, Hannes Brandstätter-Müller
>> <biojava at hannes.oib.com> wrote:
>>> Hi again!
>>>
>>> I am quite happy with the Multiple Sequence Alignment, but I noticed
>>> that there seems to be a limit of 132 Sequences that are present in
>>> the final alignment - is this some kind of hardcoded limit, or can I
>>> work around that somehow?
>>>
>>> Hannes
>>>
>>
>> Sorry, I counted that wrong. I had 132 lines, that is 66 sequences in
>> fasta format. Is there a way to work around that limit?
>>
>> Hannes
>>
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 19 Oct 2011 06:36:19 +0200
> From: Hannes Brandstätter-Müller <biojava at hannes.oib.com>
> Subject: Re: [Biojava-l] Multiple Sequence Alignment - Limits?
> To: Andreas Prlic <andreas at sdsc.edu>
> Cc: biojava-l <biojava-l at lists.open-bio.org>
> Message-ID:
> <CAPXi2mkBsBJhzfHKtgquytXPV=hF0TZNYAM8cfCcen=QfmAj+A at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi Andreas,
>
> I will try that later today to see whether it makes any difference. I ran
> a larger alignment batch overnight, and I noticed that this limit seems
> to have been a coincidence. HOWEVER, the alignment always contains fewer
> sequences than the input. Is this caused by memory constraints, or how
> can I influence that?
>
> Hannes
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 19 Oct 2011 09:32:25 +0200
> From: Hannes Brandstätter-Müller <biojava at hannes.oib.com>
> Subject: Re: [Biojava-l] Multiple Sequence Alignment - Limits?
> To: Spencer Bliven <sbliven at ucsd.edu>
> Cc: biojava-l <biojava-l at lists.open-bio.org>
> Message-ID:
> <CAPXi2mnzkvLasRWixinxyh=w=V6bjePXRPhTYGF54ac5dFN57w at mail.gmail.com>
> Content-Type: text/plain; charset=windows-1252
>
> I'm currently running another test, now with even more memory for java
> (500M) - it looks fine now so far. I'll re-check it later with the
> other files that gave me some problems, and will report back later
> today.
>
> I had an "out of heap" exception when I tried it with the default
> memory settings, and with 256M it seems to have swallowed some
> sequences - I'll re-check and help you reproduce. It would be really
> bad if the code swallowed sequences without error messages when
> running out of memory, so I'll make sure I have proof :D
>
> Hannes
>
> On Wed, Oct 19, 2011 at 09:22, Spencer Bliven <sbliven at ucsd.edu> wrote:
>> Hannes,
>>
>> There should not be a limit on the number of sequences, nor should you be
>> running into a memory problem. The FastaParser should be able to read
>> thousands of sequences, since it is used for genome FASTA files as well as
>> multiple alignments. My guess would be either a malformed FASTA file
>> (perhaps a problem with line endings?), or else a problem with the code to
>> generate the MultipleAlignment. Can you post some code snippets?
>>
>> -Spencer
>>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 19 Oct 2011 00:22:55 -0700
> From: Spencer Bliven <sbliven at ucsd.edu>
> Subject: Re: [Biojava-l] Multiple Sequence Alignment - Limits?
> To: Hannes Brandstätter-Müller <biojava at hannes.oib.com>
> Cc: biojava-l <biojava-l at lists.open-bio.org>
> Message-ID:
> <CA+P6arns8F9XeZj4UK0hEmvV_uOG5nJxoZxJ5fBQ=nDh-xuXmg at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Hannes,
>
> There should not be a limit on the number of sequences, nor should you be
> running into a memory problem. The FastaParser should be able to read
> thousands of sequences, since it is used for genome FASTA files as well as
> multiple alignments. My guess would be either a malformed FASTA file
> (perhaps a problem with line endings?), or else a problem with the code to
> generate the MultipleAlignment. Can you post some code snippets?
>
> -Spencer
>
>
>
>
> ------------------------------
>
> Message: 5
> Date: 19 Oct 2011 12:15:26 +0100
> From: jvb at Cs.Nott.AC.UK
> Subject: [Biojava-l] Status of org.biojava3.data.sequence.SequenceUtil
> ?
> To: biojava-l <biojava-l at lists.open-bio.org>
> Message-ID: <201110191215.aa17789 at pat.Cs.Nott.AC.UK>
> Content-Type: text/plain; format=flowed; charset=ISO-8859-1
>
> Hello,
>
> I can't find a jar containing org.biojava3.data.sequence.SequenceUtil, even
> though it appears in the JavaDocs:
> http://www.biojava.org/docs/api/org/biojava3/data/sequence/SequenceUtil.html
>
> What is its status? Can I get it, and should I rely on it if I can?
>
> Thanks,
>
> Jon
>
>
>
>
> ------------------------------
>
> Message: 6
> Date: Wed, 19 Oct 2011 16:13:44 +0100
> From: Peter Troshin <p.v.troshin at dundee.ac.uk>
> Subject: Re: [Biojava-l] Status of
> org.biojava3.data.sequence.SequenceUtil ?
> To: jvb at cs.nott.ac.uk
> Cc: biojava-l at lists.open-bio.org
> Message-ID: <4E9EE928.4050506 at dundee.ac.uk>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi Jon,
>
> This class is part of the protein disorder prediction JAR and is a recent
> addition to BioJava. You are welcome to use it if it suits your needs.
> Bear in mind, though, that the FASTA file reader in this class reads the
> content of the whole FASTA file at once, i.e. if you are working with
> large FASTA files you will want to use something else instead. I've got
> a stream-based FASTA reader if you need one and if there is not one in
> BioJava already.
> I would imagine the functionality of this class is not going to
> disappear overnight, but it may, and perhaps should, be merged with the
> other FASTA parsers in BioJava once somebody has time to do this.
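
A minimal sketch of the streaming idea, for illustration only. This is not Peter's reader and not a BioJava class; FastaStreamCount and countRecords are names made up here. It reads one line at a time instead of loading the whole file, and it counts the header lines, which also gives a quick way to compare the number of input sequences with the number in the alignment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example class, not part of BioJava.
final class FastaStreamCount {

    /** Counts FASTA records by streaming the input line by line. */
    static int countRecords(Reader in) {
        int records = 0;
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            // readLine() accepts \n, \r and \r\n, so mixed line endings are handled.
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(">")) {
                    records++; // each header line starts a new record
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        String fasta = ">seq1\nACGT\n>seq2\nGGCC\nTTAA\n";
        System.out.println(countRecords(new StringReader(fasta)));
    }
}
```

For a real file, pass a java.io.FileReader (or any other Reader) instead of the StringReader; only the current line is held in memory.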
>
> Regards,
> Peter
>
>
>
>
>
> ------------------------------
>
>
>
> End of Biojava-l Digest, Vol 105, Issue 12
> ******************************************