[Biopython] MUSCLE gap extend penalty

Joshua Meyers Joshua.Meyers at icr.ac.uk
Wed Mar 15 10:58:51 UTC 2017


Hi Peter,

Thanks for the reply. I have filed an issue on GitHub and will see if I can get round to the pull request.
In the short term, if anyone else wants to use this wrapper, the quick fix is to append the missing options manually to the command string:

    …..
    muscle_cline = MuscleCommandline(clwstrict=True, gapopen=-11.0, center=0.0)
    # The MUSCLE wrapper does not expose every option, so append the missing ones manually
    muscle_cline = '%s -gapextend -1.0 -matrix %s' % (str(muscle_cline), ncbi_matrix_path)
    child = subprocess.Popen(str(muscle_cline),
    …..
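
The same workaround can be wrapped in a small helper. This is only a sketch under a few assumptions: `append_options` is a hypothetical name (not part of Biopython), the base command string is assumed to be roughly what `str(muscle_cline)` yields, and the matrix path is a placeholder. It uses `shlex.quote` so that a matrix path containing spaces survives the shell (POSIX shells only; Windows quoting differs).

```python
import shlex


def append_options(cmd, extra):
    """Append (name, value) option pairs to a command-line string.

    Hypothetical helper, not a Biopython function. A value of None
    means the option is a bare flag with no argument.
    """
    parts = [str(cmd)]
    for name, value in extra:
        parts.append("-%s" % name)
        if value is not None:
            # Quote the value so spaces or shell metacharacters are safe.
            parts.append(shlex.quote(str(value)))
    return " ".join(parts)


# Assumed stand-in for str(muscle_cline); the real string may differ.
base = "muscle -clwstrict -gapopen -11.0 -center 0.0"
# Placeholder path; substitute the real matrix file.
ncbi_matrix_path = "/path/to/BLOSUM62"

full = append_options(base, [("gapextend", -1.0), ("matrix", ncbi_matrix_path)])
print(full)
```

The resulting string can then be passed to `subprocess.Popen` exactly as in the snippet above.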

Cheers,

Josh




> On 14 Mar 2017, at 22:24, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> 
> Hi Joshua
> 
> You're exactly right - this isn't in the Biopython wrapper (yet), only
> the gapopen option is there. Perhaps gapextend was new in Muscle 3.8?
> 
> Could you have a look at the code for the wrapper and see if you think
> you could open a pull request to add this?
> https://github.com/biopython/biopython/blob/master/Bio/Align/Applications/_Muscle.py
> 
> In any case, please file an issue on Github about this:
> https://github.com/biopython/biopython/issues
> 
> Thanks,
> 
> Peter
> 
> On Tue, Mar 14, 2017 at 1:09 PM, Joshua Meyers <Joshua.Meyers at icr.ac.uk> wrote:
>> Hi All,
>> 
>> I am using the MUSCLE command line wrapper for pairwise sequence alignments
>> (using StringIO but that isn’t the issue here…).
>> It works nicely, until I try to specify a gapextend penalty. It seems that
>> this Option was neglected in the command line wrapper?
>> The Option does exist in the MUSCLE docs:
>> http://www.drive5.com/muscle/muscle_userguide3.8.html
>> 
>> records = [SeqRecord(full_ref_seq, id="ref"), SeqRecord(full_fit_seq, id="fit")]
>> muscle_cline = MuscleCommandline(clwstrict=True, matrix='blosum62',
>>                                  gapopen=-11.0, gapextend=-1.0, center=0.0)
>> child = subprocess.Popen(str(muscle_cline),
>>                         stdin=subprocess.PIPE,
>>                         stdout=subprocess.PIPE,
>>                         stderr=subprocess.PIPE,
>>                         universal_newlines=True,
>>                         shell=(sys.platform!="win32"))
>> SeqIO.write(records, child.stdin, "fasta")
>> child.stdin.close()
>> align = AlignIO.read(child.stdout, format="clustal")
>> 
>> ValueError: Option name gapextend was not found.
>> 
>> 
>> Upon inspecting the muscle_cline object, this kwarg is indeed absent (it
>> exists in the analogous clustal wrapper).
>> print dir(muscle_cline)
>> 
>> ['__call__', '__class__', '__delattr__', '__dict__', '__doc__',
>> '__format__', '__getattribute__', '__hash__', '__init__', '__module__',
>> '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
>> '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_check_value',
>> '_clear_parameter', '_get_parameter', '_validate', 'anchors',
>> 'anchorspacing', 'center', 'cluster1', 'cluster2', 'clw', 'clwout',
>> 'clwstrict', 'clwstrictout', 'core', 'diaglength', 'diagmargin', 'diags',
>> 'distance1', 'distance2', 'fasta', 'fastaout', 'gapopen', 'group', 'html',
>> 'htmlout', 'hydro', 'hydrofactor', 'in1', 'in2', 'input', 'le', 'log',
>> 'loga', 'maxdiagbreak', 'maxhours', 'maxiters', 'maxtrees',
>> 'minbestcolscore', 'minsmoothscore', 'msf', 'msfout', 'noanchors', 'nocore',
>> 'objscore', 'out', 'parameters', 'phyi', 'phyiout', 'phys', 'physout',
>> 'profile', 'program_name', 'quiet', 'refine', 'root1', 'root2', 'seqtype',
>> 'set_parameter', 'smoothscoreceil', 'smoothwindow', 'sp', 'spn', 'stable',
>> 'sueff', 'sv', 'tree1', 'tree2', 'verbose', 'version', 'weight1', 'weight2']
>> 
>> 
>> 
>> Is there another way around this? Any help would be much appreciated.
>> 
>> Thanks in advance,
>> 
>> Josh
>> 
>> The Institute of Cancer Research: Royal Cancer Hospital, a charitable
>> Company Limited by Guarantee, Registered in England under Company No. 534147
>> with its Registered Office at 123 Old Brompton Road, London SW7 3RP.
>> 
>> This e-mail message is confidential and for use by the addressee only. If
>> the message is received by anyone other than the addressee, please return
>> the message to the sender by replying to it and then delete the message from
>> your computer and network.
>> 
>> _______________________________________________
>> Biopython mailing list  -  Biopython at mailman.open-bio.org
>> http://mailman.open-bio.org/mailman/listinfo/biopython




