[EMBOSS] Oddcomp behaves oddly ...

Marc Logghe Marc.Logghe at DEVGEN.com
Wed Mar 8 11:36:06 UTC 2006


Hi David,
I am afraid there are still some oddities with oddcomp.
I tried another protein and another residue, using this composition file:
<file compseq.data>
Word size       1
Total count     0

S       4
</file compseq.data>

First, a set of overlapping sequences of length 20 is generated (to
mimic a sliding window):
splitter wormpep:ZK822.4 -size 20 -overlap 19 > split.fa
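
For reference, the windows this is meant to produce can be sketched in
Python. This only illustrates the sliding-window idea and is not how
splitter is implemented. The function name sliding_windows and the
1-based start-end naming are assumptions, modelled on the IDs in the
oddcomp output, and the toy sequence is made up.

```python
def sliding_windows(seq, size=20, step=1):
    """Yield (start, end, subsequence) with 1-based inclusive coordinates.

    Hypothetical helper: mimics the intent of -size 20 -overlap 19,
    i.e. each window starts one residue after the previous one.
    Only full-length windows are produced here.
    """
    for i in range(0, len(seq) - size + 1, step):
        yield i + 1, i + size, seq[i:i + size]


# Toy sequence for illustration only, not the real ZK822.4 protein.
toy = "ACDEFGHIKLMNPQRSTVWYAC"
for start, end, sub in sliding_windows(toy):
    print(f"toy_{start}-{end} {sub}")
```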

Second, oddcomp is run (with the window one shorter than the sequence length):
oddcomp split.fa -window 19 -infile compseq.data 
#
# Output from 'oddcomp'
#
# The Expected frequencies are taken from the file: compseq.data
#
#       Word size: 1
        ZK822.4_36-55
        ZK822.4_37-56
        ZK822.4_38-57
        ZK822.4_39-58
        ZK822.4_40-59
        ZK822.4_41-60

#       END     #

The first 20mer:
>ZK822.4_36-55
SAGSSGSNFLSGLQNSSFGQ

It is clear that there are 7 S residues in this stretch and we were
looking for 4 or more, so that makes sense.
However, when you run oddcomp again with an S count of 5 instead of 4,
no sequence is reported!
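
As a sanity check independent of oddcomp, the S counts can be verified
with a few lines of Python. This only tests the expectation "at least N
residues of a given type within a window". It makes no claim about how
oddcomp counts internally, and the helper name max_count_in_windows is
made up for this example.

```python
def max_count_in_windows(seq, residue, window):
    """Return the highest count of `residue` in any window of length `window`."""
    if window > len(seq):
        return 0
    return max(seq[i:i + window].count(residue)
               for i in range(len(seq) - window + 1))


first_20mer = "SAGSSGSNFLSGLQNSSFGQ"  # ZK822.4_36-55, from the output above
print(first_20mer.count("S"))                      # S in the whole 20mer
print(max_count_in_windows(first_20mer, "S", 19))  # best 19-residue window
```

Both calls print 7, so under this reading the first 20mer should still
be reported with a threshold of 5.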
Cheers,
Marc



> -----Original Message-----
> From: David Martin [mailto:david at compbio.dundee.ac.uk] 
> Sent: Wednesday, March 08, 2006 11:26 AM
> To: Marc Logghe; emboss at emboss.open-bio.org
> Subject: Re: [EMBOSS] Oddcomp behaves oddly ...
> 
> On 8/3/06 9:00 am, "Marc Logghe" <Marc.Logghe at devgen.com> wrote:
> 
> > ... Or rather, how should I use it properly ?
> > 
> > OK, suppose you run compseq to obtain the frequency for individual
> > residues:
> > compseq tsw:Q62671 -word 1
> > Apparently this example protein sequence is rather rich in leucine 
> > (106 L out of 889).
> > 
> > In order to detect this leucine bias, a little file was created
> > (leu.comp) that had the following content:
> > <file leu.comp>
> > Word size       1
> > Total count     0
> > 
> > # bias should be detected as 106 > 100
> > L       100
> > </file leu.comp>
> > 
> > Oddcomp was run like this:
> > oddcomp tsw:Q62671 -infile leu.comp -window 889
> 
> Try window 888 (ie shorter than the length of the sequence). 
> There are a couple of minor bugs in the oddcomp code that I 
> will forward to the team.
> 
> Basically what is happening is that there is a check for the 
> length of the sequence being shorter than the window.  It may 
> well be this that is giving the problem. 
> 
> It is a long time since I wrote this and C is not my usual 
> language so apologies if this is not a comprehensive answer.
> 
> ..d
> 
> > 
> > But the sequence is not reported.
> > When I change the L count to 10 in leu.comp it does not work either.
> > Strangely enough, when the default window is taken (30) the sequence
> > is reported.
> > What is happening here ?
> > 
> > Regards,
> > Marc
> > 
> 
> 
> 



