New program: makeseq
Henrikki Almusa
henrikki.almusa at helsinki.fi
Mon Sep 27 08:33:16 UTC 2004
On Friday 24 September 2004 18:02, Dr J.C. Ison wrote:
> > 1. Acd handling
> The cleanest way to do this is to use a "Toggle" ACD data item.
> e.g.
<snip example>
OK, this part seems to work. I added a toggle for both the data file and the
insert info.
> You don't need to prompt the user for sequence type though, because
> "sequence" data items have attributes:
>
> sequence: sequence
> [
> parameter: "Y"
> type: protein
> ]
>
> sequence.begin (start residue, i.e. -sbegin value)
<snip sequence infolist>
>
> You access them in ACD by e.g. $(sequence.begin) etc.
> e.g. to ensure your insert isn't past the end of the sequence use
> maximum: $(sequence.end)
Well, I don't have a sequence anywhere in there. The problem also comes from
the fact that the data file can determine the type as well. The type is now
queried only if the data file is not given. And since the insert is counted
within the sequence length, the maximum place to start the insert is length -
insert.length. That calculation doesn't seem to work either, so I'm checking
that inside the code.
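A minimal sketch of the kind of in-code check described above, in plain C
rather than AJAX calls. The function names are hypothetical and not taken from
makeseq.c. It assumes 1-based start positions (the ACD "start" item has
minimum "1"), so the last valid start is length - insert.length + 1, which
matches the commented-out "maximum:" expression in the ACD file.

```c
#include <stddef.h>

/* Hypothetical helper: largest 1-based start position at which an insert
 * of insert_len residues still fits inside a sequence of seq_len residues.
 * Returns 0 when the insert is longer than the sequence. */
size_t max_insert_start(size_t seq_len, size_t insert_len)
{
    if (insert_len == 0)
        return seq_len;          /* no insert: any position is allowed */
    if (insert_len > seq_len)
        return 0;                /* insert cannot fit at all */
    return seq_len - insert_len + 1;
}

/* Hypothetical helper: returns 1 if the 1-based start position is valid
 * for the given insert, 0 otherwise. */
int insert_start_ok(size_t seq_len, size_t insert_len, size_t start)
{
    return start >= 1 && start <= max_insert_start(seq_len, insert_len);
}
```

For a 100-residue sequence and a 10-residue insert, the last accepted start
is 91, and anything beyond that is rejected.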
> > 2. Segfaults
> If you really can't fix it get back in touch and I can run it through
> Purify.
That would be nice. I honestly can't figure this one out. I checked that the
insert gets there (the insert's AjPStr can be printed with ajFmtPrint() before
the test).
> > 3. Uniformity
> I'm presuming the 10 and 40 are size of your two arrays. If you want
> to treat them as strings you have to leave space for your terminating
> NULL, so 41 and 11 would do it. All arbitrary limits really should be
> avoided though, use e.g.
>
> AjPStr seqCharProtPure=NULL;
> seqCharProtPure=ajStrNewC("ACDEFGHIKLMNPQRSTVWYacdefghiklmnpqrstvwy");
>
> and ajStrChar to return a single character from a string at a given
> position.
I use the length to tell me the size of the char array that exists. Then, when
creating a random sequence, I can just ask for a random number between 0 and
length to get a character for the sequence. Well, there is one abstraction
layer between that char array and the final one used in the randomised
selection, but that's because of cusp. There are no arbitrary limits as such.
The above char arrays are used in the makeseq_default_chars function.
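A rough sketch of the selection step discussed here, assuming plain C and the
standard rand() instead of the AJAX random number routines. The function name
fill_random_sequence is hypothetical and not taken from makeseq.c. The
alphabet length is taken from the string itself with strlen(), so no 40 or 10
has to be written by hand, and the valid indices run from 0 to length - 1.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pure protein alphabet, copied from the quoted suggestion above. */
const char seq_char_prot_pure[] = "ACDEFGHIKLMNPQRSTVWYacdefghiklmnpqrstvwy";

/* Hypothetical helper: fill out[0..len-1] with characters drawn at random
 * from alphabet and NUL-terminate the result. The caller must provide at
 * least len + 1 bytes in out. rand() % n is slightly biased, which is
 * acceptable for a sketch but not for careful statistics. */
void fill_random_sequence(char *out, size_t len, const char *alphabet)
{
    size_t n = strlen(alphabet);   /* alphabet size, no hard-coded limit */
    size_t i;

    assert(n > 0);                 /* an empty alphabet cannot be sampled */
    for (i = 0; i < len; i++)
        out[i] = alphabet[(size_t)rand() % n];
    out[len] = '\0';
}
```

Calling fill_random_sequence(buf, 20, seq_char_prot_pure) with a 21-byte
buffer gives a 20-character string made up only of characters from the
alphabet.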
> > Is there a way to use something more generic, so that if
> > emboss changes these things, they would be applied to this program as
> > well?
>
> There might (perhaps should!) be - Alan Bleasby
> (ableasby at rfcgr.mrc.ac.uk) is the best man to ask about that.
Ok. I'll put another post to emboss-dev later on this.
> I've attached the template I use for the DOMAINATRIX documentation, e.g.
>
> http://www.rfcgr.mrc.ac.uk/Software/EMBOSS/Apps/domainatrix/rocon.html
> With this template, I document stuff by hand. The only external program
> I use is "acdtable" to get the ACD stuff. This is slightly different
> from the format used for EMBOSS apps though.
So there is no script to run to get the basic info from the ACD file into an
HTML file. Then it's just the manual labour of copying and writing the HTML
file :).
> Hope this helps and thanks for the interest
>
> Cheers
>
> Jon
Thanks for the help. I have attached the new versions of the .c and .acd files.
--
Henrikki Almusa
-------------- next part --------------
A non-text attachment was scrubbed...
Name: makeseq.c
Type: text/x-csrc
Size: 8999 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/emboss-dev/attachments/20040927/eea2d167/attachment-0001.bin>
-------------- next part --------------
application: makeseq [
documentation: "Creates random sequences"
groups: "Edit"
]
section: required [
information: "Required section"
type: "page"
]
integer: amount [
standard: "Y"
default: "100"
minimum: "1"
information: "Number of sequences"
]
integer: length [
standard: "Y"
default: "100"
minimum: "1"
information: "Length of single sequence"
]
toggle: useinsert [
standard: "Y"
information: "Do you want to make an insert"
default: "N"
]
string: insert [
standard: "$(useinsert)"
information: "Inserted string"
help: "String that is inserted into sequence"
# nullok: "Y"
knowntype: "sequence"
]
integer: start [
standard: "$(useinsert)"
information: "Start point of inserted sequence"
minimum: "1"
default: "1"
# maximum: "@($(length) - @($(insert) ? $(insert.length)-1 : 0))"
]
toggle: usedata [
standard: "Y"
information: "Do you want to use distribution file"
default: "N"
]
endsection: required
section: input [
information: "Input section"
type: "page"
]
infile: data [
standard: "$(usedata)"
information: "Distribution file"
help: "This file should be a pepstats output file to create protein
sequences, or a cusp output file to create nucleotide sequences.
Nucleotide sequences will be created as triplets, with the end trimmed
to the correct length."
nullok: "Y"
]
endsection: input
section: advanced [
information: "Advanced section"
type: "page"
]
boolean: protein [
standard: "@($(usedata) ? N : Y)"
default: "N"
information: "Make protein sequences"
]
endsection: advanced
section: output [
information: "Output section"
type: "page"
]
seqoutall: outseq [
parameter: "Y"
type: "any"
name: "makeseq"
]
endsection: output