[Bioperl-l] Request for advice and pointers on a project to
help biologists do simple formatting and analysis
Todd Harris
harris at cshl.edu
Tue Mar 8 13:29:17 EST 2005
Hi Amir -
I like this idea. You could also have the scripts process @ARGV so no
hand-editing would be necessary. You might even just make the scripts
executable droplets which would be even easier to use.
Todd
> On 3/8/05 11:20 AM, Stefan Kirov wrote:
> I like this idea a lot.
> First my answer to your first 2 questions: no, no.
> But I bet many biologists would scream in pain just hearing the word
> console (as you mentioned). So I offer a step 0 (bait to learn a little UNIX).
> Imagine a simple web form that is hooked to the perl interpreter (might
> be tricky from a security point, still it could be restricted in several
> ways) and does (amazingly) what the biologist types in. This would have
> to include file uploads/downloads as well. Of course the capabilities
> will be quite restricted, but the appetite comes with eating as some say
> and suddenly the console might not be a bad idea (thus Mac shares would
> go up :-) ).
>
> Amir Karger wrote:
>
>> Hi.
>>
>> I've gotten the impression - in my short time in bioinformatics - that
>> biologists get very frustrated with data formatting and analysis tasks.
>> Which is too bad, because many of these tasks are trivial for someone with a
>> bit of Perl knowledge. Then again, we can't force them to learn Perl, even
>> if it would be For Their Own Good.
>>
>> I was thinking it would be useful to have a toolkit of outrageously simple
>> Perl one-liners. Here's one:
>>
>> # Merge two lists, removing duplicates (logical OR)
>> perl -ne '$seen{$_}++; END {print keys %seen}' file1 file2 > outfile
>>
>> A biologist (call her Sue) would look through a website containing a bunch
>> of (searchable, categorized, etc.) scripts, cut & paste the Perl into Unix
>> (from a website), then backspace over the filenames and type in their own
>> filenames, and end up with something like this on the command line:
>>
>> myhost>perl -ne '$seen{$_}++; END {print keys %seen}' genes1 genes2 > all_genes
>>
>> The biologist hits return & voilà! Instant data munging!
>>
>> Of course, I'm not the first one to identify this problem or try to solve
>> it. But I think I'm working on a slightly different problem than previous
>> solutions, and my (complete lack of) interface is different too. Here's the
>> "prior art" I've seen in this area, compared and contrasted with my idea