[Bioperl-l] Request for advice and pointers on a project to help biologists do simple formatting and analysis

Todd Harris harris at cshl.edu
Tue Mar 8 13:29:17 EST 2005


Hi Amir - 

I like this idea.  You could also have the scripts process @ARGV so no
hand-editing would be necessary.   You might even just make the scripts
executable droplets which would be even easier to use.

Todd

> On 3/8/05 11:20 AM, Stefan Kirov wrote:

> I like this idea a lot.
> First my answer to your first 2 questions: no, no.
> But I bet many biologists would scream in pain just hearing the word
> console (as you mentioned). So I offer a step 0 (bait to learn a little UNIX).
> Imagine a simple web form that is hooked to the perl interpreter (might
> be tricky from a security point, still it could be restricted in several
> ways) and does (amazingly) what the biologist types in. This would have
> to include file uploads/downloads as well. Of course the capabilities
> will be quite restricted, but the appetite comes with eating as some say
> and suddenly the console might not be a bad idea (thus Mac shares would
> go up :-) ).
> 
> Amir Karger wrote:
> 
>> Hi.
>> 
>> I've gotten the impression - in my short time in bioinformatics - that
>> biologists get very frustrated with data formatting and analysis tasks.
>> Which is too bad, because many of these tasks are trivial for someone with a
>> bit of Perl knowledge. Then again, we can't force them to learn Perl, even
>> if it would be For Their Own Good.
>> 
>> I was thinking it would be useful to have a toolkit of outrageously simple
>> Perl one-liners.  Here's one:
>> 
>>    # Merge two lists, removing duplicates (logical OR)
>>    perl -ne '$seen{$_}++; END {print keys %seen}' file1 file2 > outfile
>> 
>> A biologist (call her Sue) would look through a website containing a bunch
>> of (searchable, categorized, etc.) scripts, copy & paste the Perl into a
>> Unix shell, then backspace over the filenames and type in her own
>> filenames, and end up with something like this on the command line:
>> 
>> myhost>perl -ne '$seen{$_}++; END {print keys %seen}' genes1 genes2 > all_genes
>> 
>> The biologist hits return & voilà! Instant data munging!
>> 
>> Of course, I'm not the first one to identify this problem or try to solve
>> it.  But I think I'm working on a slightly different problem than previous
>> solutions, and my (complete lack of) interface is different too.  Here's the
>> "prior art" I've seen in this area, compared and contrasted with my idea

