[Bioperl-l] Job opening at Genentech [SSF, CA].

Wed Aug 5 16:16:04 UTC 2009

I have an opening in my group in the Bioinformatics department at
Genentech [South San Francisco, CA].  At the moment (for the next year
or so) our main focus is rebuilding and extending a system for
collecting, processing, and disseminating information about mutations
and variations (think web interfaces, relational databases,
alignments, workflows/pipelines).  In the future we'll pick up
projects related to next-gen sequencing (Me too!!!  In the future,
what isn't related to next-gen?), data integration, and/or
lab-specific projects.

First and foremost I'm looking for someone who's sharp and who enjoys
computers, biology, and technology; someone who gets excited about
picking up new tools but who also has a sense of responsibility and
restraint.

I'm looking for someone who's familiar with several languages and
tools; modern Perl complemented with C is my first choice these days,
supplemented with R and (when necessary) anything from the rest of the
programming language bestiary.  There's a fair amount of Java flying
around here too so familiarity with it and the JVM world will help.
Relational databases are part of the picture: Oracle for the big
stuff; SQLite, PostgreSQL, and MySQL play niche roles.  I generally
interact with them via ORMs; lately it's been Rose::DB::Object on the
Perl side, though I've been convinced to take another look at
DBIx::Class.  Most of my web apps use CGI::Application, running under
FastCGI, mod_perl, or as simple CGI scripts, but (as with ORMs) I may
take another look at Catalyst.

I'm looking for someone who's interested in building real software.
We'll be putting together a set of tools and data that need to hang
together and evolve for at least 4-5 years.  Deploy and run won't cut
it.  Requirements will change, so it's important to me that we build
things so they're as modular and flexible as possible.  Testing,
source control, and documentation matter.

A strong candidate will have an understanding of basic bioinformatics
concepts and the ability to pick up new biology and computer science
concepts as necessary.

At the junior end of the spectrum I'd expect a bachelor's degree + 3
years of experience; at the upper end, a master's + 5 years (or a
PhD interested in moving towards the production side of the house).

I can imagine running through one or more detail-oriented interview
questions that drill down (or take off on a tangent) from the
following:

  - What's the difference between Smith-Waterman, BLAST, sim4, GMAP,
    and/or Bowtie alignment algorithms or tools?  Which would you use
    when, and why?

  - Why is Moose better than Class::Accessor?  (Yes, it's
    Perl-centered, but it could spin out into any language [e.g. why
    is Java better than Perl?]).  What's a MOP?  Who cares?

  - CVS, Subversion, Git, Mercurial.  You've already picked one?
    Which one?  Why?  Why not?

  - XML or JSON or YAML.  Pick one for moving data back and forth in
    an Ajax-based interface.  Why?  Would it also work well in other
    contexts?

  - How would you store information about positional features on a
    genome so that you could get fast random access?  How would your
    solution tie into a larger data context?

Genentech's a great place to work: solid salaries, great benefits, Bay
Area location (who could ask for more?).  We're open source friendly
and, with the arrival of Robert Gentleman (our new Director, of
Bioconductor/R fame), likely to become more so.  The recent Roche
acquisition hasn't changed life much; it seems to mostly be a source
of opportunities for those of us in Research.

If you know anyone who fits the bill, have them drop me a note.

Thanks!

g.