Archive for December 2009
My last post linked to a PLoS paper by Dudley and Butte on developing effective bioinformatics programming skills. I asked, “How many hours do the authors think are needed to acquire these skills?” In response, Atul Butte said, “I think the ideal scenario is when one’s research projects enable one to learn these skills, so that these skills get learned in a practical way outside the classroom too, while doing science,” while Luis Pedro Coelho asked, “Does it matter over the long (or even medium) term? Isn’t improving your skills even if you aren’t being immediately productive what school is for?”
To which I can only respond, “Yeah, but that doesn’t work.” People have been doing computational science for almost seventy years, and have been calling it the third branch of science since (at least) the mid-1980s. If picking things up by osmosis was going to work as an educational strategy, we’d know by now. Instead, what we actually see hasn’t changed in 25 years: a small minority working wonders, and the vast majority not even knowing where they ought to start. We don’t expect grad students to pick up all the math and stats they need by osmosis, on their own, without any structured guidance, so why should we expect them to become proficient computationalists that way?
Via Titus Brown, a new PLoS paper titled “A Quick Guide for Developing Effective Bioinformatics Programming Skills” by Joel Dudley and Atul Butte. Their recommendations are:
- Programming languages
- Embracing open source
- Unix command-line skills
- Keeping projects documented and manageable
- Preserving source code with version control
- Embracing parallel computing paradigms
- Structuring data for speed and scalability
- Understanding the capabilities of hardware
- Embracing standards and interoperability
- Put a high value on your time
I think all these things matter, but:
- How many hours do the authors think are needed to acquire these skills? We’ve tried very hard to fit Software Carpentry into 25 hours of lecture and 50-100 hours of practical work because we recognize that every one of those hours is time students aren’t spending doing science.
- Shouldn’t testing be in the top 10? Or the top 5, or 3? These days, I care a lot more about how (and how well) someone tests than I do about their mastery of any particular programming language.
I’d be interested in hearing from anyone who has enough direct experience of the following NSF programs to know whether they might be willing to support Software Carpentry:
Nicola Scafetta is refusing to release the software on which he bases his claims that the sun is responsible for much of terrestrial warming during the last century. I obviously think that scientists should be required to do this as a condition of publication; coming as this does on the heels of Climategate, it will be interesting to see if journals finally start pushing in that direction. It also highlights the need to add more material to this course to cover packaging for release and data provenance.
Good post from Steve Easterbrook on why open-sourcing climate models probably wouldn’t make a difference.