Software Carpentry

Helping scientists make better software since 1997

Feedback and Boundaries

Thanks to the initiative of Dominique Vuvan (who took Software Carpentry last summer), we ran a semi-formal version of the course from last November through to this past week for grad students in Psychology, Linguistics, and a few other disciplines at the University of Toronto. Weekly tutorials were offered in both Python and MATLAB by graduate teaching assistants from Computer Science, covering roughly half of the existing material.

Three of the students and five of the TAs spent an hour with me on Thursday discussing what went right and what went wrong. The chance to work through their own problems with some expert assistance was at the top of the former list for all of the students: they all felt that they learned most by bringing their own “homework” to class and having the TAs show them how to tackle it. I’m pleased that this was so useful, but if the Version 4 lectures are recorded for online viewing, this aspect of the course will be lost. I asked whether being able to work with a TA remotely (via Skype and desktop sharing) would be an acceptable substitute for in-class hacking, but no one in the room had ever tried it. There’s a lot of research out there into virtual office hours and remote tutoring; I’d like to try a few experiments in May or June to see how well it might work for Software Carpentry.

The two things students liked least were the general disorganization of the course and the fact that a lot of the material felt like what we computer scientists thought they ought to know, rather than what they could see as being immediately useful. The disorganization reflects the grassroots nature of this round of the course, and the fact that it was our first time teaching in MATLAB. Next time around, we’ll use a more natural order for material in MATLAB, rather than sticking to the order that makes sense for Python but forces students to grapple with some of the more obscure features of MATLAB early on.

The “eat your vegetables” tone of the material is going to be much harder to deal with. Software Carpentry is meant to be a second course in computing, not an introduction to programming in general. As the last of the user profiles says:

This course is probably too advanced for [a novice], as it assumes familiarity with basic programming concepts like loops, conditionals, arrays, and functions. [They] should probably audit a first-year introduction to programming or find an intensive two-week summer school course before tackling this one.

The problem is that if we’d actually applied that rule last November, we would have turned away more than half of the students, most of whom would never have acquired those basic concepts. So, do we:

  1. ignore the problem and hope that these people will somehow pick up the basics on their own (despite the fact that most scientists never do), or
  2. broaden the course’s mission to include basic programming as well.

My phrasing makes my preference for the second option clear, but feature creep is the biggest risk this project faces. Teaching the basics of Python to people who already know a bit about programming takes 4-5 lectures out of the 25 budgeted; if they don’t know how to program, that figure probably triples, leaving only 10 lecture hours to cover a much-reduced subset of the planned material. On the other hand, sticking to the plan means condemning the majority of potential students to wander lost and frustrated through a bewildering maze of seemingly inconsistent behavior, and to hour upon wasted hour of heartbreaking frustration. (That was a bit melodramatic, but not necessarily inaccurate.)
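To make the arithmetic above concrete, here is a quick sketch in Python. The names (`TOTAL_LECTURES`, `lectures_left`, and so on) are mine and purely illustrative, and the tripling factor is the rough guess from the paragraph above, not a measurement.

```python
# Back-of-the-envelope lecture budget, using only the figures quoted above.
# All names here are illustrative; the multiplier is a guess, not data.
TOTAL_LECTURES = 25              # lectures budgeted for the course
BASICS_IF_EXPERIENCED = (4, 5)   # lectures on Python basics for people who can already program
NOVICE_MULTIPLIER = 3            # "that figure probably triples" for people who cannot


def lectures_left(basics, total=TOTAL_LECTURES):
    """Return how many lectures remain once `basics` lectures go to the basics."""
    return total - basics


for basics in BASICS_IF_EXPERIENCED:
    novice_basics = basics * NOVICE_MULTIPLIER
    print(
        f"basics: {basics} -> {novice_basics} lectures; "
        f"remaining: {lectures_left(basics)} -> {lectures_left(novice_basics)}"
    )
```

At the pessimistic end of the range (5 lectures tripling to 15), this gives the 10 remaining lectures mentioned above; at the optimistic end (4 tripling to 12), 13 remain.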

Another argument against option #2 is pacing. Software Carpentry has been run four times at the University of Toronto (twice as non-credit tutorials and twice as a regular for-credit course). Each time, wide variation in students’ prior experience levels meant that no matter how the material was paced, one third of the class would be bored or another third bewildered. On balance, therefore, I think Software Carpentry has to continue to assume a more advanced starting point than most of its potential audience currently has. If things go well, I hope we’ll be able to backfill with more accessible introductory material in a year’s time.


Written by Greg Wilson

2010/04/04 at 01:29

Posted in Toronto, Version 3

8 Responses

  1. I’m curious why you’re bothering with MATLAB. It’s closed-source and expensive, whereas a combination of Python (which is already part of the course I see) for running experiments and R for analyzing experiments/running simulations seems like a completely sufficient replacement. It honestly really annoys me to see people wasting resources on closed/paid software, particularly when that encourages departments/universities to spend money on site licenses that could have been put to better use elsewhere.

    Mike Lawrence

    2010/04/04 at 12:05

    • Two reasons:

      1. We want to support MATLAB for the same reason that we’re offering the course in English rather than Esperanto: many times more people already speak it, so the cost *to them* of using the course materials is less.

      2. There are many times more well-tested toolboxes and libraries available for MATLAB than for Python or R, which again lowers the cost to our intended users (less for them to rewrite). Even at $500/seat, a MATLAB license represents only 3-4 days of a graduate student’s time; I doubt anyone could write and validate a module for doing geomagnetic correction in that time.

      Someone once said that free software is only free if you think your time is worth nothing. If we ignore that, we’ll make less headway than we otherwise might. I personally think the strongest argument for open source in science is checkability, not cheapness; insisting that the software used to produce results be peer reviewed, just as a theorem proof would be, is likely to be more fruitful.

      Greg Wilson

      2010/04/04 at 14:13

      • While MATLAB is a more popular language generally, R/S+ is pretty much the lingua franca of statistics, so that’s where you’re going to find the most up-to-date/advanced/powerful statistics, you’re best off with R.

        Mike Lawrence

        2010/04/08 at 13:16

      • grammar correction to my reply to your reply…

        While MATLAB is a more popular language generally, R/S+ is pretty much the lingua franca of statistics. So if you want the most up-to-date/advanced/powerful statistics, you’re best off with R.

        Mike Lawrence

        2010/04/08 at 13:17

      • Oh, as an example, so far as I know R is the only language with an implementation (via the ggplot2 package) of Wilkinson’s (1999) grammar of graphics, which is an incredibly powerful rule-based approach to visualizing data. Wickham, the author of ggplot2, has even made sure that human visual psychophysics is taken into account (e.g., avoiding luminance changes when plotting different colors) when visualizing data.

        Mike Lawrence

        2010/04/08 at 13:51

  2. I have to agree that while the temptation is to move forward with option 2, in my opinion the work done in that area will take away substantially from the strength of the current course. As someone who has benefited greatly from Software Carpentry I would hate to see that happen. I would also say that, at least in Ecology, more and more scientists are picking up the basics on their own or in formal courses. There are now multiple classes at my institution that provide introductions to this level of material in an ecological context and I’ll be adding an actual Intro to Programming course for Biologists in the Fall (using your book). The really unique thing that Software Carpentry offers is the more advanced material and the focus on how to properly develop software.

    @Mike – as a regular beneficiary of MATLAB’s toolboxes (and as someone who wouldn’t have the time, or often the ability, to rewrite them myself) I can testify to the value of this resource.


    2010/04/04 at 15:26

  3. […] 2010/04/08 After Thursday’s post-mortem on the latest offering of Software Carpentry at the University of Toronto, I had a chance to talk further with Jon Pipitone, who was one of the […]

  4. I wonder if you could point people at some currently-existing tutorials, and say the equivalent of “You need to be _this_ tall to enter this course. If you’re not, here’s where you can grow.”?

    Blake Winton

    2010/04/08 at 02:43

