Archive for the ‘Lectures’ Category
…I have added a lecture on high performance computing to the revised outline for the course. Several people suggested it, and what’s the point of asking for feedback if I don’t listen?
I’ve posted updates to the revised course outline. In particular, I have:
- Moved testing earlier.
- Clarified intent in a couple of places.
- Made a list of things we’re leaving out.
As always, feedback would be welcome. I’d also be grateful for pointers to places that might fund this work: as I’ve found in the past, many people think the course is a good idea, but it doesn’t quite fit into their funding mandates 😦
I’m grateful to Lorin Hochstein for sending detailed feedback on my proposal to reorganize the course. His comments are below, with my replies and his counter-replies interspersed; more comments would be very welcome.
Content I think you could drop if you wanted to save time:
Read Data Directly From Hardware. I suspect that this would be relevant to only a small minority of your audience. Especially if you’re teaching the course mostly in Python, because this is the sort of thing you should really do in C.
Greg: Agreed; it’s mostly to motivate a discussion of binary data handling, which I guess isn’t that important to most people either.
Greg: Would a title change make it clearer? This is where I wanted to introduce whole-array manipulations (MATLAB-style operations), which I think many scientists do care about.
Lorin: Ah, I didn’t realize this was about MATLAB vectorization (I thought it was related to using an optimizing compiler to take advantage of SIMD instructions). You’re right, this is worth teaching. Back when I was a grad student, I was amazed at the orders of magnitude performance improvement you can get in MATLAB by getting rid of loops and recasting your problems as linear algebra operations. There was a grad student I knew at Boston University who was amazing at turning loops into matrix multiplications.
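To make the loop-to-linear-algebra rewrite mentioned above concrete, here is a minimal sketch. It uses Python with NumPy rather than MATLAB, which is an assumption made for illustration; the function names are hypothetical and are not taken from the course material. Both functions compute the same matrix-vector product, one with explicit loops and one as a single whole-array operation.

```python
import numpy as np


def matvec_loops(a, x):
    """Multiply matrix a by vector x one element at a time."""
    rows, cols = a.shape
    result = np.zeros(rows)
    for i in range(rows):
        for j in range(cols):
            result[i] += a[i, j] * x[j]
    return result


def matvec_vectorized(a, x):
    """Same computation expressed as one linear algebra operation."""
    return a @ x


# Illustrative data only: a seeded generator keeps the example repeatable.
rng = np.random.default_rng(0)
a = rng.random((200, 300))
x = rng.random(300)

# The two versions should agree to within floating-point rounding.
print(np.allclose(matvec_loops(a, x), matvec_vectorized(a, x)))
```

The vectorized version hands the inner loops to compiled library code, which is usually where the speedup comes from; how large it is depends on the problem size and the machine.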
Clean Up This Code. Great idea for a topic. I’m not sure “cyclomatic complexity” is really that important. I vaguely recall a paper that demonstrated that all complexity metrics correlated very closely with function size, so that “size” is really the most important complexity metric there is.
Greg: The paper is El Emam et al.’s “The Confounding Effect of Class Size on the Validity of Object-Oriented Metrics”, and yeah, that’s something I want to add to the lecture.
Test Some Software. I was surprised to see this so late in the curriculum. One of the hardest things I’ve found about unit testing is writing code so that it’s testable. I would have put it up earlier and used unit tests throughout the problems, which would also illustrate how to use unit tests in the different contexts (e.g., unit testing with image analysis). It would also be nice to see some SE testing concepts like category partition testing, code coverage, and fuzz testing.
Greg: I’ve tried that, but given most people’s instinctive aversion to testing, I found that I had to move it later so that I’d built up enough credibility that they’d listen to me 🙂 You’re right, though, I should move it earlier.
Lorin: I think that if you could do nothing else but reduce people’s aversion to testing, the course would still be worth it. 😉 An astounding development (to me, anyways) is how “cool” testing has become in the (agile) software engineering community, unit testing in particular. There are all sorts of testing tools and frameworks everywhere, and many TDD advocates. I don’t have a clue how to transfer this interest to the scientific community, though.
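As a rough illustration of what “writing code so that it’s testable” can look like, the sketch below keeps the computation in a small pure function and tests it with Python’s standard `unittest` module. The function `normalize` and its behaviour are hypothetical, chosen only for this example. The three tests loosely follow the category-partition idea: one test per kind of input (typical, all equal, empty).

```python
import unittest


def normalize(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    if not values:
        raise ValueError("cannot normalize an empty sequence")
    low, high = min(values), max(values)
    if low == high:
        # All values are equal, so there is no range to scale by.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


class TestNormalize(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(normalize([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_values(self):
        self.assertEqual(normalize([3, 3, 3]), [0.0, 0.0, 0.0])

    def test_empty_input(self):
        with self.assertRaises(ValueError):
            normalize([])
```

Saved in a file, these tests can be run with `python -m unittest`. Because `normalize` takes plain values and returns plain values, no files, plots, or hardware are needed to check it.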
Share Work With Colleagues. In the version control lecture, you note that “this lecture will use a GUI like SmartSVN so that students don’t need to know how to use a shell in order to use version control.” But don’t the students really need to learn how to use the shell to use many of their tools effectively? You have “Using the Unix Shell” as a topic in the course announcement, but I don’t see it show up as its own topic.
Greg: I’m planning to take the shell out—while I use it all the time, and think most power users do likewise, it didn’t make the cut when the number of lectures was restricted. (And it’s hard to convince someone who’s used to GUIs that the shell is worth learning: the payoff takes a long time to arrive…) If I cut binary data handling and/or vectorization, this is a strong candidate to go back in.
Lorin: That makes sense… It does take a long time before you’re more productive in the shell than the GUI. It’s a shame, though.
XML. You could probably drop XHTML safely. I don’t think it’s that popular in practice, and since most HTML out there is not valid XML, if they tried to use XML-based approaches to do HTML scraping, it would fail pretty quickly. (You really need something like Beautiful Soup to do HTML parsing, but I wouldn’t use that to teach XML!).
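The point about HTML not being valid XML can be seen in a few lines. This is a minimal sketch using only Python’s standard library: the lenient `html.parser` module stands in for Beautiful Soup (a substitution made to keep the example dependency-free), and the snippet and helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Ordinary HTML: the <br> tag is never closed, so this is not well-formed XML.
SNIPPET = "<p>First line<br>second line</p>"


def parses_as_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


class TagCollector(HTMLParser):
    """Record the name of every start tag the lenient HTML parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(text):
    """Return the start tags found in the text, in document order."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


print(parses_as_xml(SNIPPET))  # the strict XML parser rejects the unclosed <br>
print(start_tags(SNIPPET))     # the HTML parser still reports both tags
```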
Some of the topics I would call “paradigms”; these are going to be hard to fit into a single lecture, such as:
Object-Oriented Programming. I’m torn about this. It’s hard for me to imagine teaching the OOP concepts in a single lecture. I think the Liskov Substitution Principle could probably be dropped (how often does it really come up in practice?). I’m also a little fearful because inheritance tends to be overused in practice. I’d also drop the design patterns (I don’t think they’ll understand OO well enough to appreciate them at this point), and possibly even operator overloading.
Greg: I agree that it’s impossible, but everyone asks for it every time the course is taught.
Represent Information. This is a lot of concepts to squeeze into a lecture. If you were to prioritize this, I think database design (and ERD) are more important in practice than some of the UML stuff. RDF can be safely dropped.
Greg: Good points.
Build a Desktop User Interface. Event-driven programming is a big conceptual leap. I’d probably put state diagrams or statecharts in here. Plus, it’s always very tough to pick a GUI toolkit.
Greg: I was going to use Tkinter—yes, it’s broken, but if the main goal is to teach event-driven programming, it’ll get the idea across without students having to install anything else.
Lorin: Yeah, that sounds reasonable. Tkinter is nice and simple, and it’s a great example of the application of first-class functions. It’s too bad Python doesn’t come with a drag-and-drop GUI builder. When you’re starting out with GUI building, it’s hard to see the advantage of programmatically defining a GUI layout.
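To show how event-driven programming in Tkinter leans on first-class functions, here is a minimal sketch; the names `make_counter` and `main` are hypothetical and not from the course material. The button is given a function object to call later, and the event loop decides when that call happens. Tkinter is imported inside `main` so that the callback logic can be exercised without a display.

```python
def make_counter(show):
    """Build a click handler that counts calls and reports the total via show()."""
    count = 0

    def on_click():
        nonlocal count
        count += 1
        show(f"Clicks: {count}")

    return on_click


def main():
    """Open a small window with a label and a button (requires a display)."""
    import tkinter as tk

    root = tk.Tk()
    label = tk.Label(root, text="Clicks: 0")
    label.pack()
    # The handler is passed as a value; Tkinter calls it on each button press.
    handler = make_counter(lambda text: label.config(text=text))
    tk.Button(root, text="Click me", command=handler).pack()
    root.mainloop()
```

Calling `main()` opens the window. Nothing in the program calls `on_click` directly, which is the conceptual leap: the code registers what should happen and then hands control to the event loop.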
Maybe have some content about online resources: where to go to ask a question when you try to apply these and get stuck. StackOverflow, IRC channels, “How to ask questions the smart way”, pastebin.com/pastie.com, showmedo.com, etc. (This really wouldn’t be a full lecture, maybe just a web page on this?)
Personally, I’m bored to tears sitting in a lecture when there’s source code in the slides. I think your ultimate idea of having a self-paced web-based course is a good one. There’s lots of reference material out there on these concepts, but worked-out examples are rarer. I think the biggest challenge for someone trying these things will be when their personal problem diverges from the example problem in some way and they don’t know how to proceed.
Final question: Have you followed up on previous SC students to see what techniques/practices they adopt after attending the course?
Greg: I did once, but can’t use the data (long story); I’ll be following up with the students from this past July at Christmas to see what’s stuck and what hasn’t. Wish I’d been more systematic in the past, but 20/20 hindsight…
Yesterday afternoon, the students and ninety other guests were treated to six engaging talks about Science 2.0 from Titus Brown, Cameron Neylon, Victoria Stodden, David Rich, Michael Nielsen, and Jon Udell. We’ll post slides and video here as soon as we get them; until then, you can catch up on what happened in the FriendFeed room or by reading Steve Easterbrook’s real-time blog of the event.
Our thanks once again to everyone who made the day possible:
- MaRS for the space,
- MITACS and Cybera for funding,
- SciNet, Steve Easterbrook, and an anonymous donor for additional sponsorship,
- our student volunteers for taking care of all the little things, and most especially
- Jennifer Dodd for organizing it all.
- Andrew Louis has posted some pictures; we’d be grateful for pointers to more.
- Joey deVilla’s notes and photos from Titus Brown’s opening talk.
- …and from Cameron Neylon’s talk on open notebook science (which are echoed at the MSDN Developer Connection blog)…
- …and from Victoria Stodden’s (which are ditto).
- Jon Udell has some thoughts about LaTeX-in-the-web and user innovation.
- Cameron Neylon discusses both the undergrad student demos he saw in the morning, and the Science 2.0 talks from the afternoon.
- Titus Brown’s impressions.
- Andrew Petersen thinks it’ll be a long time coming.
We’re ready for feedback — if you check out the new slides at http://software-carpentry.org, you’ll see a little feedback bubble at the bottom of each topic. Clicking that will give you a chance to send us email to tell us about formatting glitches, factual errors, or anything else you’d like fixed. Please let us know what you think…
In response to several requests, we have updated the license on the course material: the course content is now covered by the Creative Commons Attribution license, while the example code is (still) covered by an open source MIT license. In plain English, this means that you can re-use course content however you want, as long as you give us credit.
A new(er) version of the slides has been posted at http://software-carpentry.org that includes styling changes courtesy of Ryan Feeley. There are still many minor formatting glitches; we’ll fix them in the coming week, and post a schedule showing which lectures are going to be given when.
We’ve also updated the Guest Speakers page with bios and abstracts for the people who’ll be talking at the MaRS Centre in Toronto on July 29. Talks will run 1-6 pm, and will be followed by a wine and cheese. The event is free, but will require advance registration—we’ll post details here as soon as we have them.