Software Carpentry

Helping scientists make better software since 1997

File Sharing for Scientists

A scientist I recently met in Toronto had a problem: how to share large files with colleagues. Each file is a couple of hundred megabytes; dozens are produced each week, but each is only interesting for a couple of months; and there are confidentiality issues, so some kind of password protection is needed. Conventional file-sharing services like Dropbox aren’t designed for data of that size, so in the end she bought a domain and set up secure FTP.

But now there’s this:

BioTorrents: A File Sharing Service for Scientific Data

The transfer of scientific data has emerged as a significant challenge, as datasets continue to grow in size and demand for open access sharing increases. Current methods for file transfer do not scale well for large files and can cause long transfer times. In this study we present BioTorrents, a website that allows open access sharing of scientific data and uses the popular BitTorrent peer-to-peer file sharing technology. BioTorrents allows files to be transferred rapidly due to the sharing of bandwidth across multiple institutions and provides more reliable file transfers due to the built-in error checking of the file sharing technology. BioTorrents contains multiple features, including keyword searching, category browsing, RSS feeds, torrent comments, and a discussion forum. BioTorrents is available at http://www.biotorrents.net.

It’s a neat idea, and will become neater once scientists routinely put DOIs on data as well as papers. I’d be very interested in a usability study to see how easy or hard it is for the average grad student in botany to get this plugged in and turned on.
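
For anyone wondering what the “built-in error checking” in that abstract amounts to: BitTorrent splits a file into fixed-size pieces and records a SHA-1 hash for each one, so a downloader can tell which pieces arrived damaged and fetch only those again. Here is a minimal sketch of that idea in Python. It is not BioTorrents’ code or any client’s actual implementation; the function names and the 256 KiB piece size are my own assumptions for illustration, and a real client reads the piece length and expected hashes from the .torrent file.

```python
import hashlib
from itertools import zip_longest

# Assumed piece size for illustration; a real torrent declares its own.
PIECE_LENGTH = 256 * 1024


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH) -> list:
    """Return the SHA-1 digest of each fixed-size piece of `data`."""
    return [
        hashlib.sha1(data[start:start + piece_length]).digest()
        for start in range(0, len(data), piece_length)
    ]


def corrupted_pieces(data: bytes, expected: list,
                     piece_length: int = PIECE_LENGTH) -> list:
    """Return the indices of pieces whose hash differs from `expected`.

    Missing or extra pieces are reported too, because zip_longest pads
    the shorter list with None, which never equals a real digest.
    """
    actual = piece_hashes(data, piece_length)
    return [
        index
        for index, (got, want) in enumerate(zip_longest(actual, expected))
        if got != want
    ]


if __name__ == "__main__":
    # The sharer publishes the piece hashes alongside the data.
    original = bytes(range(256)) * 4096  # 1 MiB of sample data
    published = piece_hashes(original)

    # Simulate one flipped byte somewhere in the received copy.
    received = bytearray(original)
    received[300_000] ^= 0xFF

    print("pieces to re-download:", corrupted_pieces(bytes(received), published))
```

The practical upshot is that one bad byte in a file of several hundred megabytes costs a single piece, not the whole transfer.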

Written by Greg Wilson

2010/04/16 at 11:59

Posted in Content, Noticed

3 Responses

  1. Another favorite option is File Apartment (http://www.fileapartment.com). Easy to use, no software to download or registration, up to 1 GB, free option, safe, and secure.

    Manish M. Shah

    2010/04/16 at 14:01

  2. Neat idea indeed, but not without problems too.

    Interesting to look at the files. A lot of it is software. Many of the files are not too large, and don’t seem to benefit greatly from the speed/hosting aspects of BitTorrent.

    But especially interesting is that some of the large datasets that are shared appear to be ones that were downloaded from another data portal (e.g., http://www.biotorrents.net/details.php?id=29 or http://www.biotorrents.net/details.php?id=28). Are the main data portals just too slow, or using the wrong technology (i.e., failing to use BitTorrent)? There are also very interesting implications for provenance. Some of the descriptions talk about compression by stripping out certain fields. And what happens if there is a change at the main data portal? It would probably only be propagated to the BioTorrents system by hand, if noticed at all.

    Archer

    2010/04/16 at 14:05

  3. That is a neat idea. Torrents are extremely useful for sharing large files… Dropbox is pretty good too, but the problem is that it gets a little complicated if all you want to do is just email a file or a few files.

    But if you want to simply and very easily upload and email a file, you can try something like http://www.dartfiles.com

    Emad

    2010/04/16 at 22:32

