why Debian for scientific computing: a case study

Yesterday I was invited to visit the EDF R&D center in Clamart, near Paris. They wanted to discuss their Debian usage and present some of the cool stuff they're doing. The most interesting component is an in-house Debian-based distribution called "calibre", which was presented at RMLL 2008.

Even though it is now growing desktop profiles (currently deployed on about 1,200 desktops and counting), calibre was mainly developed for clusters dedicated to scientific computing. Current cluster deployments at EDF are not that big, but still comprise hundreds of machines for about 40 teraFLOPS, with their largest cluster in the Top 500. The main goal of calibre was to quickly bring a complete cluster from bare metal to a production state. That goal has been achieved quite successfully: using Debian and FAI, they get a cluster of 200 machines ready for production in about an hour and a half, installing more than 3,000 packages on each machine (as the cluster will be used for heterogeneous purposes, rather than for a handful of specific applications).

What I found most interesting about the visit were the reasons for choosing Debian over other (commercial) distros for their scientific computing purposes:

  • They use a wide range of open source scientific software (some developed in house): according to them, Debian is the mainstream distribution with the largest offering of such software, with the additional benefit that the corresponding Debian maintainers are experts in the software they package, and so can be trusted. They have kudos for Debian Science, which I'm happy to relay.

  • They need to rebuild packages to trigger specific optimizations for their clusters. On the one hand, that defeats the typical management argument of "commercial support" that other distros offer, as rebuilding packages voids support guarantees.

  • On the other hand, the focus on quality that we have in Debian really helps them: we fight FTBFSs (failures to build from source) to death, and people who need to rebuild our packages really appreciate that.

EDF is generally keen on contributing back to Debian (even though the team behind calibre is still small), and I was happy to walk them through how they can contribute.

The last interesting piece of feedback I have to share is that they feel a bit alone in what they're doing (which is unsurprising, given that their communication on the matter has been rather limited thus far ...). Still, there is probably room for synergies that can be better exploited among users with similar needs. So, are you a cluster / scientific computing user of Debian? Then let me know, and I'll be happy to put you in touch with EDF and other users with similar interests.