Technical and Scientific Computing with Grid Engine: Top Four Things Cisco Learned Working on Open MPI

By Jeff Squyres

This entry was written by guest blogger Jeff Squyres from Cisco Systems. I met him at SC07 when I attended his Open MPI presentation in the Mellanox booth. He did a great job, much better than most presentations at tech conferences, and agreed to share with our readers some of his thoughts on how big companies can work effectively in an open source project.

The general idea of my talk is to answer the question "Why is Cisco contributing to open source in HPC?" Indeed, much of Cisco's code is closed source. Remember that our crown jewels are the various flavors of IOS (the operating system that powers Cisco Ethernet routers), so many people are initially puzzled as to why Cisco is involved in open source projects in HPC.

The short/obvious answer is: it helps us.

Cisco is a company that needs to make money and has a responsibility to its stockholders. We sell products in the HPC space and therefore need a rock-solid, high-performance MPI that works well on our networks. Many customers demand an open source solution, so it is in our best interests to help provide one rather than partially or wholly rely on someone else to provide one. In particular, some of these interests include (but are not limited to):

Having engineers at Cisco who can provide direct support to our customers who use open source products
Being able to participate in the process and direction of open source projects that are important to us (vs. being an outsider)
Leveraging the development and QA resources of both our partners and competitors -- effectively having our efforts magnified by the open source community (and vice versa)
Shortening the time between research and productization; working directly with our academic partners to turn today's wacky ideas into tomorrow's common technology

Think of it this way: only certain parties (i.e., vendors) can mass-produce high quality hardware for HPC. But *many* people can help produce high quality software -- not just vendors. In the context of this talk, customers (including research and academic customers) have the expertise and capability to *directly* contribute to the software that runs on our hardware. HPC history has proven this point. We'd therefore be foolish to *not* engage HPC-smart customers, researchers, academics, partners, competitors, ...anyone who has HPC expertise to help make our products better. I certainly cannot speak for others, but I suspect that this rationale is similar to why other vendors participate in HPC open source as well.

Let's not forget that participation in HPC open source helps everyone, including by growing the overall size of the HPC market. Here's one example: inter-vendor collaboration, standardization, and better interoperability mean happy customers. And happy customers lead to more [happy] customers.

We have learned many things while participating in large open source projects. Below are a few of the nuggets of wisdom that we have earned (and learned). In hindsight, some are obvious, but some are not:

Open source is not "free" -- someone has to pay. By spreading the costs among many organizations, we can all get a much better rate of return on our investment.
Consensus among the members of an open source community is good (e.g., some members are only participating out of good will), but it is not always possible. Conflict resolution measures are necessary (and sometimes critical).
Just because a project is open source does not guarantee that it is high quality. Those who are interested in a particular part of a project (especially large, complex projects where no single member knows or cares about every aspect of the code base) need to look after it and ensure its quality over time.
Differences are good. The entire first year of the Open MPI project was a struggle because the members came from different backgrounds, had different biases, and held different core fundamentals to be true. It took a long time to realize that exactly these differences are what make a good open source project strong. Heterogeneity is good; differences of opinion are good. They lead to discussion and resolution down to hard, technical facts (vs. religion or "it's true because I've always thought it was true"), which results in better code.

True open source collaboration is like a marriage: it takes work. A lot of hard, hard work. Disagreements occur, mistakes happen, and misunderstandings are inevitable (particularly when not everyone speaks the same native language). But when it all clicks together, the overall result is really, really great. It makes all the hard work well worth it.

Monday, December 17, 2007
