Monday, October 22, 2007

No CPU Left Behind

By Borja Sotomayor


For some time now, I've been really interested in the potential applications of grid computing in higher education and, possibly, in secondary education. So, I was really intrigued when I read about Google and IBM's computing cloud for students. Just looking at the headline, my first impression was that students anywhere would be able to have their own computing cloud to use as a playground for learning and experimentation. As it turns out, Google and IBM's computing cloud will initially be used by only five universities, with the goal of giving students a platform on which to learn about parallel programming and Internet-scale applications. That is still a very cool project, and I thought this would be a good opportunity to share some ideas on how grid computing could end up benefiting education. Like fellow gridguru Tim Freeman, I'm a part of the Globus Virtual Workspaces project, so my ideas are biased towards how grid computing and workspaces could benefit education.



I have talked with many Computer Science and Engineering lecturers and professors at small colleges and universities who cannot teach certain courses for lack of computing resources. For example, while teaching an introductory programming course requires minimal computing resources (such as a computer lab), teaching a course on parallel programming or distributed systems may require more expensive resources. To get students to practice parallel programming in a somewhat realistic setting, you would like them to have access to a properly configured and maintained cluster. If, furthermore, you wanted to teach students how to set up a cluster, you would need a couple of clusters (ideally, one cluster per student) that the students could have unfettered access to.



There are two main issues with the above scenario. First of all, clusters aren't generally cheap, and some institutions can't afford one. Of course, you can easily build a cluster out of commodity hardware, but you also need someone to actually set it up and jiggle the handle whenever something goes awry. In one specific case, a department built a cluster with off-the-shelf PCs, and used it successfully... until the grad student charged with keeping the cluster running graduated. Apparently, that cluster has been sitting idle in a room for years now. Second, even if the institution can afford a cluster and a sysadmin, no sysadmin in his right mind is going to give root access to that cluster to undergrads, especially if that cluster is also used by researchers.



Enter virtual workspaces. In a nutshell, a virtual workspace is an execution environment that you can dynamically and securely deploy on the grid with exactly the hardware and software you need. You need a 32-node, dual-CPU Linux cluster for a couple of hours to teach a parallel programming lab, with a very specific version of libfoobar installed on it? Just request a workspace for it, and that hardware will be allocated somewhere on the grid for you, and the software will be set up thanks to software contextualization, which Tim will discuss in his posts. There's no need for the institution to keep a cluster running 24/7, or even spend any time configuring a cluster (requiring a sysadmin, or burdening the lecturer or a grad student with this task). From a repository of ready-made workspaces, simply choose the one you want (or pay a one-time fee to have someone configure a workspace exactly the way you want it), deploy it on the grid every Monday from 2pm to 4pm, and start teaching.



Unfortunately, we're not quite there yet, but virtual workspaces are being actively researched (yes, right now, even as you read this blog post!). Currently, virtual machines are the most promising vehicle to automagically stand up these custom execution environments on a grid. The Globus Virtual Workspaces Service, which uses the Xen VMM to instantiate workspaces, is still in a Technology Preview phase so, although you can still do a number of very cool things with it, you can't deploy arbitrary workspaces on arbitrary grids... yet. However, we're getting much closer, and in future blog posts I'll explain what progress we're making towards that goal.



When we do get there, I believe that workspaces stand to make really exciting contributions to Computer Science and Engineering education. Not only can they facilitate access to computational resources for underprivileged institutions, they can also enhance existing curricula by enabling students to gain more practical experience than before (e.g., by giving each student their own cluster). In fact, workspaces will enable the creation of more complex "playgrounds", from virtual clusters to virtual grids, that students can use to learn and experiment.

2 comments:

  1. Wrong link for virtual workspaces at globus. The right one is:
    http://workspace.globus.org

  2. Thanks for reporting this. The link now points to the correct URL.
