Tuesday, October 30, 2007

How to Decipher Grid Engine Statuses – Part I

By Sinisa Veseli

In all likelihood, most Grid Engine (GE) end users and administrators have at some point invoked the qstat command and found themselves wondering what some of the resulting queue and job status letters mean. While some of those letters are fairly intuitive (e.g., ‘E’ stands for error), others are not trivial to decipher. Unfortunately, explanations for these statuses are not easy to find: one usually has to dig through the qstat man pages or through the various GE software manuals available on the web. So, I’ve compiled below information about the possible queue statuses:



• a (alarm) – At least one of the load thresholds defined in the load_thresholds list of the queue configuration is currently exceeded. This state prevents GE from scheduling further jobs to that queue. You can find the reason for the alarm state using the qstat command with the “-explain a” option.



• A (Alarm) – At least one of the suspend thresholds of the queue is currently exceeded. This state causes jobs running in that queue to be successively suspended until no threshold is violated. You can see the reason for this state using the qstat command with the “-explain A” option.



• c (configuration ambiguous) – The queue instance configuration (specified in GE configuration files) is ambiguous. The state resolves when the configuration becomes unambiguous again. This state prevents you from scheduling further jobs to that queue instance. You can find detailed reasons why a queue instance entered this state in the sge_qmaster messages file, or by using the qstat command with the “-explain c” option. For queue instances in this state, the cluster queue's default settings are used for the ambiguous attribute.



• C (Calendar suspended) – The queue has been suspended automatically using the GE calendar facility.



• d (disabled) – Queues are disabled and released using the qmod command. Disabling a queue prevents new jobs from being scheduled for execution in that queue, but does not affect jobs that are already running there.



• D (Disabled) – The queue has been disabled automatically using the GE calendar facility.



• E (Error) – The queue is in the error state. You can find the reason for this state using the qstat command with the “-explain E” option. Check the error log of the corresponding execution daemon for information on how to resolve the problem, and clear the queue error state afterwards using the qmod command with the “-cq” option.



• o (orphaned) – The current cluster queue and host group configurations no longer need this queue instance. The queue instance is kept only because unfinished jobs are still associated with it, and it disappears from qstat output once those jobs finish. The orphaned state prevents you from scheduling further jobs to that queue instance. You can resolve an orphaned queue instance by deleting its remaining jobs with the qdel command, or revive it by changing the cluster queue configuration so that it again covers that queue instance.



• s (suspended) – Queues are suspended and un-suspended using the qmod command. Suspending a queue suspends all jobs executing in that queue.



• S (Subordinate) – The queue has been suspended due to subordination to another queue. When a queue is suspended, regardless of the cause, all jobs executing in that queue are suspended too.



• u (unknown) – The corresponding GE execution daemon (sge_execd) cannot be contacted.
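The state letters above can also appear combined (e.g., “au”). As a quick illustration (this helper is not part of Grid Engine itself), a small shell function can expand a qstat state string letter by letter into the descriptions listed above:

```shell
#!/bin/sh
# Decode a qstat queue-state string (e.g. "au" or "dE") one letter at a
# time, using the meanings from the list above.
decode_queue_state() {
    state=$1
    while [ -n "$state" ]; do
        letter=$(printf '%s' "$state" | cut -c1)
        state=$(printf '%s' "$state" | cut -c2-)
        case $letter in
            a) echo "a: load threshold alarm (see qstat -explain a)" ;;
            A) echo "A: suspend threshold alarm (see qstat -explain A)" ;;
            c) echo "c: configuration ambiguous (see qstat -explain c)" ;;
            C) echo "C: suspended by calendar" ;;
            d) echo "d: disabled via qmod" ;;
            D) echo "D: disabled by calendar" ;;
            E) echo "E: error state (see qstat -explain E)" ;;
            o) echo "o: orphaned queue instance" ;;
            s) echo "s: suspended via qmod" ;;
            S) echo "S: suspended via subordination" ;;
            u) echo "u: unknown (sge_execd unreachable)" ;;
            *) echo "$letter: unrecognized state letter" ;;
        esac
    done
}

decode_queue_state au
```

For example, the "au" above decodes to a load alarm on a host whose execution daemon is also unreachable.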



I hope that those who are new to Grid Engine find the above descriptions useful. In Part II of this article I will cover possible job statuses.

Friday, October 26, 2007

Dream Big, Dream Grid

By Ivo Janssen

Last time we talked about two similar yet different benefits of using grids. Today we will expand on that list with other benefits you might not yet have thought about. Just to be clear, we’re purely talking about technical benefits here; the business benefits are left for a whole other column.



Let’s first review what we found last time. The obvious benefits revolve around speeding up your parallel applications and increasing the throughput of your batch jobs. A typical example of the former is a crash simulation with PAM-CRASH and MPI; a typical example of the latter is virtual high-throughput screening with applications such as LigandFit from Accelrys, where many candidate drug compounds are screened against a single protein target. But there are other, less obvious use-cases for grid that can benefit you.



Imagine running a simulation that has many tweakable parameters that you’ve always set to a pre-set value. When you move your computations to a grid, you might not need to get your results back any faster, so you could instead opt to increase the accuracy of your computation by running the same simulation with different parameter values on different nodes. Further expansion of your grid then increases the validity and accuracy of your results rather than decreasing runtime. An example of such a computation can be found in the oil and gas industry, where a more refined and accurate computational model of an oil field can prevent costly dry holes.
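A parameter sweep like this amounts to generating one sub-job per parameter combination. As a toy sketch (the solver name, its flags, and the qsub invocation are all hypothetical, and submissions are echoed rather than executed):

```shell
#!/bin/sh
# Generate one grid sub-job per parameter combination instead of a single
# run with fixed defaults. The "reservoir_sim" solver and its flags are
# made up for illustration; we echo the qsub commands instead of running them.
count=0
for porosity in 0.05 0.10 0.15 0.20; do
    for depth in 1500 2000 2500; do
        echo "qsub -b y ./reservoir_sim --porosity $porosity --depth $depth"
        count=$((count + 1))
    done
done
echo "generated $count sweep sub-jobs"
```

With 4 porosity values and 3 depths, the loop emits 12 sub-jobs; adding nodes lets you refine the grid of parameter values rather than finish sooner.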



One could assert that Monte Carlo simulations are in fact also "accuracy-increasing" applications of grid, but there are two subtle differences. First, Monte Carlo simulations usually run on a much more massive scale, with thousands of very short simulations, whereas parameter sweep modeling typically uses larger models over a limited number of iterations (fewer than a hundred). Second, a typical Monte Carlo simulation only ends once a pre-set resolution has been achieved, regardless of the number of grid nodes at your disposal. As such, it is better to place Monte Carlo simulations in the "throughput" category.
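That fixed-resolution stopping rule can be sketched in a few lines of sh and awk (a toy pi estimator, not from any real workload; the tolerance and check interval are arbitrary choices):

```shell
#!/bin/sh
# Monte Carlo estimate of pi that keeps sampling until the running
# estimate stabilizes: a run-to-resolution criterion rather than a
# fixed amount of work, no matter how many nodes are available.
pi_est=$(awk 'BEGIN {
    srand(1); tol = 0.001; prev = 0; hits = 0
    for (n = 1; ; n++) {
        x = rand(); y = rand()
        if (x * x + y * y <= 1.0) hits++
        if (n % 10000 == 0) {          # check resolution every 10k samples
            est = 4.0 * hits / n
            diff = est - prev; if (diff < 0) diff = -diff
            if (prev > 0 && diff < tol) { printf "%.3f", est; exit }
            prev = est
        }
    }
}')
echo "pi is approximately $pi_est"
```

Note that throwing more nodes at this computation would not make it stop sooner; it would only let you push the tolerance tighter, which is why such workloads behave like throughput jobs.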



Once you understand these three basic benefits (speed-up, throughput and accuracy), there’s really no limit to what your imagination can come up with in terms of new applications of grid. Take the LigandFit example that I mentioned earlier. United Devices' recently retired grid.org looked at the throughput use-case and took it to the extreme by simply taking a protein crucial to the internal workings of cancer cells and running every single candidate drug compound in the library against that protein. It took a leap of imagination to dream up six years of running billions of compounds against multiple proteins.



The most rewarding moment during a consulting engagement is
when I see that users "get" the basic use-cases and start dreaming big. Can you
dream big?  What can the grid do for you?

Monday, October 22, 2007

No CPU Left Behind

By Borja Sotomayor


For some time now, I've been really interested in the potential applications of grid computing in higher education and, possibly, in secondary education. So, I was really intrigued when I read about Google and IBM's computing cloud for students. Just looking at the headline, my first impression was that students anywhere would be able to have their own computing cloud to use as a playground for learning and experimentation. As it turns out, Google and IBM's computing cloud will be initially used by only five universities, with the goal of giving students a platform in which to learn about parallel programming and Internet-scale applications. Although still a very cool project, I thought this would be a good opportunity to share some ideas of how grid computing could end up benefiting education. Like fellow gridguru Tim Freeman, I'm a part of the Globus Virtual Workspaces project, so my ideas are biased towards how grid computing and workspaces could benefit education.



I have talked with many Computer Science and Engineering lecturers and professors at small colleges and universities who cannot teach certain courses for lack of computing resources. For example, while teaching an introductory programming course requires minimal computing resources (such as a computer lab), teaching a course on parallel programming or distributed systems may require more expensive resources. To get students to practice parallel programming in a somewhat realistic setting, you would like them to have access to a properly configured and maintained cluster. If, furthermore, you wanted to teach students how to set up a cluster, you would need a couple of clusters (ideally, one cluster per student) that the students could have unfettered access to.



There are two main issues with the above scenario. First of all, clusters aren't generally cheap, and some institutions can't afford one. Of course, you can easily build a cluster out of commodity hardware, but you also need someone to actually set it up and jiggle the handle whenever something goes awry. In one specific case, a department built a cluster with off-the-shelf PCs, and used it successfully... until the grad student charged with keeping the cluster running graduated. Apparently, that cluster has been sitting idle in a room for years now. Second, even if the institution can afford a cluster and a sysadmin, no sysadmin in his right mind is going to give undergrads root access to that cluster, especially if the cluster is also used by researchers.



Enter virtual workspaces. In a nutshell, a virtual workspace is an execution environment that you can dynamically and securely deploy on the grid with exactly the hardware and software you need. You need a 32-node dual-CPU Linux cluster for a couple of hours to teach a parallel programming lab, with a very specific version of libfoobar installed on it? Just request a workspace for it, and that hardware will be allocated somewhere on the grid for you, and the software will be set up thanks to software contextualization, which Tim will discuss in his posts. There's no need for the institution to keep a cluster running 24/7, or even spend any time configuring a cluster (requiring a sysadmin, or burdening the lecturer or a grad student with this task). From a repository of ready-made workspaces, simply choose the one you want (or pay a one-time fee to have someone configure a workspace exactly the way you want it), deploy it on the grid every Monday from 2pm to 4pm, and start teaching.



Unfortunately, we're not quite there yet, but virtual workspaces are being actively researched (yes, right now, even as you read this blog post!). Currently, virtual machines are the most promising vehicle to automagically stand up these custom execution environments on a grid. The Globus Virtual Workspaces Service, which uses the Xen VMM to instantiate workspaces, is still in a Technology Preview phase so, although you can still do a number of very cool things with it, you can't deploy arbitrary workspaces on arbitrary grids... yet. However, we're getting much closer, and in future blog posts I'll explain what progress we're making towards that goal.



When we do get there, I believe that workspaces stand to make really exciting contributions to Computer Science and Engineering education. Not only can they facilitate access to computational resources by underprivileged institutions, they can also enhance existing curriculums by enabling students to gain more practical experience than before (e.g., by giving each student their own cluster). In fact, workspaces will enable the creation of more complex "playgrounds", from virtual clusters to virtual grids, that students can use to learn and experiment.

Tuesday, October 16, 2007

Does your grid make Fords or Volvos?

By Ivo Janssen

Ask a user why they use a grid, a cluster, or any other type
of distributed system and you’ll hear, “Why, to get my work done faster, of
course.” But that’s an ambiguous statement at best, since it can mean two things:
faster runtimes or higher throughput. And although they might seem similar,
they’re really not.



Runtime is defined as the wallclock time it takes to
complete one task. If you parallelize a task, for instance with MPI, or by
taking advantage of the data splitting capabilities of Grid MP, you can get
your job back in less time. If you can parallelize your job into 10 parallel
sub-jobs and run it on 10 nodes, you can expect that job to complete on average
in 1/10th of the time. Plus a bit of overhead of course, but let’s keep it
simple for now.  In Volvo’s innovative Uddevalla
plant, groups of workers assemble entire automobiles in less time than it takes
for one worker to complete a whole car. So with 10 workers in a group, you
could potentially make a car in 1/10th of the time.
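The 1/10th-runtime claim can be made concrete with a toy shell sketch, where a one-second sleep stands in for each sub-job's share of the work:

```shell
#!/bin/sh
# Split a "job" of 10 work units across 10 background sub-jobs instead of
# running them back to back: wallclock drops from ~10s to ~1s.
start=$(date +%s)
for i in 1 2 3 4 5 6 7 8 9 10; do
    sleep 1 &          # one sub-job's 1/10th share of the work
done
wait                   # barrier: the job is done only when every sub-job is
end=$(date +%s)
echo "wallclock: $((end - start))s for 10s of total work"
```

The `wait` is the "plus a bit of overhead" part: the job finishes only when its slowest sub-job does, which is why real-world speedup is always a little under the ideal factor.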



However, sometimes your task cannot be parallelized any further, but you might have lots of them pending. Grids can still help, since they can increase the throughput of your jobs. With 10 nodes and 10 pending jobs, each job gets its own node, so the whole batch completes in roughly 1/10th of the time a single node would need, without using any parallelism within a job. In a traditional American automotive plant, the car advances on the assembly line, and at no point is more than one operator working on one car, so there’s no parallelism involved. It might take up to a day before one car is completed from start to finish, but a new car rolls off the end of the line every few minutes.
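The arithmetic behind the throughput case is worth spelling out. A back-of-the-envelope sketch (the one-hour runtime and job counts are made-up figures):

```shell
#!/bin/sh
# Batch completion time for 10 serial jobs: 1 node vs 10 nodes,
# with no parallelism inside any individual job.
runtime=60   # minutes per job (assumed figure)
jobs=10
nodes=10
# One node: jobs run back to back.
one_node_batch=$((jobs * runtime))
# Ten nodes: one job per node, all finish together (ceiling division
# handles the general case where jobs don't divide evenly by nodes).
many_node_batch=$(( (jobs + nodes - 1) / nodes * runtime ))
echo "1 node:   batch done in $one_node_batch minutes"
echo "$nodes nodes: batch done in $many_node_batch minutes"
```

Each individual job still takes its full hour; it is the batch, not the job, that finishes ten times sooner.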



So next time when a user brags about his fancy new cluster,
ask him whether he’s producing Fords or Volvos.

Thursday, October 11, 2007

Virtual Grid Nodes: The Tension

By Roderick Flores

Lately I have been putting a lot of thought into the challenges that grid managers face in building an enterprise grid.  Primarily they must support the various stakeholders throughout the enterprise, each of whom has their own sets of application workflows used to meet their business needs. 



The software packages that each interested group uses may have a significant overlap with one another, but the similarity stops there.  Because each group ostensibly has a different goal, the usage patterns are almost guaranteed to be unique.  This implies that the community as a whole will demand any of the following:



  • A wide range of operating systems including Linux, Microsoft Windows, or any of the varied flavors of Unix;
  • Support for multiple versions of the same software package; and
  • A wide range of operating environments particularly with respect to memory, CPU performance, network usage, and storage.


When you consider users’ needs in more detail, you will recognize that a number of implications further complicate things:



  • The applications that users wish to run will likely span two or more different major OS revisions (e.g., Linux kernel 2.4 versus 2.6, or Windows XP versus Vista);
  • Similarly, some applications steadfastly insist on a specific patch level. For example, a minor revision of the Linux kernel lacking a specific security patch might be required. You might be able to force the software to install, but then it is likely no longer supported;
  • Off-the-shelf installations which seek to upgrade rather than coexist with a previous version;
  • Custom software that expects a very specific behavior from a package that has changed in its most recent update;
  • Software which requires particular kernel tuning that is not appropriate for general operation; and
  • Software packages which have 32/64-bit library compatibility issues.


Meanwhile, grid managers will most likely be focused on providing a stable, secure, and easy-to-maintain infrastructure that is both cost-effective and capable of meeting the users’ core requirements.  Clearly, the priorities of the individual groups and of the support team will be at odds much of the time.



The most elegant solution to these issues is to build a grid whose execution environments are all virtualized.  In this situation, each usage pattern would have its own environment tailored to its unique needs, while the core OS would remain under the complete control of the infrastructure staff.  Each node in the grid would then offer a stakeholder-driven set of virtual servers.



It seems simple enough: rather than building a complicated infrastructure that cannot accommodate all of the situations your users will encounter, you simply give them their own isolated operating environments.  As you might expect, nothing is that straightforward.  The standard tools that you use for grid and virtualization management do not work well in this architecture.



In future posts, we will explore the challenges and possible solutions in detail. In particular we will focus on:



-    Networking
-    Virtual Server Management
-    Job Scheduling
-    Performance Monitoring
-    Security
-    Data Lifecycle

Friday, October 5, 2007

Scripting Grid Engine Administrative Tasks Made Simple

By Sinisa Veseli

Grid Engine (GE) is becoming an increasingly popular software package for distributed resource management. Although it comes with a GUI that can be used for various administrative and configuration tasks, the fact that all of those tasks can be scripted is very appealing. The GE Scripting HOWTO document already contains a few examples to get one started, but I wanted to further illustrate the usefulness of this GE feature with a simple utility that modifies the shell start mode for all queues in the system:


#!/bin/sh

# Utility to modify shell start mode for all GE queues.
# Usage: modify_shell_start_mode.sh <mode>
# <mode> can be one of unix_behavior, posix_compliant or script_from_stdin

# Temporary config file.
tmpFile=/tmp/sge_q.$$

# Get new mode.
newMode=$1

# Modify all known queues.
for q in `qconf -sql`; do
    echo "Modifying queue: $q"

    # Prepare modified queue configuration.
    qconf -sq $q | sed "s?^shell_start_mode.*?shell_start_mode $newMode?" > $tmpFile

    # Reconfigure queue from the prepared file.
    qconf -Mq $tmpFile

    # Cleanup.
    rm -f $tmpFile
done



Using the above script, one can quickly modify the shell_start_mode parameter for all queues without going through the manual configuration steps.


The basic approach of 1) preparing a new configuration file by modifying the current object configuration, and 2) reconfiguring GE using the prepared file, works for a wide variety of tasks. There are cases, however, in which the desired object does not exist and has to be added. Those cases can be handled by setting the EDITOR environment variable to a non-interactive command and invoking the appropriate qconf command. For example, here is a simple script that creates a set of new queues from the command line:


#!/bin/sh

# Utility to add new queues automatically.
# Usage: add_queue.sh <queue1> <queue2> ...

# Force non-interactive mode.
EDITOR=/bin/cat; export EDITOR

# Get new queue names.
newQueues=$@

# Add new queues.
for q in $newQueues; do
    echo "Adding queue: $q"
    qconf -aq $q
done



Utilities like the ones shown here get written once and usually quickly become indispensable tools for experienced GE administrators.