
Monday, November 17, 2008

Automating Grid Engine Monitoring

By Sinisa Veseli

When visiting client sites I often notice problems with existing distributed resource management installations, ranging from configuration mistakes to queues in an error state. While things like inadequate resources or a poorly designed queue structure require deeper analysis, problems like queues in an error state are easy to detect automatically. Cluster administrators, who are often busy with many other duties, should therefore automate monitoring tasks as much as they can. For example, if you are using Grid Engine, a script like the one below can look for several different kinds of problems in your SGE installation:

#!/bin/sh

# Summarize common problems (errors, alarms, suspend thresholds, configuration
# issues) across all Grid Engine queues.

. /usr/local/unicluster/unicluster-user-env.sh

explainProblem() {
    qHost=$1   # queue instance (queue@host) where the problem was found
    msg=`qstat -f -q $qHost -explain aAEc | tail -1 | sed 's?-??g' | sed '/^$/d'`
    echo $msg
}

checkProblem() {
    description=$1  # problem description
    signature=$2    # problem signature (state letter in the qstat output)
    for q in `qconf -sql`; do
        # Find queue instances whose state column contains the signature.
        cmd="qstat -f -q $q | grep $q | awk '{if(NF>5 && index(\$NF, \"$signature\")>0) print \$1}'"
        qHostList=`eval $cmd`
        if [ "$qHostList" != "" ]; then
            for qHost in $qHostList; do
                msg=`explainProblem $qHost`
                echo "$description on $qHost:"
                echo "  $msg"
                echo ""
            done
        fi
    done
}

echo "Grid Engine Issue Summary"
echo "========================="
echo ""
checkProblem Error E
checkProblem SuspendThreshold A
checkProblem Alarm a
checkProblem ConfigProblem c


Note that the above script should work with Unicluster Express 3.2 installed in the default (/usr/local/unicluster) location. It can easily be modified to, for example, send email to administrators when problems that need attention are found. Although simple, such scripts usually go a long way toward ensuring that your Grid Engine installation operates smoothly.
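As a rough sketch of that kind of extension (the summary script path and the recipient address below are placeholders, not part of the original script), a cron wrapper could mail the report only when it contains more than its header:

#!/bin/sh

# Cron wrapper sketch: the summary script path and the recipient address are
# placeholders and must be adjusted for your site.
report=`/usr/local/sbin/sge_issue_summary.sh`

# The header alone produces only a couple of lines; a longer report means
# at least one problem was found.
if [ `echo "$report" | wc -l` -gt 3 ]; then
    echo "$report" | mail -s "Grid Engine issues detected" sge-admin@example.com
fi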

Monday, February 18, 2008

How to Monitor Grid Engine

By Sinisa Veseli

You have built and installed your shiny new cluster, installed the Grid Engine software, configured the queues, and announced to the world that your new system is ready to be used. What next? Well, think about your monitoring options…



As users start submitting jobs and hammering the system in every possible way, things will inevitably break on occasion. When something goes wrong in the system, you will want to know about the problem before you start receiving help desk calls and user emails.



The first step in developing an effective strategy for monitoring Grid Engine is learning how to use the available command line tools and how to look for possible issues in the system. Some of the things that you should always pay attention to include:

• queues in an unknown state; a queue instance in an unknown state usually means that the execution daemon is down on that particular host

• queues and jobs in the error state

• configuration inconsistencies

• load alarms

All of the above information can easily be obtained using the qstat command (e.g., try something like “qstat -f -qs uaAcE -explain aAcE”). It is also not difficult to script basic GE monitoring tasks and put together a simple infrastructure that alerts system administrators to any new or outstanding problems in the system.



As your user base grows, so will your monitoring needs, and you will likely want to extend your monitoring tools. You should consider looking into existing software packages like xml-qstat, which uses XSLT transformations to render Grid Engine command line XML output into different output formats. Alternatively, you can develop a set of your own XSL stylesheets customized to your needs, and use widely available command line tools such as xsltproc to generate monitoring web pages from the “qstat -xml” output.
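For instance, assuming a locally written stylesheet called queues.xsl and a web server document root under /var/www/html (both names are just placeholders), a periodic cron job could regenerate a status page along these lines:

# Render the current queue status into a static HTML page (sketch only;
# queues.xsl and the output path are assumptions).
qstat -f -xml | xsltproc queues.xsl - > /var/www/html/sge-status.html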



Another interesting Grid Engine monitoring option is the Monitoring Console that comes with Cluster Express (CE). Its main advantage is that it integrates monitoring data from several different sources: Ganglia (system data), and the Grid Engine Qmaster and ARCo database (job data). However, even though Cluster Express itself is easy to install, at the moment integrating the CE Monitoring Console with an existing Grid Engine installation requires a bit of work. I am told that this will be much simplified in the upcoming CE release. In the meantime, if you are really anxious to try the CE Monitoring GUI on your Grid Engine cluster, do not hesitate to send me an email…

Tuesday, February 5, 2008

How To Write Your Own Load Sensors For Grid Engine

By Sinisa Veseli

As most Grid Engine (GE) administrators know, the GE execution daemon periodically reports values for a number of host load parameters. Those values are stored in the qmaster internal host object, and are used internally (e.g., for job scheduling) if a complex resource attribute with a corresponding name is defined.



Parameters that are reported by default (about 20 or so) are documented in the file $SGE_ROOT/doc/load_parameters.asc. For a large number of clusters the default set is sufficient to adequately describe their load situation. However, for those sites where this is not the case, the Grid Engine software provides administrators with the ability to introduce additional custom load parameters. Accomplishing this task is not difficult, and involves three steps:



1) Provide a custom load sensor. This can be a script or a binary that feeds the GE execution daemon with additional load information. It must comply with the following rules (a minimal sketch is shown after step 3 below):



• It should be written as an infinite loop that waits for input on the standard input stream.

• If the string “quit” is received, the sensor should exit.

• Otherwise, it should retrieve data necessary for computing the desired load figures, calculate those, and write them to the standard output stream.

• The individual host-related load figures should be reported one per line, in the form hostname:name:value (without any blanks). The load figures should be enclosed between a pair of lines containing only the strings “begin” and “end”. For example, a custom load sensor running on the machine tolkien.univaud.com and measuring the parameters n_app_users and n_app_threads might show the following output:



begin
tolkien.univaud.com:n_app_users:12
tolkien.univaud.com:n_app_threads:23
end



Note that for global consumable resources not attached to a particular host (such as the number of used floating licenses), the load sensor needs to output the string “global” instead of the machine name.



2) For each custom load parameter, define a corresponding complex resource attribute using, for example, the “qconf -mc” command.

3) Enable the custom load sensor by executing the “qconf -mconf” command and providing the full path to your script or executable as the value of the “load_sensor” parameter. If all goes well, the execution daemon will start reporting the new load parameters within a minute or two, and you should be able to see them using the “qhost -F” command.
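As promised in step 1, here is a minimal load sensor sketch. The metric name n_app_users and the way its value is computed (simply counting logged-in users) are illustrative assumptions, and the name printed by hostname is assumed to match the host name the execution daemon expects; a matching complex attribute would still have to be defined as in step 2.

#!/bin/sh

# Minimal load sensor sketch: the metric name (n_app_users) and the way its
# value is computed are illustrative assumptions only.
host=`hostname`

while true; do
    # Wait for a request from the execution daemon on standard input;
    # exit cleanly on "quit" or end of input.
    read input || exit 0
    if [ "$input" = "quit" ]; then
        exit 0
    fi

    # Compute the load figure and report it in hostname:name:value form,
    # enclosed between "begin" and "end" lines.
    nUsers=`who | wc -l | awk '{print $1}'`
    echo "begin"
    echo "$host:n_app_users:$nUsers"
    echo "end"
done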



Administrators with decent scripting skills (or those with a bit of luck ☺) can usually implement and enable new load sensors for their Grid Engine installations in a very short period of time. Note that some simple examples for custom load sensors can be found in the Grid Engine Admin Guide, as well as in the corresponding HowTo document.

Monday, December 3, 2007

How to Enable Rescheduling of Grid Engine Jobs after Machine Failures

By Sinisa Veseli

Checkpointing is one of the most useful features that Grid Engine (GE) offers. Because the status of checkpointed jobs is periodically saved to disk, those jobs can be restarted from the last checkpoint if they do not finish for some reason (e.g., due to a system crash). In this way, any loss of processing for long-running jobs is limited to a few minutes, as opposed to hours or even days.



When learning about Grid Engine checkpointing I found the corresponding HowTo to be extremely useful. However, this document does not contain all the details necessary to enable checkpointed job rescheduling after a machine failure. If you'd like to enable that feature, you should do the following:



1) Configure your checkpointing environment using the “qconf -mckpt” command (use “qconf -ackpt” to add a new environment), and make sure that the environment’s “when” parameter includes the letter ‘r’ (for “reschedule”). Alternatively, if you are using the “qmon” GUI, make sure that the “Reschedule Job” box is checked in the checkpoint object dialog box.



2) Use the “qconf -mconf” command (or the “qmon” GUI) to edit the global cluster configuration and set the “reschedule_unknown” parameter to a non-zero time. This parameter determines whether jobs on hosts in an unknown state are rescheduled and thus sent to other hosts. The special (default) value of 00:00:00 means that jobs will not be rescheduled from the host on which they were originally running.



3) Rescheduling is only initiated for jobs that have the rerun flag activated. Therefore, you must make sure that checkpointed jobs are submitted with the “-r y” option of the “qsub” command, in addition to the “-ckpt <ckpt_env_name>” option.
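For example, assuming a checkpointing environment named my_ckpt and a job script called long_job.sh (both names are placeholders), the submission would look like:

# Submit a rerunnable, checkpointed job (environment and script names are
# placeholders).
qsub -ckpt my_ckpt -r y long_job.sh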



Note that jobs that are not using checkpointing will be rescheduled only if they run in queues that have the “rerun” option set to true, in addition to being submitted with the “-r y” option. Parallel jobs are rescheduled only if the host on which their master task executes goes into an unknown state.

Saturday, December 1, 2007

Reservation Features Come to Grid Engine

By Sinisa Veseli

The next major update release of the Grid Engine software will contain advance reservation (AR) features (see the original announcement). This functionality will allow users and administrators to reserve specific resources for future use. More specifically, users will be able to request a new AR, delete an existing AR, and show granted ARs. As of the reservation start time, the reserved resources will only be available to jobs that request that reservation.

To support the AR features, a new set of command line interfaces is being introduced (qrsub, qrdel, and qrstat). Additionally, existing commands like qsub will be getting new switches, and the qmon GUI will be getting a new panel for submitting, deleting, and listing AR requests. It is also worth noting that the default qstat output might change.

If you are anxious to try it out, the latest Grid Engine 6.1 snapshot binaries containing the new AR features are available for download here. Note, however, that this snapshot (based on the 6.1u2 release) is not compatible with prior snapshots or versions, and that an upgrade procedure is currently not available.

Monday, November 5, 2007

How to Decipher Grid Engine Statuses – Part II

By Sinisa Veseli

In Part I of this article I discussed the meanings of the various queue states that one might see after invoking the Grid Engine qstat command. The list of possible job states is just as long as the list of queue states:



• d (deletion) — Indicates that a job has been deleted using qdel.



• r (running) — Indicates that a job is about to be executed or is already executing.



• R (restarted) — Indicates that the job was restarted. This state can be caused by a job migration or because of one of the reasons described in the -r section of the qsub man page.



• s (suspended) — Shows that an already running job has been suspended using qmod.



• S (suspended) — Shows that an already running job has been suspended because the queue it belongs to has been suspended.



• t (transferring) — Indicates that the job is being transferred to its execution host and is about to start running.



• T (threshold) — Shows that an already running job has been suspended because at least one suspend threshold of its queue was exceeded.



• w (waiting) — Indicates that the job is waiting, pending the availability of a critical resource or a specified condition.



• q (queued) — Indicates that the job has been queued.



• E (error) — Indicates that the job is in the error state. You can find the reason for this state using the qstat command with “-explain E” option.



• h (hold) — Indicates that the job is not eligible for execution due to a hold state assigned to it via the qhold or qalter commands, or the qsub -h option.



Just like with queue states, one also frequently encounters various combinations of the above job states.
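These state letters are also easy to act on from a script. As a rough sketch, the following one-liner prints the ID, name, and state of every job whose state field contains “E”; it assumes the default qstat output layout, where the state appears in the fifth column after two header lines:

# List job ID, name, and state for all jobs in an error state (sketch only).
qstat -u '*' | awk 'NR > 2 && $5 ~ /E/ {print $1, $3, $5}'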

Friday, October 5, 2007

Scripting Grid Engine Administrative Tasks Made Simple

By Sinisa Veseli

Grid Engine (GE) is becoming an increasingly popular software package for distributed resource management. Although it comes with a GUI that can be used for various administrative and configuration tasks, the fact that all of those tasks can be scripted is very appealing. The GE Scripting HOWTO document already contains a few examples to get one started, but I wanted to further illustrate the usefulness of this GE feature with a simple utility that modifies the shell start mode for all queues in the system:


#!/bin/sh

# Utility to modify the shell start mode for all GE queues.
# Usage: modify_shell_start_mode.sh <new_mode>
#   <new_mode> can be one of unix_behavior, posix_compliant or script_from_stdin

# Temporary config file.
tmpFile=/tmp/sge_q.$$

# Get new mode.
newMode=$1

# Modify all known queues.
for q in `qconf -sql`; do
    # Prepare queue modification.
    echo "Modifying queue: $q"
    cmd="qconf -sq $q | sed 's?shell_start_mode.*?shell_start_mode $newMode?' > $tmpFile"
    eval $cmd

    # Modify queue.
    qconf -Mq $tmpFile

    # Cleanup.
    rm -f $tmpFile
done



Using the above script one can quickly modify the shell_start_mode parameter for all queues without having to go through the manual configuration steps.
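For example, switching every queue to unix_behavior (assuming the script was saved under the name shown in its usage comment) would be as simple as:

./modify_shell_start_mode.sh unix_behavior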


The basic approach of 1) preparing a new configuration file by modifying the current object configuration, and 2) reconfiguring GE using the prepared file, works for a wide variety of tasks. There are cases, however, in which the desired object does not exist and has to be added. Those cases can be handled by modifying the EDITOR environment variable and invoking the appropriate qconf command. For example, here is a simple script that creates a set of new queues from the command line:


#!/bin/sh

# Utility to add new queues automatically.
# Usage: add_queue.sh <queue_1> <queue_2> …

# Force non-interactive mode so "qconf -aq" accepts its default template.
EDITOR=/bin/cat; export EDITOR

# Get new queue names.
newQueues=$@

# Add new queues.
for q in $newQueues; do
    echo "Adding queue: $q"
    qconf -aq $q
done
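Invoking it is equally simple; the queue names below are just placeholders:

./add_queue.sh short.q long.q batch.q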



Utilities like the ones shown here get written once and usually quickly become indispensable tools for experienced GE administrators.