Friday, October 26, 2007

Dream Big, Dream Grid

By Ivo Janssen

Last time we talked about two similar
yet different benefits of using grids. Today we will expand on that list with
other benefits you might not have thought about yet. Just to be clear, we’re
talking purely about technical benefits here; the business benefits are left
for a whole other column.



Let’s first review what we found last time. The obvious
benefits revolve around speedup of your parallel applications and higher
throughput of your batch jobs. A typical example of the former is a
crash simulation with PAM-CRASH and MPI; a typical example of the latter is
virtual high-throughput screening with applications such as LigandFit
from Accelrys, where many potential drug candidates are screened against a single
protein target. But there are other, less obvious use cases for grid that can
benefit you.



Imagine running a simulation that has many tweakable parameters
that you’ve always left at a pre-set value. When you move your computations
to a grid, you might not need to get your results back any faster, so you could
instead opt to increase the accuracy of your computation by running the same
simulation with different parameter values on different nodes. Further expansion
of your grid will then increase the validity and accuracy of your results, rather
than decrease runtime. An example of such a computation can be found in the oil
and gas industry, where a more refined and accurate computational model of an
oil field can prevent costly dry holes.
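
To make the parameter sweep idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `simulate` is a hypothetical stand-in for one expensive simulation run, the parameter names and values are made up, and a local thread pool plays the role of the grid nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from statistics import mean, stdev


def simulate(porosity, permeability):
    """Hypothetical stand-in for one expensive simulation run."""
    return 1000.0 * porosity * permeability ** 0.5


def sweep(porosities, permeabilities, workers=4):
    """Run every parameter combination and collect the results."""
    combos = list(product(porosities, permeabilities))
    # Each combination is independent of the others, so each one could be
    # sent to its own grid node. A local thread pool stands in for the grid.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(lambda combo: simulate(*combo), combos))
    return dict(zip(combos, outputs))


results = sweep([0.10, 0.15, 0.20], [50.0, 100.0, 200.0])
values = list(results.values())
print(f"{len(values)} runs, mean={mean(values):.1f}, spread={stdev(values):.1f}")
```

Adding more values to either list adds more runs, not more time per run, which is why a bigger grid buys you a better picture of the model's sensitivity rather than a shorter wait.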



One could assert that Monte Carlo simulations are in fact
also "accuracy-increasing" applications of grid, but there are two subtle
differences. First, Monte Carlo simulations usually run on a much more massive
scale, with thousands of very short simulations, whereas parameter sweep modeling
typically uses larger models for a limited number (fewer than a hundred) of
iterations. Second, typical Monte Carlo simulations end only once a certain pre-set resolution has been achieved, regardless of the number of grid nodes at your disposal. More nodes therefore get you to that resolution sooner rather than improving the result. As such, it is better
to categorize Monte Carlo simulations in the "throughput" category.
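
The stopping rule is what makes the difference, and a small Python sketch may help. This is a toy under stated assumptions, not a real workload: it estimates pi by random sampling, uses the standard error of the estimate as the "resolution", and runs its batches one after another where a grid would spread them across nodes. The names `run_batch` and `estimate_pi` are invented for this example.

```python
import math
import random


def run_batch(rng, n):
    """One short, independent batch: count random points inside the quarter circle."""
    return sum(1 for _ in range(n) if rng.random() ** 2 + rng.random() ** 2 <= 1.0)


def estimate_pi(target_stderr, batch_size=10_000, seed=42):
    """Keep adding batches until the standard error reaches the pre-set resolution."""
    rng = random.Random(seed)
    hits = total = 0
    while True:
        # On a grid, many of these batches would run at the same time.
        hits += run_batch(rng, batch_size)
        total += batch_size
        p = hits / total
        # Standard error of the estimate 4 * p, from the binomial variance of p.
        stderr = 4.0 * math.sqrt(p * (1.0 - p) / total)
        if stderr <= target_stderr:
            return 4.0 * p, stderr, total


estimate, stderr, samples = estimate_pi(target_stderr=0.005)
print(f"pi is about {estimate:.4f} (+/- {stderr:.4f}) after {samples} samples")
```

The total number of samples is set by `target_stderr`, not by how many nodes you have. More nodes only shorten the wall-clock time needed to collect those samples.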



Once you understand these three basic benefits (speedup,
throughput and accuracy), there’s really no limit to what your imagination can
come up with in terms of new applications of grid. Take the LigandFit example
that I mentioned earlier. United Devices' recently retired grid.org looked at
the throughput use case and took it to the extreme by simply taking a protein
crucial to the internal workings of cancer cells and running every single
potential drug candidate in the library against that protein. It took a
leap of imagination to dream up six years of running billions of drug candidates
against multiple proteins.



The most rewarding moment during a consulting engagement is
when I see that users "get" the basic use cases and start dreaming big. Can you
dream big? What can the grid do for you?
