Technical and Scientific Computing with Grid Engine: Who Cares What's inside a Cloud?

By Roderick Flores

When I consider my microwave, telephone, or television I see fairly sophisticated applications that I simply plug into service providers and get useful results. If I choose to switch between individual service providers I can do so easily (assuming certain levels of deregulation of utility monopolies of course). Most importantly, while I understand how these appliances work, I would never want to build one myself. Yet I am not required to do so because the providers use standardized interfaces that appliance manufactures can easily offer: I buy my appliances as I might any other tool. Consequently, I can switch out the manufacturer or models for each of the services I use without interacting with the provider. I use these tools in a way that makes my work and life more efficient.

Nobody listens in on my conversations, nor do they receive services at my expense, I can use these services how I wish, and because of competition, I can expect an outstanding quality of service. At the end of the month, I get a bill from my providers for the services I used. These monetary costs are far outweighed by the convenience these services offer.

It is this sort of operational simplicity that motivated the first call for computational power as a utility in 1965. Like the electrical grid, a consumer would simply plug in their favorite application and use the compute power offered by a provider. Beginning in the 1990s, this effort centered around the concept of Grid computing.

Just like the early-days of electricity services, there were many issues with providing Grid computing. The very first offerings were proprietary or narrowly focused. The parallels with the electric industry are easily recognized. Some might provide street lighting whereas others would provide power for home lighting and still others for transportation and yet another group industrial applications. Moreover, each provider used different interfaces to get the power. Thus switching between providers, not a rare occurrence in a volatile industry, was no small undertaking. This, clearly was very costly for the consumer.

It took an entrepreneur to come to the industry and unify electrical services for all applications while also creating a standardized product (see http://www.eei.org/industry_issues/industry_overview_and_statistics/history for a quick overview). Similarly several visionaries had to step in and define what a Grid computer needed to do in order to create a widely consumable product. While these goals were largely met and several offerings became very successful, Grid computing never really became the firmly rooted utility-like service that we hoped for. Rather, it seems to have become an offering for specialized high-performance computing users.

This market is not the realm of service that I started thinking about early in this post. Take television service: this level of service is neither for a single viewer nor a small-business who might want to repackage a set of programs to its customers (say a sports bar). Rather it is for large-scale industries whose service requirements are unimaginable by all but a few people. I cannot even draw a parallel to television service. In telecommunication it would be the realm of a CLEC.

Furthermore, unlike my microwave, I am expected to customize my application to work well on a grid. I cannot simply plug it in and get better service than I can from my own PC. It would be the equivalent of choosing to reheat my food on my stove or building my own microwave. You see, my microwave, television service, and phone services are not just basic offerings of food preparation, entertainment, and communication. Instead, these are sophisticated systems that make my work and life easier. Grid computing, while very useful, does not simplify program implementation.

So in steps cloud computing: an emerging technology that seems to have significant overlap with grid computing while also providing simplifying services (something as a service). I may still have to assemble a microwave from pre-built pieces but everything is ready for me to use. I only have to add my personal touches to assemble a meal. It really isn't relevant whether the microwave is central to the task or just one piece of many.

When I approach a task that I hope to solve using a program, how might I plug that in just as easily? Let's quickly consider how services are provided for television. When I plug my application(TV) in to the electricity provider as well as a broadcaster of some sort, it just works. I can change the channel to the streams that I like. I can buy packages that provide me the best set of streams. In addition, some providers will offer me on-demand programming as well as internet and telephone services. If anything breaks, I call a number and they deal with it. None of this requires anything of me. I pay my bill and I get services.

Okay, how would that work for a computation? Say I want to find the inverse for a matrix. I would send out my data to the channel that inverted matrices the way I like them. The provider will worry about attaining the advertised performance, reliability, scalability, security, sustainability, device/location independence, tenancy, and capital expenditure: those characteristics of the cloud that I could not care less about. Additionally, the cloud properties that Rich Wellner assembled don't interest me much either. Certainly they may be differentiators, but the actual implementation is somebody else's problem in the same way that continuous electrical service provision is not my chief concern when I turn on the TV. What I want and will get is an inverse to the matrix I submitted in the time frame I requested deposited where I requested it to be put. I may use the inverted matrix to simultaneously solve for earthquake locations and earth properties or for material stresses and strains in a two-dimensional plate. That is my recipe and my problem.

After all, I should get services "without knowledge of, expertise with, or control over the technology infrastructure that supports them," as the cloud computing wiki page claims. Essentially the aforementioned cloud characteristics are directed towards service providers rather than to the non-expert consumer that highlights the wiki definition. Isn't the differentiator between the Cloud and the Grid the concealment of the complex infrastructure underneath? If the non-expert consumer is expected to worry about algorithm scalability, distributing data, starting and stopping resources and all of that, they certainly will need to gain some expertise quickly. Further, once they have that skill, why wouldn't they just use a mature Grid offering rather than deal with the non-standardized and chaotic clouds? Are these provider-specific characteristics not just a total rebranding of Grid?

As such, I suggest that several consumer-based characteristics should replace the rather inconsequential provider-internal ones that currently exist.

A cloud is characterized by services that:

use a specified algorithm to solve a particular problem;
can be purchased for one-time, infrequent use, or regular use;
state their peak, expected, and minimum performances;
state the expected response time;
can be queried for changes to expected response time;
support asynchronous messaging. A consumer must be able to discover when things are finished;
use standard, open, general-purpose protocols and interfaces (clearly);
have specified entry-points;
can interact with other cloud service providers. In particular, a service should be able to send output to long-term cloud-storage providers;

Now that sounds more like Computation-as-a-Service.

Technical and Scientific Computing with Grid Engine

Thursday, November 6, 2008

Who Cares What's inside a Cloud?

No comments:

Post a Comment