Enemy of the State

Our industry is driven by some of the smartest people in the world. And these folks are not afraid to make mistakes and change their minds. This is a unique situation. Typically, the foundations of other scientific disciplines do not undergo constant upheaval. For example, Sir Isaac Newton defined our universe and the rules governing physics for over two centuries, until a new thinker named Albert Einstein revolutionized physics.

In the computer science domain, "absolute truths" have half-lives of about five years -- although thanks to the maintenance contract, no truth ever actually dies. For example, those of you who know object-oriented programming principles by heart understand that an object is supposed to encapsulate its data. From encapsulation, we infer that an object holds the data that makes it unique, and that the data has specific values that define the object’s current state. But like a band of Relativists rumbling with a horde of Newtonians, the idea of programming objects with state, or stateful objects, is coming under attack.

State is quickly becoming a bad word in the distributed object world because of the premise that stateful objects do not scale well. Here's the abbreviated case against stateful distributed objects. If a distributed object has state, then once an object is instantiated it consumes server resources for its entire lifetime. The server cannot destroy the object when the server needs to give resources to another object or the state information would be lost. Load balancing between multiple servers is undermined because the client of a stateful object must always reconnect to the same server -- the only server that knows the state of that object. Ergo, distributed objects should not have state.

Stateless objects, on the other hand, can be destroyed as soon as a client request is completed. Because the object contains no state, the server can always reconstitute the object again if necessary. A more advanced approach is to create a pool of stateless objects on the server that wait to satisfy a client request. And since the objects are stateless, the client can be connected to any of the pooled objects and be assured of the same result. Because the objects are already created, the overhead of constructing and destroying objects is eliminated. The result is excellent scalability.
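To make the pooling idea concrete, here is a minimal Python sketch. The names PriceCalculator and ObjectPool are invented for illustration and do not come from any particular application server; the point is only that a worker holding no per-client data can be lent to any request and handed back without cleanup.

```python
import queue


class PriceCalculator:
    """A stateless worker: it keeps no per-client data between calls."""

    def total_cents(self, unit_price_cents, quantity):
        # Everything needed arrives as arguments; nothing is stored on self.
        return unit_price_cents * quantity


class ObjectPool:
    """Builds a fixed number of workers up front and lends them out per request."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    def handle(self, method_name, *args):
        worker = self._idle.get()       # borrow whichever worker is idle
        try:
            return getattr(worker, method_name)(*args)
        finally:
            self._idle.put(worker)      # hand it back; there is no state to reset


# Five identical requests served by a pool of three interchangeable workers.
pool = ObjectPool(PriceCalculator, size=3)
results = [pool.handle("total_cents", 1999, 3) for _ in range(5)]
```

Every entry in results is the same value no matter which pooled worker served the call, which is the property that lets a server route a client to any instance.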

Excellent scalability didn’t matter as much in the past as it does today, when almost every business is attempting to Web-enable its business information assets. When a business offers Web access to its information, the transactional load on the database may easily increase by a few orders of magnitude. Web servers are responsible for much of the interest in stateless computing. The stateless nature of the Web is precisely what allowed it to grow and scale so effectively. If Web access followed a conversational model instead, where the connection between the client and server remained open throughout a visit to a Web site, the Internet would be a huge logjam.

If you’ve never experienced stateless programming, you might understandably ask, "What good is an object without state?" In fact, stateless is a bit of a misnomer. Externalized state is a better description. The state of an object is stored outside of the server and can be injected into the stateless object as the client makes a request, temporarily giving the object the appearance of state. Stateless objects are like object shells, waiting for someone else -- usually the client or a database -- to provide the inside. A dynamic Web page is a great example of a stateless shell. That nasty URL, with all its embedded special characters, is simply a way to provide your current state to the Web server while requesting the next Web page, so the server can return a page that seems to be designed for you. In fact, it simply returned the same thing it will always return whenever that exact state information is provided as input. Other mechanisms, such as browser cookies, can also provide state to a Web server’s stateless shell.
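The URL-as-state idea can be sketched in a few lines of Python using the standard urllib.parse module. The render_page function, the /catalog path, and the user and page parameters are all hypothetical, chosen only to show a handler that remembers nothing between requests.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def render_page(url):
    """Stateless handler: all of the caller's state comes from the query string."""
    query = parse_qs(urlparse(url).query)
    user = query.get("user", ["guest"])[0]
    page = int(query.get("page", ["1"])[0])
    body = f"Hello {user}, this is page {page}."
    # Embed the caller's next state in the link for the following request.
    next_url = "/catalog?" + urlencode({"user": user, "page": page + 1})
    return body, next_url


body, next_url = render_page("/catalog?user=Ann+Lee&page=2")
```

Calling render_page again with the same URL returns the same pair, because the function keeps nothing on the server side; the "personalized" page is just a function of the state the client supplied.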

For those of us who were taught to minimize function calls to improve performance, it seems counter-intuitive that the best performing distributed objects require not only a function call, but also one that transfers the entire state of the object each time. Certainly, it behooves us to keep this state package as small as possible. Another approach is to simply pass the ID of the object instance to the server and force the server to retrieve the state for that object from persistent storage. On the surface, this merely trades communication cost for retrieval cost, but passing by ID can enable the server to cache the most recently used object states for immediate retrieval.
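The pass-by-ID approach with a server-side cache might look something like the following Python sketch. The DATABASE dictionary stands in for persistent storage, and StateCache and get_balance are hypothetical names; the eviction policy shown is a simple least-recently-used scheme, which is one reasonable reading of "most recently used object states" rather than the only one.

```python
from collections import OrderedDict

# Stand-in for persistent storage (made-up data for illustration).
DATABASE = {
    101: {"owner": "Ann", "balance": 250},
    102: {"owner": "Bob", "balance": 75},
}


class StateCache:
    """Keeps the most recently used object states in memory, up to a capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()
        self.loads = 0                  # counts trips to persistent storage

    def get(self, object_id):
        if object_id in self._entries:
            self._entries.move_to_end(object_id)   # mark as recently used
            return self._entries[object_id]
        self.loads += 1
        state = dict(DATABASE[object_id])           # simulated retrieval
        self._entries[object_id] = state
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)       # evict least recently used
        return state


cache = StateCache(capacity=1)


def get_balance(object_id):
    """Stateless request handler: the client supplies only the object's ID."""
    return cache.get(object_id)["balance"]


first = get_balance(101)    # loads the state from storage
second = get_balance(101)   # served from the cache
```

The client sends only a small ID, and the server pays the retrieval cost only on a cache miss. Note that the cache is an optimization, not the home of the state: any entry can be evicted and reloaded later without losing information.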

Will objects with internalized state go the way of the dodo? While some zealots compare stateful distributed objects to the spaghetti-code-producing GOTO statement, state will continue to play a major role in client-side computing. Microsoft’s MTS supports both types of objects, but it strongly recommends that stateless objects be used for scalability. In my work, both Web and application server solutions require working with stateless servers, making me a recently christened enemy of the state.

--Eric Binary Anderson is a development manager at PeopleSoft's PeopleTools division (Pleasanton, Calif.) and has his own consulting business, Binary Solutions. Contact him at ebinary@yahoo.com.