Justin Francis Self-Portrait

Monday, April 23, 2007

Pile of Scripts vs Application Server

When I first started doing some work with Java servlets, what struck me immediately was that the entire web application has a shared state that is wholly contained within itself. This differs from say, a pile of PHP scripts, which do not share any state except what they store in the database (ignoring sessions). More recently, I moved into Python where both were possibilities, but we chose to have a single Python process handle all incoming web requests, yielding an application server instead of a pile of scripts.

This was probably the most fundamental complaint I had with many scripting languages, but to my surprise, I could not find a single online discussion of this issue. So to break the ice, I believe there are four reasons why an "In-Memory" model makes things easier for developers: lower impedance mismatch, native messaging technique, DB as secondary system, and theoretical performance.

Assuming you are using a Domain Model, it is much more natural to assume "Identity Integrity" in your application; by which I mean if we have an object that represents a specific user in our application, it is much easier to understand the system as dealing the same single object instance instead of dealing with multiple copies that get synchronized with each other through the database on each request.

If you don't have any shared memory native to your application that persists between requests, then fundamentally, you are sending messages through the next best thing: the database. It seems that it is much more efficient and powerful to send messages using the native framework without requiring that message to pass through the database, with all the restrictions that may imply. While rather abstract, this difference may have more of an impact than you think as I describe next.

In an application server with a single state kept resident in the running instance itself, the database takes on very little significance. It literally becomes a secondary system that you use only to ensure that the next time the system comes up, it will be in roughly the state it was in when it went down. In other words, it is used for persistence only. This is important because it alters your perspective, which may alter the way you design your system.

Last, and probably least (as this is by no means the only factor, or even the most important factor) is the fact that if you are hitting the database to retrieve state on each request, you are doing way more work (and so is your database) than you need to. Who knows how much faster your system would run if instead of reading from a DB each time a page loads, it just looks up a username in an array in memory?

I guess the fundamental difference between these two is that with a pile of scripts, the DB is used to maintain application state as well as persist that state, whereas with a single-process, in-memory application, the DB is only used to persist changes to the application state which is just "there". I think you would be surprised at how liberating that concept is.

1 comment:

Anonymous said...

Saw your site bookmarked on Reddit.I love your site and marketing strategy.Your site is very useful for me .I bookmarked your site!
I am been engaged 5 years on the and loans If you have some questions, please get in touch with me.