Justin Francis Self-Portrait

Friday, August 17, 2007

Database Version Control

This first post will provide an overview of the automatic upgrading and downgrading tool for the database we have built into our application at work. The next will reflect on the advantages and disadvantages of this approach.

I have already posted on the importance of versioning the database. This post describes the next step we took in the same vein which was the automatic running of the versioned SQL change scripts to upgrade the database from one version to the next.

This was not as easy as I would have thought.

Upgrading to a new version of software itself is pretty trivial; all you do is replace the static environment (the code) and reboot. The trickiness is upgrading the dynamic environment (the data). Previously, all of this was done manually by a Database Administrator during the launch.

The implementation of the auto-deployment provides three major functions: upgrade(), rollback() and redeploy(), one of which can be run on system startup to ensure the database and code version match.

Upgrade will take the latest change scripts from source control, run them, and then insert a new row into a version table in the database, along with the script it ran, as well as the rollback scripts that undo the upgrade.

The rollback scripts must be stored at upgrade time because when it comes to rollback, we need to rollback not what is currently under source control, but what was run when the last upgrade happened. In addition, if you only store the rollback scripts for the current version, when you rollback (see the deployer, below) you are running a previous version of the software, which does not have access to the future version's rollbacks under source control.

Rollback will simply run the rollback scripts stored in the version table for the latest version, then delete the row from the table.

Redeploy will run the rollback, then run the upgrade and re-insert the version row into the version table. This is extremely useful during development when new database changes are coming in over the course of an iteration, and a single upgrade is insufficient.

Which of the three functions gets run on server startup is up to the deployer(). The deployer checks the database version (stored in a table in the database) versus the code version (stored in __version__ for python).

If the code version is not in the version table, an upgrade must be done. If the code version is in the version table, but the current code version differs from the database version, a rollback must be run.

Finally, if the versions match, but the upgrade script stored in the database does not match the ones under source control, a redeploy is performed.

Each of these operations is wrapped in a transaction. The first problem we ran into was how to handle bad change scripts. In these cases, the upgrade would fail half-way through, and the database would not be upgraded, but it would not be in its original state either. We immediately wrapped all three of the operations in transactions to ensure this horribly messy, horribly frequent problem did not re-occur.

One of the major unsolved problems we still have is the problem of bad rollbacks. If a bad rollback is committed, but with a valid upgrade, the rollback script is inserted with the upgrade into the database version table successfully. Then when the rollback is actually tried later, it fails, and there is no way to fix it because the rollback is already in the database. Our workaround is to simply replace the rollback script directly in the database with the latest from source control.

The next post will come to a conclusion about whether all of this is worth it, and how much it really does cost.

Sunday, August 5, 2007

Unbelievers

Introducing process into a company is always a slow and difficult process. I have been gradually introducing agile processes into my current company over the last two years. This week, however, I realised that the work will never be complete. In a sentence, there will always be unbelievers. There will always be those people who just don't like process; who simply cannot work with structure. They are usually the same people who do not prepare for meetings, who don't read long emails, and who like the idea that when they say things, others jot those things down and get them done right away. The good news is there are ways to handle these people.

First, convert as many people into believers as possible (whether from agnostics or otherwise). Early in the project, target a specific group or department. Then using the process, show them it works and show them how they can be involved and how that benefits everyone. The more believers you have, the easier it is to convince others of the merits of the process. I have found that these believers are often stronger advocates for structure than even myself. They see the way it used to work and they see how much better things are once they started using the process. They understand in a very concrete way how others' jobs may be improved by the same structure. Many of these believers even begin to take attacks on the process personally, and there is no better advocate than that (not that we would ever discourage criticism of our process).

Second, strictly enforce the process for the unbelievers. Ruthlessly enforce meeting agendas, only accept feature requests at the appropriate time. In other words force the unbelievers to use the process to have anything done at all. Once you see that they understand and may start using the process (or have admitted to violating the process), start relaxing the enforcement. Show them that things work easier when they at least try to work within the formal structure. Nobody likes a fascist development team, but it is critical that you force the unbelievers to start using the process, because if you don't, they will continue to circumvent it forever.

Finally, relax the process for believers. A methodology exists to enable, not to restrict. There are certain things that should rarely be compromised. Release only on iteration boundaries, is a good example. Yet we routinely release patches if they are important enough. The reason is practical; a patch represents little risk, yet great benefit. In addition, if you are impractical in your enforcement of the process, you may start losing believers. You make exceptions for believers because they know and love the process; it is just that in this case, it broke down for them, or they made a mistake. The point is they are not trying to undermine the structure and that means they are working with you.

At the end of the day, you are just looking for people who are, for the most part, working with you, working with your process, and helping you to deliver software. For this to happen, you need to deal with the unbelievers by punishing them, rewarding those who change, and keeping your true believers happy.