Justin Francis

Friday, November 16, 2007

Ordering work in an iteration

Recently, we have been in a velocity crunch at work. Our velocity plummeted 50% in a single iteration. The result (relevant to this post) is that we missed our target for the iteration by a big margin. Luckily, we were able to mitigate the impact and the stakeholders did not notice a huge discrepancy. This experience did, however, get me thinking about the best way to order work in an iteration to minimize the risk of missing deadlines, and to minimize the impact on stakeholders when you do.

Ordering tactics to mitigate the risk of not meeting an iteration deadline:


  • Begin with the largest tickets

  • Begin with user-visible tickets

  • Begin with highest priority tickets

  • Begin with tickets whose component is least familiar



One should begin with the largest tickets because they generally have the most unknowns. Their estimates are the roughest, so the time they will take to complete is the most uncertain. They are also more difficult, and so carry more risk.

By starting with user-visible tickets, the likelihood of having to push a stakeholder's ticket at the end of the iteration decreases. It is much easier politically to push a ticket that the stakeholder will not see for an iteration or two (or ever if it is some kind of internal ticket).

Starting with the highest-priority tickets reduces the risk that a critical ticket will need to be pushed at the end of the iteration, because by that point they have all been completed. Stakeholders are much more understanding when you push a ticket that can wait an iteration than when you push a critical bug fix.

Finally, if a developer has little experience with a certain component or domain (or the whole team, if nobody has experience), they should not attempt to complete a ticket in that domain at the end of the iteration. It will take them longer than a developer with experience to do the same work. Having the developer make the attempt at the beginning of the iteration ensures extra time if needed, and extra help, which is usually unavailable in the final two-day sprint.
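As a concrete sketch, the four tactics can be folded into a single sort key. The Ticket class, its fields, and the lexicographic order of the criteria below are my own assumptions for illustration; they are not part of any tool we actually use.

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    name: str
    points: int          # size estimate; bigger means more unknowns
    user_visible: bool   # will a stakeholder actually see this work?
    priority: int        # 1 = critical ... 5 = nice-to-have
    familiarity: int     # 1 = unknown component ... 5 = well-known

def iteration_order(tickets):
    """Start the iteration with the riskiest work: large, user-visible,
    high-priority tickets in unfamiliar components come first."""
    return sorted(
        tickets,
        key=lambda t: (-t.points, not t.user_visible, t.priority, t.familiarity),
    )

backlog = [
    Ticket("report export",    points=8,  user_visible=True,  priority=2, familiarity=4),
    Ticket("db index tweak",   points=2,  user_visible=False, priority=3, familiarity=5),
    Ticket("new billing rule", points=13, user_visible=True,  priority=1, familiarity=1),
]
print([t.name for t in iteration_order(backlog)])
# ['new billing rule', 'report export', 'db index tweak']
```

Which criterion should dominate is a judgment call; the sketch simply ranks size first, then visibility, priority and familiarity.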

Following these guidelines has helped us to all but eliminate the need to push important features from an iteration due to lack of time, a welcome political victory for any agile development team.

Monday, October 29, 2007

QA is Expensive

Is it "obious day at camp stupid?" Maybe, but quality assurance is still expensive, and people (especially stakeholders) sometimes like to forget this fact. In this context I am using QA to refer to the final testing of a build as a whole. Our team does not have dedicated QA staff, so every two week iteration the entire team takes from one to two days to test the build. That is 10%-20% of the total effort of an iteration. Read that line again.

Stakeholders, however, are still very angry (understandably) when a bug makes it into the production system. On any given iteration, we usually have a patch after the fact that fixes something, though usually something minor. I bring it up because that is our strategy for making up for the deficiency in the QA effort: let the users test it.

It sounds horrible, but the best testers of any system are real users trying to use the system in the real world. They find bugs ridiculously fast. This might lead you to the idea of having users test a preview of the release. It is a good idea, but it does not work for business applications, because there is usually only a single instance of the production software running at a time.

Unfortunately, there really is no alternative except to spend more money testing a build. Upper management is not going to fork over the money unless there really is a need to be 99% bug-free on delivery day. This is usually not the case unless you are shrink-wrapping. And let's face it, you're not.

If that is not enough to dissuade you: in addition to the extra money, if you are looking at a dedicated QA staff, you will also have extra lag time between the finishing of a build and its delivery (you cannot test a build before it is finished, at least not the kind of QA I am talking about here). The QA staff must be handed the build, and the development team must be handed back a list of bugs, at which point the process repeats. In the meantime, the team has moved on to a new build and is no longer focused on the old one. So builds end up being delivered half-way through an iteration instead of on iteration boundaries.

I have found that if you patch any bugs the users do find (that are important, see my last post) in a reasonable time and with a reasonable attitude ("thanks for reporting that", "must have slipped past our two days of testing"), the users will not mind. Instead they will worship the ground you walk on for reducing QA time and giving them more effort to spend on new development. Pause not.

Friday, October 12, 2007

Fixing Bugs is not Free

When wandering the halls, I will often hear comments from users about little bugs (usually display bugs) and I tell them straight up that in all likelihood, the bug will never be fixed. The typical response to this is a gasp, followed by a smug look that means something along the lines of "I could write software better than these amateurs."

I have also told developers who report small bugs that "we'll wait for a user to report that," with similar results. I then have a choice to make: try to convince them I actually know what I am doing, or leave them thinking I am a buffoon. Here is the argument I make.

Fixing bugs is just like building new features. It is not free. Each bug users want fixed costs effort (points, in our agile methodology) to fix. Bugs are usually much cheaper to fix than new features are to build, but the cost is certainly not insignificant.

If bugs cost effort to fix just like anything else, then they must be estimated and scheduled just like everything else. This is where the key lies. When confronted with the choice between refining an existing feature (let alone fixing a bug) and creating a new feature, stakeholders will almost always opt for the new feature (this leads to a kind of functional but rough monolith, but that is another post). This means that bugs, especially ones that don't really hurt anybody, are the least likely items to get scheduled. And so they don't.

I should make a note about critical bugs. If a critical bug (one that prevents an important feature from working and has no workaround) is found, we fix it immediately (forget iterations), but even these are not free. After the fact, we estimate the fix and then push an appropriate number of items from the current iteration to make room for the bugfix, just as if a stakeholder had scheduled it.

Surprisingly, systems I have built using this strategy are not as buggy as one would expect, though that probably has more to do with Test Driven Design than anything else. The point is that if you do things properly, this strategy not only works but works well. We hardly ever schedule bug-fixes at work, and when we do, they are usually almost as large as features.

Once this is explained, I wait for a few weeks and then circle back. The person in question is usually impressed with the features we have delivered in that time and is not concerned about that bug that they don't even notice anymore.

Sunday, September 23, 2007

Multi-Project Agile

We have just inherited a new project, with an existing codebase and existing developers. The team has grown, and so has the size of the codebase we need to maintain. We are, however, facing some questions about how to plan for both our primary project and this secondary project, which is being end-of-lifed. We are not talking about two major components of the same application; the two projects have different architectures, languages, stakeholders and developer expertise.

A few smaller decisions before we get to the big problem. We decided to have the iterations for both projects operate on the same schedule to reduce problems with planning and tracking velocity (described below). We also kept the code repositories and project planning software instances separate.

The big problem is how do we apply agile planning to both of these projects with a single team? From what I can see there are two major ways to tackle the problem. The first is to divide the team in some way (by day, by developer, etc). The second is to ignore that the second project is different, and plan the team's time as a single entity.

There are a number of ways to divide the team. We could devote all of a developer's time to a single project. We could also rotate developers a day or a week at a time. The second would be preferable because then all developers are exposed to both projects. This is essentially creating two teams, whether actual groups of developers or simulated groups of developers (divided by time). We would then have two iteration planning meetings, one for each block of time. The problem with this approach is that the stakeholders cannot control the amount of effort spent on each project. Because there are conceptually two teams with two velocities, they must plan them separately and make concessions separately.

Intuitively, however, I think the second option holds more promise. In this scenario, the extra project works just like a separate component of the existing project. The team has a certain velocity, which the stakeholders can apply to both projects as they feel is appropriate. This means a single iteration planning meeting with a single velocity with the union of all stakeholders to plan the team's iteration. The major problem with this is that it is dirty for developers and planners. Developers have more severe context-switching and planners will probably need tools that can do multi-project planning to track velocity at the team level instead of project level.
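To illustrate what team-level tracking means in practice, here is a minimal sketch of velocity bookkeeping across both projects. The data structures, project names and numbers are made up for illustration, not taken from any tool we actually use.

```python
from collections import defaultdict

def team_velocity(completed_tickets):
    """Sum completed points across all projects to get a single team velocity,
    while keeping a per-project breakdown for the stakeholders."""
    per_project = defaultdict(int)
    for project, points in completed_tickets:
        per_project[project] += points
    return sum(per_project.values()), dict(per_project)

# One iteration's completed work, spread over both projects.
completed = [("primary", 5), ("primary", 8), ("legacy", 3), ("legacy", 2)]
velocity, split = team_velocity(completed)
print(velocity)  # 18 -- the single number used to plan the next iteration
print(split)     # {'primary': 13, 'legacy': 5} -- how the effort was divided
```

The single velocity is what the combined group of stakeholders plans against; the split is just reporting.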

In the end, we have opted for option 2 because of the flexibility it adds. It will be rough for us internally until we can get things running smoothly. It is a crummy situation all around to have to maintain an inherited system. Planners and developers will hurt, but it is an experience everyone should go through (just like working on that crummy system) to drive home the principles put forth in agile methodologies and in simply good programming practices.

Saturday, September 8, 2007

Is Database Version Control Worth It?

In this second post on the subject of automatic database upgrading, I discuss the merits of the system whose implementation is described in the previous post.

I won't be circumspect; it is not cheap to make auto-deployment work. It took one programmer two weeks to implement, and we probably spend about one day a month maintaining the system. In the end, it comes down to one fundamental factor: the number of environments to which you deploy. But first, the pros and cons.

The biggest advantage of this system is that deployments can be fully automatic. They can be scheduled to happen at a specific time, and everything will get upgraded. No waiting on the DBA, or ensuring the changes are all correct and in the proper spot.

Similarly, the deployment of the new "database version" becomes as solid and consistent as the deployment of the code. The deployment of the database necessarily becomes part of the testing phase of an iteration. This means that deployments are more often successful because they are better controlled and better tested.

The one big disadvantage is complexity. There is a lot of complexity in maintaining a database version. I am not convinced, however, that this complexity is due to the automatic deployment. Rather, I think that the deployment merely exposes problems that would otherwise be hidden when deployment is done manually.

For example, the system may boot fine even though a particular rollback script was never run, but the deployer will stop the boot because the upgrade will fail without that rollback. This would be hidden in a manual deployment, but it is exposed during an automatic one.

But by far the biggest determining factor is the number of instances you need to deploy to. Just on the development side of things, we have a unit-testing database, a deployment database (to unit-test the upgrade scripts), the full development database, the trimmed (lean data) development database and a number of personal development database instances. Then there are the testing, staging and production databases.

If a developer makes a change to the database, they must publish this change (run it) on each of those databases. If they don't, tests will begin to fail and servers will fail to start as others upgrade to the latest source, which does not yet have the required database support. If the developer does not run the changes on all the databases, it is left to other developers to run them once they figure out why their servers will not boot.

With the automatic deployment, none of this is necessary. Upgrading to the latest version of the source will also upgrade any databases being used to the appropriate version.

For us, with only a half-dozen databases, it is worth it. It is worth it twice over. I never have to come in during the pre-dawn hours to launch a version, and I never have to tell the other guys they need to run change scripts as they appear. My code changes and database changes have been harmonized. They both work in the same way.

Everything just works. And that is the best compliment a user can give to a piece of software.

Friday, August 17, 2007

Database Version Control

This first post will provide an overview of the automatic upgrading and downgrading tool for the database we have built into our application at work. The next will reflect on the advantages and disadvantages of this approach.

I have already posted on the importance of versioning the database. This post describes the next step we took in the same vein, which was the automatic running of the versioned SQL change scripts to upgrade the database from one version to the next.

This was not as easy as I would have thought.

Upgrading to a new version of the software itself is pretty trivial; all you do is replace the static environment (the code) and reboot. The trickiness is in upgrading the dynamic environment (the data). Previously, all of this was done manually by a Database Administrator during the launch.

The implementation of the auto-deployment provides three major functions: upgrade(), rollback() and redeploy(), one of which can be run on system startup to ensure the database and code versions match.

Upgrade will take the latest change scripts from source control, run them, and then insert a new row into a version table in the database, storing the scripts it ran as well as the rollback scripts that undo the upgrade.

The rollback scripts must be stored at upgrade time because when it comes to rollback, we need to rollback not what is currently under source control, but what was run when the last upgrade happened. In addition, if you only store the rollback scripts for the current version, when you rollback (see the deployer, below) you are running a previous version of the software, which does not have access to the future version's rollbacks under source control.

Rollback will simply run the rollback scripts stored in the version table for the latest version, then delete the row from the table.

Redeploy will run the rollback, then run the upgrade and re-insert the version row into the version table. This is extremely useful during development when new database changes are coming in over the course of an iteration, and a single upgrade is insufficient.
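Since these posts never show the actual code, here is a minimal sketch of what the three operations might look like. The db_version table, the way scripts are passed in as lists of statements, and the use of sqlite3 as a stand-in database are all assumptions for illustration, not the real implementation.

```python
import sqlite3

# Hypothetical version table; the posts do not show the real schema.
SCHEMA = """CREATE TABLE IF NOT EXISTS db_version (
    version TEXT PRIMARY KEY,
    upgrade_sql TEXT NOT NULL,
    rollback_sql TEXT NOT NULL
)"""

def upgrade(conn, code_version, upgrade_statements, rollback_statements):
    """Run the change scripts from source control and record them, together
    with their rollbacks, in the version table."""
    with conn:  # commit everything together (sqlite3 is only a stand-in here;
                # true transactional DDL depends on the database you use)
        for stmt in upgrade_statements:
            conn.execute(stmt)
        conn.execute(
            "INSERT INTO db_version (version, upgrade_sql, rollback_sql) "
            "VALUES (?, ?, ?)",
            (code_version,
             ";\n".join(upgrade_statements),
             ";\n".join(rollback_statements)))

def rollback(conn):
    """Run the rollback stored for the latest version, then delete its row."""
    with conn:
        version, rollback_sql = conn.execute(
            "SELECT version, rollback_sql FROM db_version "
            "ORDER BY rowid DESC LIMIT 1").fetchone()
        for stmt in rollback_sql.split(";\n"):
            conn.execute(stmt)
        conn.execute("DELETE FROM db_version WHERE version = ?", (version,))

def redeploy(conn, code_version, upgrade_statements, rollback_statements):
    """Rollback, then upgrade again; useful while change scripts are still
    being revised during an iteration."""
    rollback(conn)
    upgrade(conn, code_version, upgrade_statements, rollback_statements)
```

The important point is that the rollback scripts are copied into the version table at upgrade time, so the rollback never depends on what currently sits under source control.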

Which of the three functions gets run on server startup is up to the deployer(). The deployer checks the database version (stored in a table in the database) against the code version (stored in __version__ in Python).

If the code version is not in the version table, an upgrade must be done. If the code version is in the version table, but the current code version differs from the database version, a rollback must be run.

Finally, if the versions match, but the upgrade script stored in the database does not match the one under source control, a redeploy is performed.
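The decision itself can be written as a small pure function. Again, this is a sketch: the parameter names and the idea that the version table keeps one row per applied upgrade are assumptions carried over from the sketch above.

```python
def choose_action(code_version, applied_versions, stored_upgrade_sql, vcs_upgrade_sql):
    """Decide what the deployer should do on startup.

    applied_versions   -- versions in the database's version table, oldest first
    stored_upgrade_sql -- upgrade script recorded for the latest applied version
    vcs_upgrade_sql    -- upgrade script currently under source control
    """
    if code_version not in applied_versions:
        return "upgrade"     # the database is behind the code
    if code_version != applied_versions[-1]:
        return "rollback"    # the code is behind the database
    if stored_upgrade_sql != vcs_upgrade_sql:
        return "redeploy"    # same version, but the script has changed
    return "none"            # database and code already match

# e.g. booting version 1.4 against a database last upgraded to 1.3:
print(choose_action("1.4", ["1.2", "1.3"], "ALTER ...", "ALTER ..."))  # upgrade
```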

Each of these operations is wrapped in a transaction. The first problem we ran into was how to handle bad change scripts: the upgrade would fail half-way through, leaving the database not upgraded, but not in its original state either. We immediately wrapped all three operations in transactions to ensure this horribly messy, horribly frequent problem did not recur.

One of the major unsolved problems we still have is bad rollbacks. If a bad rollback script is committed alongside a valid upgrade, the pair is inserted into the database version table successfully when the upgrade runs. Then, when the rollback is actually attempted later, it fails, and there is no clean way to fix it because the broken script is already stored in the database. Our workaround is to simply replace the rollback script directly in the database with the latest from source control.

The next post will come to a conclusion about whether all of this is worth it, and how much it really does cost.

Sunday, August 5, 2007

Unbelievers

Introducing process into a company is always slow and difficult. I have been gradually introducing agile processes into my current company over the last two years. This week, however, I realised that the work will never be complete. In a sentence: there will always be unbelievers. There will always be those people who just don't like process; who simply cannot work with structure. They are usually the same people who do not prepare for meetings, who don't read long emails, and who like the idea that when they say things, others jot those things down and get them done right away. The good news is there are ways to handle these people.

First, convert as many people into believers as possible (whether from agnostics or otherwise). Early in the project, target a specific group or department. Then using the process, show them it works and show them how they can be involved and how that benefits everyone. The more believers you have, the easier it is to convince others of the merits of the process. I have found that these believers are often stronger advocates for structure than even myself. They see the way it used to work and they see how much better things are once they started using the process. They understand in a very concrete way how others' jobs may be improved by the same structure. Many of these believers even begin to take attacks on the process personally, and there is no better advocate than that (not that we would ever discourage criticism of our process).

Second, strictly enforce the process for the unbelievers. Ruthlessly enforce meeting agendas; only accept feature requests at the appropriate time. In other words, force the unbelievers to use the process to get anything done at all. Once you see that they understand and may start using the process (or have admitted to violating it), start relaxing the enforcement. Show them that things work more easily when they at least try to work within the formal structure. Nobody likes a fascist development team, but it is critical that you force the unbelievers to start using the process, because if you don't, they will continue to circumvent it forever.

Finally, relax the process for believers. A methodology exists to enable, not to restrict. There are certain things that should rarely be compromised; releasing only on iteration boundaries is a good example. Yet we routinely release patches if they are important enough. The reason is practical; a patch represents little risk, yet great benefit. In addition, if you are impractical in your enforcement of the process, you may start losing believers. You make exceptions for believers because they know and love the process; it is just that in this case, it broke down for them, or they made a mistake. The point is they are not trying to undermine the structure, and that means they are working with you.

At the end of the day, you are just looking for people who are, for the most part, working with you, working with your process, and helping you to deliver software. For this to happen, you need to deal with the unbelievers by punishing them, rewarding those who change, and keeping your true believers happy.