Justin Francis Self-Portrait

Saturday, December 22, 2007

Overlooked Developer Qualities

There is a lot more to a developer than the ability to write code, or even to design software. I want to emphasize some non-technical qualities that do not normally get the recognition they deserve, but that I have noticed increase the value of a developer. They are, in no particular order:

  • Code Memory
  • Debugging Skills
  • Attention to Detail and Thoroughness

At my job, we have a very open, spontaneous environment, and developers will routinely raise their voices and ask a general question like "did anybody change this recently?" or "what was that issue we had last week?" What amazes me is that not many developers remember how they designed something or how they solved a problem last week, let alone six months ago. It is a critical asset, therefore, to have a developer on the team with fantastic "code memory".

A developer with good code memory knows everything about how the system works, the current feature set, and current problems. In addition, they can remember how all those things have evolved over a period of months. This saves time when debugging recurring problems, answering questions from users and answering questions from developers. Every team should have this librarian-like keeper of knowledge, though ideally, this would be redundant across the entire team.

Another great quality to have as a developer is good debugging skills. To be able to quickly identify, isolate and fix problems is supremely valuable both during development of new features and during maintenance of a running system. There is nothing worse than having development slow to a crawl because you are plunging down a rabbit hole that may not be related to the problem at hand. On a running system, this skill is especially valuable as it means less downtime. Problem solving skills and code memory combine to vastly enhance this skill.

Finally, attention to detail and thoroughness make a big difference in the quality of a developer. This quality fundamentally allows a developer to be self-sufficient. Often, this skill is the difference between an intermediate developer and a senior developer. Without being able to think the entire feature through in all its detail and being sure that those details are covered by the solution, a developer cannot run projects, or even develop new features, without support from someone who does have this quality.

These non-technical skills are based largely on fundamental learning abilities that ought to be taught to everyone starting in elementary school. These skills are not as easily quantifiable as languages known or coding ability, but deserve to be recognized for their indispensable value on a dev team.

Sunday, December 2, 2007

Velocity: It's not a race

I use an agile estimation process at work. During any given two week iteration, we estimate all tasks that need to be done and then we measure how much we did at the end of the iteration. These numbers will then be used to predict how much we can do for the next two week period. This sounds simple enough, but there is a common misconception that does not escape even the most experienced agile teams. I often hear comments like "is there an estimate for that?" when expanding the scope of a ticket, or "does that fall into this ticket?"

I guess we just cannot seem to help feeling we need to maximize how much estimated work we do during an iteration. To feel this way is to fundamentally misunderstand the purpose of estimating and measuring.

The key is that we are measuring how much estimated work is being done; not how much actual work is being done. So built into each estimate is a specific amount of uncertainty about what needs to be done to complete the ticket. Generally, tickets describe end-user functionality, and so the ticket includes whatever work is needed to get the feature done, including unknowns.

But there are other reasons besides unknowns for why the actual work differs from the estimated work. Perhaps much of the work being done was never estimated at all. We only estimate features and bug fixes. There is a whole slew of other work that needs to be done on a dev team: project management, live system maintenance, refactoring, etc.

The bottom line is that as long as you are consistent about how you estimate, it does not quantitatively matter how many units of estimated effort you complete. In fact, you could double all estimates starting in the new year. All that matters is that you then measure how much you get done in an iteration. Nobody should be concerned about their velocity except insofar as it allows for the accurate prediction of how much they can tell stakeholders they can accomplish for the next iteration.
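The arithmetic behind this is trivial, but worth seeing. Here is a minimal sketch (in Python, with made-up point values) of why consistent units make the forecast immune to scaling:

```python
# A minimal sketch of velocity-based forecasting. The point values and
# function name are illustrative, not from any real planning tool.

def forecast(completed_points):
    """Predict next iteration's capacity as the mean of past iterations."""
    return sum(completed_points) / len(completed_points)

# The units are arbitrary: double every estimate and the forecast
# doubles with them, so planning is unaffected.
history = [21, 18, 24]
assert forecast(history) == 21.0
assert forecast([2 * h for h in history]) == 42.0
```

Stakeholders get the same answer to "how much can you do next iteration?" whatever the unit, as long as the unit stays consistent.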

To change an estimate during an iteration because of actual work is to change your estimating strategy, which in the end will leave your estimates less consistent, and your planning less reliable.

Friday, November 16, 2007

Ordering work in an iteration

Recently, we have been in a velocity crunch at work. Our velocity plummeted 50% in a single iteration. The result of this (relevant to this post) is that we missed our target for the iteration by a big margin. Luckily, we were able to mitigate the impact and the stakeholders did not notice a huge discrepancy. This experience did, however, get me thinking about the best way to order work in an iteration to minimize the risk of missing deadlines, and to minimize the impact on stakeholders when you do.

Ordering tactics to mitigate the risk of not meeting an iteration deadline:

  • Begin with the largest tickets

  • Begin with user-visible tickets

  • Begin with highest priority tickets

  • Begin with tickets whose component is least familiar

One should begin with the largest tickets because they are the ones that generally have the most unknowns in them. The estimate is the roughest, so the time they will take to complete is most uncertain. They are also more difficult, and so carry more risk.

By starting with user-visible tickets, the likelihood of having to push a stakeholder's ticket at the end of the iteration decreases. It is much easier politically to push a ticket that the stakeholder will not see for an iteration or two (or ever if it is some kind of internal ticket).

Starting with higher-priority tickets reduces the risk that a critical ticket will need to be pushed at the end of the iteration, because by then they have all been completed. Stakeholders are much more understanding when you push a ticket that can wait an iteration than when you push a critical bug fix.

Finally, if a developer has little experience with a certain component or domain (or the team if nobody has experience), they should not attempt to complete a ticket in that domain at the end of the iteration. It will take them longer than a developer with experience to do the same work. Having the developer make the attempt at the beginning of the iteration ensures extra time if needed, and extra help which is usually unavailable in the final two day sprint.
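The four tactics above can be condensed into a single sort key. The ticket fields and example tickets below are illustrative assumptions, not our actual planning tool:

```python
# A sketch of the four ordering tactics as one composite sort key.
# Field names and example values are made up for illustration.

def iteration_order(tickets):
    """Order tickets so the riskiest work is tackled first."""
    return sorted(
        tickets,
        key=lambda t: (
            -t["estimate"],         # largest (most uncertain) tickets first
            not t["user_visible"],  # user-visible before internal
            t["priority"],          # 1 = highest priority
            -t["unfamiliarity"],    # least familiar components first
        ),
    )

tickets = [
    {"name": "refactor", "estimate": 2, "user_visible": False, "priority": 3, "unfamiliarity": 1},
    {"name": "report",   "estimate": 8, "user_visible": True,  "priority": 1, "unfamiliarity": 3},
    {"name": "bugfix",   "estimate": 2, "user_visible": True,  "priority": 1, "unfamiliarity": 1},
]
assert [t["name"] for t in iteration_order(tickets)] == ["report", "bugfix", "refactor"]
```

In practice the tactics sometimes conflict (a tiny but critical ticket, say), so any fixed key like this is a starting point for the planning discussion, not a replacement for it.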

Following these guidelines has helped us to all but eliminate the need to push important features from an iteration due to lack of time, a welcome political victory for any agile development team.

Monday, October 29, 2007

QA is Expensive

Is it "obvious day at camp stupid?" Maybe, but quality assurance is still expensive, and people (especially stakeholders) sometimes like to forget this fact. In this context I am using QA to refer to the final testing of a build as a whole. Our team does not have dedicated QA staff, so every two week iteration the entire team takes from one to two days to test the build. That is 10%-20% of the total effort of an iteration. Read that line again.

Stakeholders, however, are still very angry (understandably) when a bug makes it into the production system. On any given iteration, we usually have a patch after the fact that fixes something, though usually something minor. I bring it up because that is our strategy for making up for the deficiency in the QA effort: let the users test it.

It sounds horrible, but the best testers of any system are real users trying to use the system in the real world. They find bugs ridiculously fast. This might suggest having users test a preview of the release. That is a good idea, but it does not work for business applications because there is usually only a single instance of the production software running at a time.

Unfortunately, there really is no alternative except to spend more money to test a build. Upper management is not going to fork over the money unless there really is a need to be 99% bug-free on delivery day. This is usually not the case unless you are shrink-wrapping. And let's face it, you're not.

If that is not enough to dissuade you: in addition to extra money, a dedicated QA staff also means extra lag time between the finishing of a build and its delivery (you cannot test a build before it is finished, at least not the kind of QA I am talking about here). The QA staff must be handed the build, and the development team must be handed back a list of bugs, at which point the process repeats. In the meantime, the team has moved on to a new build and is no longer focused on the old one. So builds end up being delivered half-way through an iteration instead of on iteration boundaries.

I have found that if you patch any bugs the users do find (that are important, see my last post) in a reasonable time with a reasonable attitude ("thanks for reporting that", "must have slipped past our two days of testing"), the users will not mind. Instead they will worship the ground you walk on for reducing QA time and giving them more effort to spend on new development. Pause not.

Friday, October 12, 2007

Fixing Bugs is not Free

When wandering the halls, I will often hear comments from users about little bugs (usually display bugs) and I tell them straight up that in all likelihood, the bug will never be fixed. The typical response to this is a gasp, followed by a smug look that means something along the lines of "I could write software better than these amateurs."

I have also told developers who report small bugs that "we'll wait for a user to report that," with similar results. I then have a choice to make. Try to convince them I actually know what I am doing, or leave them thinking I'm a buffoon. Here is the argument I make.

Fixing bugs is just like building new features. It is not free. Each bug users want fixed costs effort (points, in our agile methodology). Bugs are usually much cheaper to fix than new features are to build, but the cost is certainly not insignificant.

If bugs cost effort to fix just like anything else, then they must be estimated and scheduled just like everything else. This is where the key lies. When confronted with the option of refining an existing feature (let alone a bugfix) or the creation of a new feature, stakeholders will almost always opt to implement a new feature (this leads to a kind of functional but rough monolith, but that is another post). This means that bugs, especially ones that don't really hurt anybody, are the least likely items to get scheduled. And so they don't.

I should make a note about critical bugs. If a critical bug (one that prevents an important feature from working and has no workaround) is found, we fix it immediately (forget iterations), but even these are not free. After the fact, we estimate the fix and then push an appropriate number of items from the current iteration to make room for the bugfix, just as if a stakeholder had scheduled it.
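That rebalancing step can be sketched roughly as follows. The item fields and numbers are made up for illustration; the real decision about what to push is of course a judgment call, not a sort:

```python
# A rough sketch of making room for an unscheduled critical fix:
# push the least important items until their points cover the fix.
# Fields and values are illustrative assumptions.

def make_room(iteration, fix_estimate):
    """Return the names of items to push out of the iteration."""
    pushed, freed = [], 0
    # Consider the least important items first (higher number = lower priority).
    for item in sorted(iteration, key=lambda i: -i["priority"]):
        if freed >= fix_estimate:
            break
        pushed.append(item["name"])
        freed += item["estimate"]
    return pushed

iteration = [
    {"name": "feature-a", "estimate": 5, "priority": 1},
    {"name": "feature-b", "estimate": 3, "priority": 2},
    {"name": "cleanup",   "estimate": 2, "priority": 3},
]
assert make_room(iteration, fix_estimate=4) == ["cleanup", "feature-b"]
```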

Surprisingly, systems I have built using this strategy are not as buggy as one would expect, though that probably has more to do with Test-Driven Development than anything else. The point is that if you do things properly, this strategy not only works but works well. We hardly ever schedule bug-fixes at work, and when we do, they are usually almost as large as features.

Once this is explained, I wait for a few weeks and then circle back. The person in question is usually impressed with the features we have delivered in that time and is not concerned about that bug that they don't even notice anymore.

Sunday, September 23, 2007

Multi-Project Agile

We have just inherited a new project with existing codebase and developers. The team has grown and the size of the codebase we need to maintain has grown. We are, however, facing some questions about how to plan for both our primary project and this secondary project, which is being end-of-lifed. We are not talking about two major components of the same application; the two projects have different architectures, languages, stakeholders and developer expertise.

A few preliminary decisions before we begin. We decided to have the iterations for both projects operate on the same schedule to reduce problems with planning and tracking velocity (described below). We also separated the code repositories and project planning software instances.

The big problem is how do we apply agile planning to both of these projects with a single team? From what I can see there are two major ways to tackle the problem. The first is to divide the team in some way (by day, by developer, etc). The second is to ignore that the second project is different, and plan the team's time as a single entity.

There are a number of ways to divide the team. We could devote all of a developer's time to a single project. We could also rotate developers a day or a week at a time. The second would be preferable because then all developers are exposed to both projects. This is essentially creating two teams, whether actual groups of developers or simulated groups of developers (by dividing based on time). We would then have two iteration planning meetings, one for each block of time. The problem with this approach is that the stakeholders cannot control the amount of effort spent on each project. Because there are conceptually two teams with two velocities, they must plan them separately and make concessions separately.

Intuitively, however, I think the second option holds more promise. In this scenario, the extra project works just like a separate component of the existing project. The team has a certain velocity, which the stakeholders can apply to both projects as they feel is appropriate. This means a single iteration planning meeting with a single velocity with the union of all stakeholders to plan the team's iteration. The major problem with this is that it is dirty for developers and planners. Developers have more severe context-switching and planners will probably need tools that can do multi-project planning to track velocity at the team level instead of project level.
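Option 2 in miniature might look like this; the velocity and allocation numbers are purely illustrative:

```python
# A sketch of option 2: one team velocity, allocated across projects
# by the stakeholders at planning time. All numbers are illustrative.

def plan(velocity, allocations):
    """Split a single team velocity by per-project fractions."""
    assert abs(sum(allocations.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return {project: velocity * share for project, share in allocations.items()}

# Stakeholders decide the legacy project gets a quarter of the effort
# this iteration; next iteration they are free to change the split.
budget = plan(velocity=20, allocations={"primary": 0.75, "legacy": 0.25})
assert budget == {"primary": 15.0, "legacy": 5.0}
```

The point is that the split is a planning input the stakeholders control each iteration, not a structural property of the team.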

In the end, we have opted for option 2 because of the flexibility it adds. It will be rough for us internally until we can get things running smoothly. It is a crummy situation all around to have to maintain an inherited system. Planners and developers will hurt, but it is an experience everyone should go through (just like working on that crummy system) to drive home the principles put forth in agile methodologies and simply good programming practices.

Saturday, September 8, 2007

Is Database Version Control Worth It?

In this second post on the subject of automatic database upgrading, I discuss the merits of the system whose implementation is described in the previous post.

I won't be circumspect; it is not cheap to make auto-deployment work. It took one programmer two weeks to implement and we probably spend about one day a month maintaining the system. In the end, it comes down to one fundamental factor: the number of environments to which you deploy. But first, the pros and cons.

The biggest advantage of this system is that deployments can be fully automatic. They can be scheduled to happen at a specific time, and everything will get upgraded. No waiting on the DBA, or ensuring the changes are all correct and in the proper spot.

Similarly, the deployment of the new "database version" becomes as solid and consistent as the deployment of the code. The deployment of the database necessarily becomes part of the testing phase of an iteration. This means that deployments are more often successful because they are better controlled and better tested.

The one big disadvantage is complexity. There is a lot of complexity in maintaining a database version. I am not convinced, however, that this complexity is due to the automatic deployment. Rather, I think that the deployment merely exposes problems that would otherwise be hidden when deployment is done manually.

For example, the system may boot fine without running a particular rollback script, but the deployer will stop the boot because the upgrade will fail because the rollback was not run. This would be hidden in a manual deployment, but exposed during an automatic one.

But by far the biggest determining factor is the number of instances you need to deploy to. Just on the development side of things, we have a unit-testing database, a deployment database (to unit-test the upgrade scripts), the full development database, the trimmed (lean data) development database and a number of personal development database instances. Then there are testing, staging and production databases.

If a developer makes a change to the database, they must publish this change (run it) on each of those databases. If they don't, tests will begin to fail and servers will fail to start as others upgrade to the latest source, which does not yet have the required database support. It is then left to other developers to run the changes once they figure out why their servers will not boot.

With the automatic deployment, none of this is necessary. Upgrading to the latest version of the source will also upgrade any databases being used to the appropriate version.

For us, with only a half-dozen databases, it is worth it. It is worth it twice over. I never have to come in during the pre-dawn to launch a version, and I never have to tell the other guys they need to run change scripts as they appear. My code changes and database changes have been harmonized. They both work in the same way.

Everything just works. And that is the best compliment a user can give to a piece of software.

Friday, August 17, 2007

Database Version Control

This first post will provide an overview of the automatic upgrading and downgrading tool for the database we have built into our application at work. The next will reflect on the advantages and disadvantages of this approach.

I have already posted on the importance of versioning the database. This post describes the next step we took in the same vein: automatically running the versioned SQL change scripts to upgrade the database from one version to the next.

This was not as easy as I would have thought.

Upgrading to a new version of software itself is pretty trivial; all you do is replace the static environment (the code) and reboot. The trickiness is upgrading the dynamic environment (the data). Previously, all of this was done manually by a Database Administrator during the launch.

The implementation of the auto-deployment provides three major functions: upgrade(), rollback() and redeploy(), one of which can be run on system startup to ensure the database and code version match.

Upgrade will take the latest change scripts from source control, run them, and then insert a new row into a version table in the database, along with the script it ran, as well as the rollback scripts that undo the upgrade.

The rollback scripts must be stored at upgrade time because when it comes time to roll back, we need to roll back not what is currently under source control, but what was run when the last upgrade happened. In addition, when you roll back (see the deployer, below) you are running a previous version of the software, which does not have access to the future version's rollback scripts under source control.

Rollback will simply run the rollback scripts stored in the version table for the latest version, then delete the row from the table.

Redeploy will run the rollback, then run the upgrade and re-insert the version row into the version table. This is extremely useful during development when new database changes are coming in over the course of an iteration, and a single upgrade is insufficient.
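A condensed, in-memory sketch of these three operations might look like the following. The class, its methods, and the SQL are illustrative stand-ins, not our actual implementation:

```python
# An in-memory sketch of upgrade/rollback/redeploy against a version
# table. Structures and names are illustrative assumptions.

class VersionedDb:
    def __init__(self):
        self.versions = []  # rows: (version, upgrade_sql, rollback_sql)
        self.log = []       # stands in for actually executing SQL

    def upgrade(self, version, upgrade_sql, rollback_sql):
        """Run the change scripts and record them with their rollbacks.
        The rollback SQL is stored now because at rollback time the
        checked-out source is an older version that does not have it."""
        self.log.append(upgrade_sql)
        self.versions.append((version, upgrade_sql, rollback_sql))

    def rollback(self):
        """Undo the latest upgrade using the stored rollback SQL."""
        version, _, rollback_sql = self.versions.pop()
        self.log.append(rollback_sql)

    def redeploy(self, version, upgrade_sql, rollback_sql):
        """Rollback then re-upgrade; useful while changes are in flight."""
        self.rollback()
        self.upgrade(version, upgrade_sql, rollback_sql)

db = VersionedDb()
db.upgrade(2, "ALTER TABLE t ADD COLUMN c INT", "ALTER TABLE t DROP COLUMN c")
db.redeploy(2, "ALTER TABLE t ADD COLUMN c TEXT", "ALTER TABLE t DROP COLUMN c")
assert db.log[-2:] == ["ALTER TABLE t DROP COLUMN c", "ALTER TABLE t ADD COLUMN c TEXT"]
```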

Which of the three functions gets run on server startup is up to the deployer(). The deployer checks the database version (stored in a table in the database) versus the code version (stored in __version__ for python).

If the code version is not in the version table, an upgrade must be done. If the code version is in the version table, but the current code version differs from the database version, a rollback must be run.

Finally, if the versions match, but the upgrade script stored in the database does not match the ones under source control, a redeploy is performed.
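The deployer's decision logic can be sketched as follows; the function and the version-table layout are illustrative assumptions, not our actual code:

```python
# A sketch of the deployer's startup decision. The version-table
# layout and script representation are illustrative assumptions.

def choose_action(code_version, version_table, scripts_in_source):
    """Pick upgrade, rollback, or redeploy on server startup."""
    recorded = {row["version"]: row["upgrade_sql"] for row in version_table}
    db_version = version_table[-1]["version"] if version_table else None
    if code_version not in recorded:
        return "upgrade"
    if code_version != db_version:
        return "rollback"
    if recorded[code_version] != scripts_in_source:
        return "redeploy"
    return "none"

table = [{"version": 1, "upgrade_sql": "v1.sql"},
         {"version": 2, "upgrade_sql": "v2.sql"}]
assert choose_action(3, table, "v3.sql") == "upgrade"    # code is ahead of the db
assert choose_action(1, table, "v1.sql") == "rollback"   # code is behind the db
assert choose_action(2, table, "v2b.sql") == "redeploy"  # scripts changed in flight
assert choose_action(2, table, "v2.sql") == "none"       # versions and scripts match
```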

The first problem we ran into was how to handle bad change scripts. In these cases, the upgrade would fail half-way through, and the database would not be upgraded, but it would not be in its original state either. We immediately wrapped all three of the operations in transactions to ensure this horribly messy, horribly frequent problem did not re-occur.

One of the major unsolved problems we still have is the problem of bad rollbacks. If a bad rollback is committed, but with a valid upgrade, the rollback script is inserted with the upgrade into the database version table successfully. Then when the rollback is actually tried later, it fails, and there is no way to fix it because the rollback is already in the database. Our workaround is to simply replace the rollback script directly in the database with the latest from source control.

The next post will come to a conclusion about whether all of this is worth it, and how much it really does cost.

Sunday, August 5, 2007


Introducing process into a company is always a slow and difficult undertaking. I have been gradually introducing agile processes into my current company over the last two years. This week, however, I realised that the work will never be complete. In a sentence, there will always be unbelievers. There will always be those people who just don't like process; who simply cannot work with structure. They are usually the same people who do not prepare for meetings, who don't read long emails, and who like the idea that when they say things, others jot those things down and get them done right away. The good news is there are ways to handle these people.

First, convert as many people into believers as possible (whether from agnostics or otherwise). Early in the project, target a specific group or department. Then using the process, show them it works and show them how they can be involved and how that benefits everyone. The more believers you have, the easier it is to convince others of the merits of the process. I have found that these believers are often stronger advocates for structure than even myself. They see the way it used to work and they see how much better things are once they started using the process. They understand in a very concrete way how others' jobs may be improved by the same structure. Many of these believers even begin to take attacks on the process personally, and there is no better advocate than that (not that we would ever discourage criticism of our process).

Second, strictly enforce the process for the unbelievers. Ruthlessly enforce meeting agendas and accept feature requests only at the appropriate time. In other words, force the unbelievers to use the process to get anything done at all. Once you see that they understand and may start using the process (or have admitted to violating it), start relaxing the enforcement. Show them that things work more easily when they at least try to work within the formal structure. Nobody likes a fascist development team, but it is critical that you force the unbelievers to start using the process, because if you don't, they will continue to circumvent it forever.

Finally, relax the process for believers. A methodology exists to enable, not to restrict. There are certain things that should rarely be compromised; releasing only on iteration boundaries is a good example. Yet we routinely release patches if they are important enough. The reason is practical: a patch represents little risk, yet great benefit. In addition, if you are impractical in your enforcement of the process, you may start losing believers. You make exceptions for believers because they know and love the process; it is just that in this case, it broke down for them, or they made a mistake. The point is they are not trying to undermine the structure, and that means they are working with you.

At the end of the day, you are just looking for people who are, for the most part, working with you, working with your process, and helping you to deliver software. For this to happen, you need to deal with the unbelievers by punishing them, rewarding those who change, and keeping your true believers happy.

Saturday, July 21, 2007

Build vs Buy For Core Business Tools

Before entering the fray, I need to mention that the argument I put forth here is tailored to a very specific question that happens to have been evaluated a number of times at the company I work for. It relates to the question of whether to buy or build a solution to automate business processes of the company. Moreover, it assumes a competent development team that is currently available. Finally, I am discussing a build vs buy decision in a small company (a hundred or so employees). While this post was motivated by a specific build vs buy decision, I only lay out arguments that are generally applicable here.

Probably the biggest reason management likes the idea of buying software is that it is a quick fix that is available today; they do not have to wait for the solution to be built in-house. This is, however, not entirely correct. The last time we tried to adopt a pre-built solution, it was six months after the purchase date that the first user began to use the software. This is because even though the software is available right away, it takes time for people (including IT) to learn the new system, adapt processes to accommodate the new system (more on this later) and most importantly to trust the system so they abandon their old process.

Time to adoption may be long, and it is often the case that only a small subsection of the purchased software is really needed by the business (like a ticketing system). In that case, it may be easier on the company to have an agile team release in-house software gradually, allowing the business to adapt as it goes, instead of switching all at once to a pre-made solution. It may even turn out that the time to build with concurrent adoption equals the time to adopt the 3rd-party system.

This leads directly to the question of how much development work will be required in both cases. There is no way anyone can tell me that a pre-built solution will not have to be customized once it is purchased. In fact, a significant amount of customization has been done on every solution we have purchased. Because this software is foreign, that may mean buying customization from the vendor (with all the lack of control that entails) or, if you are lucky, customization by your own development team. In the latter case, the customization is not as cheap as customization or general development of the same complexity on an in-house solution, because the developers did not infuse the foreign software with their philosophy and quality requirements. And again, the customization cost of the new software may not be significantly different from the development cost of the subsection of functionality that is really required by the business in an in-house solution.

A major reason I am a proponent of agile methodologies is that the business I work for changes requirements almost weekly, depending on the department. This can cause major problems with a pre-built solution. It could even mean constant customization of someone else's product. The flexibility of pre-built solutions is definitely questionable. This means that more often than not, the business ends up adapting to the software, and not the other way around. This leads to the long adoption time I mentioned above. This is even more of an issue if the software relates to the core of the business because the usually over-generalised software is telling the company how they should do business (how to handle support calls, how to have new customers apply, how to pay sales agents, how to sell, etc).

There is also the cost of maintenance to consider. At between 10%-20% of the cost of the software per year, this is not insignificant. The same argument about customization given above applies to maintenance as well. Developers will be more efficient maintaining their own system than someone else's, if that is even a possibility. Sometimes, you are dependent on the vendor. Even assuming they are reliable, they may not be very responsive.

Finally, and perhaps most importantly, you may lose your development team by purchasing software. The best developers do not want to do maintenance; they want to do development. If they are maintaining a purchased solution, you better hope it is high quality and built well, in a modern language (did I just cut out 85% of off-the-shelf software?), because if not, you will have a hard time attracting good developers.

For us, it seemed a no-brainer. We would end up customizing the thing anyway, it would still take 6 months before it would be in use and maintenance would still be a problem. Considering that it may take about two months to rebuild the functionality required into our already-built enterprise management system, I cannot understand why anyone would consider buying an off-the-shelf solution. Yet if we had not reminded the executives of these considerations, I might have been working on a filthy perl application, and probably looking for a new job.

Saturday, July 14, 2007

Unit Test Pep Talk

The last two major features we developed at work were built largely without writing unit tests, let alone with a test-driven mentality. In fact, I would occasionally explicitly mention that unit tests should be written for such and such an issue, and it would turn out that the tests were not written. This led, predictably, to a much buggier release than normal, with patches being pushed to production every day for over a week.

Talking this over with my colleague, I felt that this was because while the whole team acknowledges that unit tests are theoretically a good idea, they feel that sometimes it is better to move faster and write fewer tests. So we decided to run down, in our iteration de-briefing, the reasons why our team unit tests using test-driven development. Here is what we said. Some of this will certainly not be novel, but bear in mind that these are the practical benefits we see every day that cause us to require unit tests in the software we write.

First, the obvious. Writing unit tests reduces bugs and speeds development by running your code quickly and checking it against the expected result automatically. I have actually seen members of our team doing this manually: running the software and seeing if the code busted.

Probably the most important thing for me is that writing unit tests allows you to refactor safely. Without a solid suite of unit tests, it is impossible to go into a module with a cleaver and be sure that you are not breaking anything.

Finally, and this was certainly not obvious to our team, the unit tests form a contract for what the code should be doing. If I am wondering what a method's behavior should be (as opposed to what it is), I go and look at the unit tests. They will tell me what is required of that method, and what is not required. I use this all the time for critical system configuration issues. For example, if a value must be 2 on system startup for the system to operate correctly, I add a unit test so that if ever anyone decides to change that value, the unit tests will tell them that is not allowed.
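A contract test in that spirit might look like the following sketch; `startup_config` and the pinned value are hypothetical stand-ins for the real configuration:

```python
import unittest

# A sketch of a unit test that pins down a configuration contract.
# `startup_config` and the pool-size value are hypothetical.

REQUIRED_STARTUP_POOL_SIZE = 2  # the value the running system depends on

def startup_config():
    """Stand-in for whatever builds the real startup configuration."""
    return {"pool_size": 2, "debug": False}

class TestStartupContract(unittest.TestCase):
    def test_pool_size_is_pinned(self):
        # Anyone changing this value gets a failing test saying the
        # system *requires* it, not merely that it happens to be 2.
        self.assertEqual(startup_config()["pool_size"],
                         REQUIRED_STARTUP_POOL_SIZE)
```

Run with `python -m unittest` alongside the rest of the suite; the test's name and comment then document why the value is what it is.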

Moving on to test-driven development, we mentioned two reasons. The first and most important is that practicing test-driven development ensures adequate test-case coverage. By requiring a failing test before writing code, you ensure that every branch of your code is tested for a desired response. Similar to my first point about unit testing in general, this is simply the automation of something we do anyway: come up with the expected outcome of our code and then write the solution until we arrive at the expected outcome.
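A minimal red/green sketch of the idea. The test below is written first; run before `apply_discount` exists, it fails, which proves the test actually exercises that branch (`apply_discount` is a hypothetical example function, not from our codebase):

```python
def test_apply_discount():
    # written before the implementation; fails until the code exists
    assert apply_discount(price=100, percent=10) == 90

def apply_discount(price, percent):
    # minimal implementation written only to make the failing test pass
    return price - price * percent // 100

test_apply_discount()
```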

More subtly, test-driven development will improve the design of your code. Code that is tightly coupled or not cohesive is much harder to test than the alternative. By writing tests first, you envision how you would like to use the library, instead of forcing yourself to use whatever interface you came up with after the fact. Because you want to isolate the code you are testing and minimize the amount of test code you write, test-driven development encourages a modular, re-usable design.

I feel that stressing the practical, immediate benefits of test-driven development is the best way to convince those who may believe, but do not know, that automated testing makes their lives easier. It is so easy to write poor unit tests, and so hard to monitor their quality, that persuasion is clearly preferable to forcing the practice, even if it is official policy.

Friday, June 15, 2007

Velocity Crisis: Quality of Code

Recently, we have had a crisis of velocity at work. Our velocity has more than halved in the last six iterations. This post is not about that crisis; it is about one aspect of the introspection the crisis has called for.

We recently lost two not-so-short-term contractors from a team of seven. Initially, this was blamed for the velocity decrease. However, even with a new hire, we still don't expect our velocity to get much higher than half of what it was with both contractors. I believe one of the major contributors to the current decrease is a previous decrease in the quality of our code. With the contractors, we were in a major push for new functionality, and it is clear that the quality of that code is such that we now have an increasingly difficult time adding new functionality. Not only were we not refactoring enough, but even the virgin code could have used a clean-up.

Which brings me to the subject of this post: sustainable development. Refactoring is the critical element in making sustainable development happen. It does require taking the long view, however: we reduce our velocity today to ensure we can keep up the pace tomorrow. I used to believe in counting refactoring as part of velocity, but am now firmly opposed. The reason is that refactoring cannot be planned like functionality. It must be done at all times. It is not optional, and it is not postponable. Around 90% of the tickets we created for refactorings that we did not have time to complete immediately have never been completed. They are either still waiting to be scheduled, or were completed incidentally as part of a refactoring that was required immediately.

One of the most difficult decisions to make when confronted with new functionality (especially if it has already been committed to the repository) is to reject the change and insist that a refactoring be done to make the change simpler and cleaner. It is clear, however, that this must be done to avoid a serious meltdown in velocity.

I am, of course, confirming what all the agile professionals have said all along. I knew that refactoring was important, but until now, I never considered it critical.

Sunday, June 3, 2007

Velocity Crisis: Intra-Iteration Work Distribution

Recently, we have had a crisis of velocity at work. Our velocity has more than halved in the last six iterations. This post is not about that crisis; it is about one aspect of the introspection the crisis has called for.

Of late, we found that we had been pushing massive numbers of tickets because we overestimated what we could get done. We suspected, however, that it was more than just optimism and a constantly decreasing velocity. We suspected that part of the problem was not recognizing early that we were overcommitted. We found ourselves too often in situations where work needed to be pushed to the next iteration, but our options were limited because we were at the end of the iteration.

We began looking at the points completed per day of our iteration to get a handle on the lack of visibility into iteration progress. We found that a graph of our points completed per day looked like a ramp. This, of course, explained why we were pushing tickets so late in the iteration; most of the work was being completed at the end of the iteration. It was normal to be only 20% complete when we were more than halfway through an iteration, so there was no reason to worry. I do not know what the ideal curve would look like, but probably something like a bell. Certainly something more constant than a ramp.

Beyond making a conscious effort to complete tickets earlier in the iteration (or raise warnings to our colleagues if we are not), we have actually reorganized our iteration. We have increased the time for testing, which gives us an extra day of buffer room at the end of each iteration. We have moved stakeholder meetings to the end of the iteration. This way, towards the end of the current iteration the seniors are not completing tickets, but planning for the next one. Not only does this mean tickets cannot pile up for completion at the very end of the iteration, but it also means that as soon as the build is approved, developers (and critically, senior developers) can focus and start work on the next iteration.

This seems to be working. Our velocity has begun to increase again (thanks to the new hire, and more senior involvement in direct coding), and the curve is flattening out. It is now much easier to predict whether we will complete what we say we will, usually by the end of the first week of our two week iterations.

Thursday, May 31, 2007

Avoiding Storing Common One-To-Many Relationships

Occasionally, one comes across common classes of objects that you would like to be able to link to a number of different types of domain objects through a one-to-many relationship. For example, in a Customer Relationship Manager tool, you may want to have comments on many different types of entities (customers, applications, disputes, etc). Same thing for tickets, attachments and other common structures.

Linking up these objects in the domain model is trivial: a list here, an ArrayList there. The real pain comes when you want to persist these relationships. Conventional wisdom holds that you would create a link table for each of the entities you want to link to your common entity. The problem is that this is tedious and time-consuming because not only do you have to create the tables, you also then need to modify your loading and saving procedures for the domain object. Because this is a one-to-many relationship, you will be getting multiple rows back per object, with all the muck that brings.

An alternative approach, which we have begun to use at work, is to instead store the link to the domain object in the other table (the "one" side of the relationship). This way, there are no new tables to create. Of course, the immediate problem is that you cannot have a foreign key from one table to many tables. But imagine if instead of storing a database foreign key in the column, you instead store a "programming foreign key". In other words, instead of a database link, store a programming link. In Python, this would be the `repr` of an object. In this context, we could store a call to a finder function in this column and evaluate it on the way out.
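A sketch of the idea in Python; all the names here (`Customer`, `find_customer`, `Comment`) are illustrative stand-ins, and the in-memory dict plays the role of the customer extent:

```python
_customers = {}  # stand-in for the customer extent / repository

class Customer:
    def __init__(self, id):
        self.id = id
        _customers[id] = self

    def __repr__(self):
        # repr doubles as the stored "programming foreign key"
        return 'find_customer(%d)' % self.id

def find_customer(id):
    return _customers[id]

class Comment:
    def __init__(self, text, owner):
        self.text = text
        self.owner_key = repr(owner)  # this string goes in the column

    def owner(self):
        # "evaluate it on the way out"
        return eval(self.owner_key, {'find_customer': find_customer})

c = Customer(42)
note = Comment('called about billing', c)
assert note.owner() is c
```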

Storing a finder call in this way allows the common object to link to any number of different types of domain object. The domain objects just need to iterate over the collection of common objects to find those that link to themselves. When you add a new domain object, all you need to do is add a method that searches this extent. No database modification, no saving or loading code to manipulate.

Using some other tricks from Python (though this is generally applicable), you could also use the `str` of the object to have a generic display of the common object's link to the domain object. If the protocol you are using (like Python's `repr`) is generic enough, you could literally store anything in the column. If we take the comments example, the comment could have been left by a `User`, a string username or a list of `User`s.

There are downsides to this approach, of course. The biggest one is that ad-hoc queries on those common objects become more difficult. The good news is those common objects are not normally the subject of ad-hoc queries because they are not interesting domain objects. Additionally, if your system has the facility for ad-hoc queries in a programming language (no, not SQL), then this is a non-issue (depending on who is doing the querying). Secondly, you cannot enforce referential integrity in such a system at the database level. Again, though, I have not run into referential integrity issues on these kinds of common objects. Integrity is usually important for pieces of the domain that must fit together properly or system failure may occur. Finally, performance may also be impacted depending on how intense the finder function is, though these should be fast if you have designed your system well.

In the end, it is a trade-off between ease of addition of new domain objects and ease of querying and data integrity. For me, the database is just a storage system, so I will take advantage of the programming language I am working in, even if I lose some functionality at the data level.

Saturday, May 19, 2007

The Other Source Control

Writing software without source control these days is insanity. Having worked in professional environments both with and without source control, I know the pain of not having it, and the ease with which it can be introduced. Now, I never write anything that is not under source control, even if I am working alone. It is so free and the benefits so clear that I don't even question it anymore.

I think that almost everybody (except maybe Linus Torvalds) agrees that source control is critical to software development. For all its other shortcomings, the Joel Test has source control right at the top of the list.

What strikes me, however, is how little concern is normally given to source control of the database. Especially if one is not using an In-Memory Model, your database is critical to the correct functioning of the code. Yet I rarely hear about shops that control change to the database in the same way they control changes to the code.

I'm not talking about changes to the data housed by the database (analogous to instance changes in the running code), but changes to the structure of the database. I can remember countless times when, all of a sudden, our application would stop working. It would turn out that someone had applied a database change directly on production. Other cases include asking "what changed in the database between version 1.1 and 1.2 of the system", "what do we need to do to the database when we release this version" and "what do I need to do to run a previous version of the software"? These questions are difficult to answer accurately without all change to the database being controlled just like all other changes to the code.

At work, we have spent almost a year working to bring the database under solid change control. We are finally at the point where the code is able to detect the version of the database it is running on, and is able to either upgrade it or roll it back to a previous version, depending on what version of the code is currently running. The only changes that are applied are those that have been checked in to the code repository as being required to run on the next release of the software.
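A minimal sketch of version-aware schema migration, assuming a one-row version table and hand-written upgrade SQL; the table name, migrations, and use of SQLite are illustrative, not our actual system:

```python
import sqlite3

MIGRATIONS = {
    # version: (upgrade SQL, downgrade SQL)
    1: ('CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)',
        'DROP TABLE customer'),
    2: ('ALTER TABLE customer ADD COLUMN email TEXT',
        None),  # some changes cannot be rolled back automatically
}

def get_version(conn):
    conn.execute('CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)')
    row = conn.execute('SELECT version FROM schema_version').fetchone()
    return row[0] if row else 0

def set_version(conn, version):
    conn.execute('DELETE FROM schema_version')
    conn.execute('INSERT INTO schema_version VALUES (?)', (version,))

def upgrade(conn, target):
    # apply each checked-in change, in order, up to the target version
    for v in range(get_version(conn) + 1, target + 1):
        conn.execute(MIGRATIONS[v][0])
        set_version(conn, v)

conn = sqlite3.connect(':memory:')
upgrade(conn, 2)
assert get_version(conn) == 2
```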

We have found that controlling change in this way provides all the same benefits source control for our code provides. And in the end, doesn't this make sense? Doesn't it defy logic that part of our code should be controlled, while part of it (db schema) should not be? Now our database acts just like our code: changes are logged historically, changes are fully tested, changes are applied once per release on production, and the only changes made are those that are fully automated.

Like code source control, I will never look back.

Saturday, May 5, 2007

Love Your self In Python

When I was using Java, I always insisted on making the this reference explicit when referring to an attribute or method within the same class. I think it always makes the code clearer because there is never any doubt about where an identifier is coming from: either it is a temporary or an argument, or there will be a this in front of it, in which case it is an attribute or method.

The habit of placing this in Java code even where it was not required certainly eased my transition to Python. In Python, self is the explicit first parameter to every method, and must be used whenever accessing methods or attributes of the object. I have often heard people complain about this, but it is one of the things I love about Python.

In addition to the clarity described above, the explicit self as the first parameter to all methods unifies functions and methods in a way that is not done in other languages. Once you start thinking about methods as functions that take an object as their first argument, functions and methods become very similar. This encourages a dual interface that is used often in Python. This dual interface is one where you can use either a function or an object's method to do the same thing (like Python's re module).

For example, I find myself often writing code like this:

def get_table_sql(name, db):
    return 'CREATE TABLE %s (%s)' % (name, get_column_sql(name))

class Table(DbObject):
    def get_sql(self):
        return get_table_sql(self.name, self.db)

Now, nothing prevents you from doing this in other languages, but I find that the explicit self goes a long way towards encouraging programmers to think about functions and methods in similar ways. And at the end of the day, your design is based more upon your frame of mind than the technological limitations of the language you are using.

Sunday, April 29, 2007

Open Sourcing Software Incidental to a Business

Maybe my search technique is slipping, or maybe search engines are losing the battle against clutter, but in researching this post, I was hard-pressed to find any discussions that directly weigh in on how to release as open source incidental tools that were built on company time while working to build the software the business needs. Here are my own thoughts on this process.

A while ago, we pitched to our non-technical boss the idea of releasing some tools we had built in-house in the process of building the software that supports the business. We are not a software development house; we build software to support the business in what they really sell: credit card processing. This software would never be a candidate for open source. It is customized for the company, highly specialized to our partners and is one of our value-added services that enable the business to sell their product. Therefore, even though we would never sell the software (we are not in that business), the business does want to deny its use to competitors, a sentiment I support.

There are, however, components of our software to which the preceding statement does not apply. The WSGI server we built from scratch (because we did not like the other Python web frameworks out there) and the mocking and other testing tools we built that did not already exist for Python are good examples. Our argument for releasing these tools was a fairly standard one: higher quality from more users testing the software, reduced maintenance effort due to contributed patches, and new functionality built by external developers (never mind giving back to the community). Unsurprisingly, the response was a decidedly chilly "let's continue discussing it".

I think part of the reason for the less-than-enthusiastic response is that it is difficult to explain to a non-technical manager the difference between general, re-usable components of a system and the core system itself. In such a scenario, all the manager sees is risk, and they are unable to easily see the benefit. I don't think this can ever be easily overcome. This is why I believe the decision to open source software not directly related to the core business domain must be the decision of the development team. Nobody else is qualified to weigh the pros and cons.

Which brings me to the main focus of this post. The primary and direct beneficiaries of open source software incidental to a business are software developers. This is especially true in this case, where we are talking about programming tools and frameworks, but holds true even for things like web servers and accounting software (because we do not have to re-invent the wheel). If we do not advocate and push to have the software open sourced, nobody else will because nobody else is able to see the cost of having to re-implement these things at each company.

Fundamentally, we all have an extremely selfish motive for open sourcing these tools. Most programmers are lazy and we don't want to do the same thing over and over again. The tools I build at work are useful to me in my personal projects, and I want to use them. But if I can't get authorization to release the software, what can we do about this? I have a second-best solution.

Even if we all crassly sell away our copyrights as software developers 8 hours a day for a salary, the company does not own the idea of what we have done (at least not yet). We can re-build the tools on our own time so that we can make the decision to open source them. This is what my colleague Iain Lowe dubs a "clean room implementation". There may even be benefits to re-implementation. The idea has been proven, the tool is clearly useful, and lessons have been learned. Note that you cannot do this in all cases. It depends on how close to the business' domain the component is and your employment contract with the company.

So in a worst-case scenario, we will do the same thing twice, but never thrice. Personally, I prefer this technique because it yields a higher quality piece of software. But more importantly, in our open source gift economy, I want to be recognized for the work I have done beyond the shallow salary I earn. Isn't that why we all have project lists prominently displayed on our home pages? Here's to fattening them up.

Monday, April 23, 2007

Pile of Scripts vs Application Server

When I first started doing some work with Java servlets, what struck me immediately was that the entire web application has a shared state that is wholly contained within itself. This differs from say, a pile of PHP scripts, which do not share any state except what they store in the database (ignoring sessions). More recently, I moved into Python where both were possibilities, but we chose to have a single Python process handle all incoming web requests, yielding an application server instead of a pile of scripts.

This was probably the most fundamental complaint I had with many scripting languages, but to my surprise, I could not find a single online discussion of this issue. So to break the ice, I believe there are four reasons why an "In-Memory" model makes things easier for developers: lower impedance mismatch, native messaging technique, DB as secondary system, and theoretical performance.

Assuming you are using a Domain Model, it is much more natural to assume "Identity Integrity" in your application; by which I mean that if we have an object that represents a specific user in our application, it is much easier to understand the system as dealing with the same single object instance instead of dealing with multiple copies that get synchronized with each other through the database on each request.

If you don't have any shared memory native to your application that persists between requests, then fundamentally, you are sending messages through the next best thing: the database. It seems that it is much more efficient and powerful to send messages using the native framework without requiring that message to pass through the database, with all the restrictions that may imply. While rather abstract, this difference may have more of an impact than you think as I describe next.

In an application server with a single state kept resident in the running instance itself, the database takes on very little significance. It literally becomes a secondary system that you use only to ensure that the next time the system comes up, it will be in roughly the state it was in when it went down. In other words, it is used for persistence only. This is important because it alters your perspective, which may alter the way you design your system.

Last, and probably least (as this is by no means the only factor, or even the most important factor) is the fact that if you are hitting the database to retrieve state on each request, you are doing way more work (and so is your database) than you need to. Who knows how much faster your system would run if instead of reading from a DB each time a page loads, it just looks up a username in an array in memory?
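A toy illustration of the in-memory model described above: the application holds its state (here, users keyed by username) in an ordinary dict that lives for the life of the process, and the database would be touched only to persist changes. All names are hypothetical:

```python
class User:
    def __init__(self, username):
        self.username = username

class Application:
    def __init__(self):
        self.users = {}  # shared state resident across requests

    def add_user(self, username):
        user = User(username)
        self.users[username] = user
        # persist(user)  # the DB write would happen here, and only here
        return user

    def handle_request(self, username):
        # no DB round-trip: just a dictionary lookup in memory
        return self.users[username]

app = Application()
alice = app.add_user('alice')
# every request sees the same single instance ("identity integrity")
assert app.handle_request('alice') is alice
```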

I guess the fundamental difference between these two is that with a pile of scripts, the DB is used to maintain application state as well as persist that state, whereas with a single-process, in-memory application, the DB is only used to persist changes to the application state which is just "there". I think you would be surprised at how liberating that concept is.