We’re probably all familiar with the term “technical debt”, meaning the cost of doing things in a non-optimal or non-quality way. While I can go on at length (as my colleagues can attest!) about how this is avoidable by baking in quality, and thus saving time and money at every turn, the fact is that many existing projects have considerable technical debt.
Setting aside for the moment the discussion of telling “good” technical debt from “bad” technical debt, let’s just focus on a project’s “bad” technical debt.
If we describe this kind of debt as factors that slow down the ability to change and improve the system, then we see that we are paying the “interest” on this debt every time we touch the codebase or its deployed instances.
What I’d like to focus on in this posting is the point at which this technical debt is evaluated to be sufficient that it makes sense to do what we’d normally call a “re-write” of the offending system or subsystem. Basically, when the -ah- mud gets so deep that the hip-waders aren’t helping, it might be time to throw in the towel and start a do-over.
I call this point “technical debt bankruptcy”. Much like a real bankruptcy, it’s an admission that chipping away at the debt isn’t going to work and isn’t worth it – that it’s time to re-group, and in a chapter-11-ish way, fold our tent in a responsible fashion.
Of course, determining if you’re at this point is critical. If you’re not, then you might be throwing out the baby with the bathwater and losing valuable work and former effort for reasons that are not sufficient. Often, political reasons can get us into that kind of situation, where the pressure to do a re-write is not justified. If there are no or few changes to a system, and it’s working sufficiently well as is, then there may be no reason to declare bankruptcy.
If, however, the bill collectors of technical debt are knocking down the virtual door, it’s important to know when to make the right move. As the song says, “know when to fold ’em”.
Part of knowing this is being able to measure the pace and cost of change, and to estimate (or better, measure) the cost of change if a re-write were done. Let’s say you’ve got a legacy project and a new project. The legacy project is using some old technology and techniques that are painful to work with, and you know they’re causing you to burn more time than you should. If you also have a newer project (maybe something nice and greenfield) that’s being done with the latest new and shiny tools, Agile techniques, and so forth, you can get a rough idea of what each feature point in the new project is costing you. Now you can compare this to the cost of a feature point in the old project.
If you can look at your backlog of epics and get an idea of what future maintenance on the legacy project is going to cost you, then you can re-estimate that same work using the ruler of the new technologies and techniques. The delta is essentially the amount you’ve got available to “spend” on a re-write.
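To make the arithmetic concrete, here is a small worked example. The numbers are entirely made up for illustration, and the symbols are my own shorthand rather than any standard notation:

```latex
% Hypothetical numbers, for illustration only.
% N      = feature points remaining in the legacy backlog
% c_old  = measured cost per feature point on the legacy project
% c_new  = measured cost per feature point on the new project
% B      = budget available to "spend" on the re-write
\begin{align*}
B &= N \,(c_{\text{old}} - c_{\text{new}}) \\
  &= 400 \times (3.0 - 1.5)\ \text{developer-days} \\
  &= 600\ \text{developer-days}
\end{align*}
```

In this sketch, a re-write estimated at well under 600 developer-days would pay for itself over the life of the backlog, while one estimated at more than that would not.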
If the math is right, then declare your technical debt bankruptcy and begin anew!
In some further posts, I’ll explore how to do a well-organized and structured “chapter 11” on a project, rather than just dropping the ball, including the part that functional tests play in this process, and look at a re-write as a form of highly aggressive refactoring, rather than a whole new project.
Apache Maven is a lot more than a “build tool”, and one of its major strengths is its ability to manage dependencies.
Maven’s not just for external dependency management, though – it can help us work faster and more easily with our own modules as well as those written by others. In fact, its “internal” dependency management is actually far more powerful for most development shops.
Every dependency Maven manages is identified with three pieces of information – its group id, its artifact id, and its version. The group id is often derived from the owning company’s domain name, e.g. com.point2.somemodule, and the artifact id identifies the specific module within that group, such as rest-api.
Possibly the most interesting part is the version number, though, as this is where the real power of Maven comes to the fore. Versions allow us to maximize the opportunity for parallel development without descending into unversioned chaos. Each version represents a specific point in time in a library’s development – and, most importantly, allows us to “re-assemble” our application to a known state at any time (not re-build it).
Let’s take a for-instance to see how this might work…
Component-Based Application “Assembly”
For example, let’s say I’ve got a few teams working on different modules for my new application; let’s call them “persistence”, “rest-api” and the user-interface, “ui”. Each of these modules depends on a set of common utility classes in “util”.
We can represent this through a set of triples like so:
rest-api depends-on persistence
rest-api depends-on util (directly, and not only on the transitive dependency through persistence)
ui depends-on rest-api
persistence depends-on util
The unseen aspect here is the versioning. If we include versions in our triples, we see the picture is a bit more sophisticated:
rest-api-1.0 depends-on persistence-3.1
rest-api-1.0 depends-on util-1.1
ui-1.0 depends-on rest-api-1.0
persistence-3.1 depends-on util-1.0
Now we have a fully defined dependency graph that we can assemble into an application, say app-1.0. At any time, if we want a copy of the app in 1.0 state, we re-construct it from these deployed modules, with no need to build any source code, and we’ve got the exact same app, every time.
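As a rough sketch, the “rest-api” edges of that graph might be declared in its POM something like this. The group id is borrowed from the earlier example and is only an assumption about how these modules would really be named:

```xml
<!-- Hypothetical POM for rest-api-1.0; the group id is an assumption. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- The three coordinates that identify this module -->
  <groupId>com.point2.somemodule</groupId>
  <artifactId>rest-api</artifactId>
  <version>1.0</version>

  <dependencies>
    <!-- rest-api-1.0 depends-on persistence-3.1 -->
    <dependency>
      <groupId>com.point2.somemodule</groupId>
      <artifactId>persistence</artifactId>
      <version>3.1</version>
    </dependency>
    <!-- rest-api-1.0 depends-on util-1.1 (declared directly) -->
    <dependency>
      <groupId>com.point2.somemodule</groupId>
      <artifactId>util</artifactId>
      <version>1.1</version>
    </dependency>
  </dependencies>
</project>
```

Because util is declared directly here, Maven’s “nearest wins” mediation should resolve util to 1.1 for rest-api, even though persistence-3.1 itself asks for util-1.0.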
Get it in motion…
Now let’s look at this in a dynamic development environment, where we’re trying to maximize sustainable velocity:
Although there are dependencies between each module, we don’t want to hold up one team by forcing them to build the other teams’ modules unnecessarily. We also want each team to choose whether to work with the very latest version of the other modules, or to work against a fixed and stable version for a time instead.
The “ui” team, for example, might be refactoring JavaScript code that’s relying on version 1.0.3 of “rest-api”, while “rest-api” in turn is already working on 1.0.4 – and it uses 1.1.0 of “persistence”… it can get tangled in a hurry without a way to manage it, and we don’t want to be artificially discouraged from writing modular code just because it’s hard to keep version numbers straight.
Enter Maven again. Instead of forcing everyone to just always work with the latest version of every other module (which can bring productivity to a screeching halt in some situations), we allow each team to decide what dependency they will include in their POM (Project Object Model) file.
What if I want the very latest version of “persistence” while I work on “rest-api”, with changes checked in by other developers while I’m still working? This is where the SNAPSHOT version comes into play. Instead of declaring a dependency on 1.1.0, I declare a dependency on 1.1.1-SNAPSHOT. This represents the latest “edge” code for the referenced dependency.
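In the “rest-api” POM, that switch is a one-line change to the dependency’s version. A minimal sketch, where the group id is again just an assumption carried over from the earlier example:

```xml
<!-- Fragment of the rest-api POM; group id is hypothetical. -->
<dependency>
  <groupId>com.point2.somemodule</groupId>
  <artifactId>persistence</artifactId>
  <!-- was 1.1.0; now tracks the in-progress build leading up to 1.1.1 -->
  <version>1.1.1-SNAPSHOT</version>
</dependency>
```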
Now we have a graph that looks like this:
app-1.0-SNAPSHOT depends-on ui-1.1-SNAPSHOT
rest-api-1.1-SNAPSHOT depends-on persistence-3.1-SNAPSHOT
rest-api-1.1-SNAPSHOT depends-on util-1.1
ui-1.1-SNAPSHOT depends-on rest-api-1.1-SNAPSHOT
persistence-3.1-SNAPSHOT depends-on util-1.1
As you can see, we have a mix of stable versioned modules (util in this case), and “on the fly” versions. Yet at the same time we’re assured that major changes that break backwards compatibility will not be seen, as we indicate such changes with a change in our major version number (e.g. 1.X to 2.0).
Then we can set up a CI job (say on TeamCity, Bamboo, or whatever your CI system of choice is) to automatically build and deploy our SNAPSHOT version of “persistence” to our company-local Maven repository (within our company firewall). The SNAPSHOT version actually turns into a date/timestamped version when it’s deployed to Nexus, and Maven is clever enough to fetch the most recent of these SNAPSHOTs for us whenever it checks for updates (by default once a day; running the build with -U forces a check). The “persistence” team checks in some code, CI builds it and deploys the resulting SNAPSHOT jar to our repository, and we get it automagically the next time our build picks up the update, even though we’re working on rest-api, not persistence.
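The plumbing for this could look roughly like the two fragments below. The repository id and the nexus.example.com URL are placeholders I’ve made up; substitute whatever your own repository server uses:

```xml
<!-- In the "persistence" POM: where `mvn deploy` publishes SNAPSHOT builds.
     The id and URL are hypothetical placeholders. -->
<distributionManagement>
  <snapshotRepository>
    <id>company-snapshots</id>
    <url>https://nexus.example.com/repository/maven-snapshots/</url>
  </snapshotRepository>
</distributionManagement>

<!-- In the "rest-api" POM: where SNAPSHOT dependencies are resolved from. -->
<repositories>
  <repository>
    <id>company-snapshots</id>
    <url>https://nexus.example.com/repository/maven-snapshots/</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
      <!-- check for a newer SNAPSHOT on every build instead of once a day -->
      <updatePolicy>always</updatePolicy>
    </snapshots>
  </repository>
</repositories>
```

Setting the update policy to always trades a little build time for freshness; teams that prefer a steadier target can leave it at the default.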
When we’re ready to “stabilize” our dependencies, we simply switch from the SNAPSHOT to a specific version. Maven has a pre-defined “release” process that guarantees, among other things, that every released version has no remaining SNAPSHOT dependencies, is tagged in version control, and is verified via all its tests. More than a build tool indeed…
We could of course just put all the modules we’re going to depend on in an aggregator POM, and build everything every time we make a change, but this is hardly efficient, and limits our development velocity unnecessarily (and of course we might not all be in the same source tree, or even the same version-control repository). We want to be building smaller pieces, not bigger ones.
A critical part of this process is our company-local Maven repository – here I mean not just the developer-local repository on each developer’s own workstation, but a product like Nexus that holds a company-wide copy of all required jars for a build. By doing this, we can guarantee a consistent copy of all our required dependencies without having to depend on the availability of outside repositories, such as ibiblio. In fact, it’s not a bad idea to permit access *only* to the local repository when building releases, which ensures this policy is not violated accidentally – while at the same time keeping the “external” Maven repos available to developers for experimenting and prototyping. Once something gets used in production code, however, it gets stored in the “inside the firewall” Nexus repo (and backed up from there). This avoids the bad practice of checking jar files into source control (it’s called “source” control for a reason).
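One way to enforce the “local repository only” rule on a release build machine is a mirror entry in that machine’s settings.xml. This is a sketch only; the mirror id and URL are invented placeholders:

```xml
<!-- settings.xml on the release build machine.
     The id and URL are hypothetical placeholders. -->
<settings>
  <mirrors>
    <mirror>
      <id>company-nexus</id>
      <!-- "*" routes requests for every repository through this mirror -->
      <mirrorOf>*</mirrorOf>
      <url>https://nexus.example.com/repository/maven-public/</url>
    </mirror>
  </mirrors>
</settings>
```

Developer workstations can simply leave this entry out, which keeps the outside repositories reachable for experimenting.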
Testing, Testing…
To add a new aspect to the problem, let’s say that it’s not only production code we depend on, but helper classes for tests as well. If it’s difficult to set up a fixture for a certain kind of test, that might be a code smell in and of itself, but that’s also another story. If we have some test helpers that reside in our dependent modules, we won’t be able to see those helpers in our tests in another module, as we’re only depending on that module’s production code, not test code.
We can easily tell Maven to also bundle up the test code from a certain module, however, and make it available to us in a jar file, like so:
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <goal>test-jar</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
    ...
  </plugins>
</build>
Now when we build, we’ll get a test jar as well as our regular jar, which we can depend on like so:
<dependency>
  <groupId>com.point2.core</groupId>
  <artifactId>somemodule</artifactId>
  <version>1.0</version>
  <scope>test</scope>
  <classifier>tests</classifier>
</dependency>
Now our test classes in the module declaring the above dependency can see the test helpers in the somemodule module – but we’re still not including test code in our production jar.
Again, I have to emphasize that this level of coupling might indicate a deeper issue, but if you do need to do this, it’s good to know how.
Maven also includes facilities to analyze and clean up a complex dependency tree, remove unnecessary dependencies, and keep the whole project manageable.
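For a quick look, running mvn dependency:tree prints the resolved graph. To keep the tree clean on an ongoing basis, the maven-dependency-plugin’s analysis can be wired into the build, roughly as sketched below. The execution id is arbitrary, and whether you want warnings to fail the build is a judgment call:

```xml
<!-- Sketch: flag dependencies that are declared but unused, or used but
     undeclared. The execution id is an arbitrary label. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>analyze-dependencies</id>
      <phase>verify</phase>
      <goals>
        <goal>analyze-only</goal>
      </goals>
      <configuration>
        <!-- turn dependency warnings into a broken build -->
        <failOnWarning>true</failOnWarning>
      </configuration>
    </execution>
  </executions>
</plugin>
```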
In summary, Maven can handle extremely complex dependency management for us in a fully declarative and versioned manner, allowing us at any moment to see exactly what our project depends on, both in production and test code. In conjunction with a CI system (like TeamCity) and repository server (like Nexus), we can automate the deployment of intermediary and full-release versions to the point where we save significant time, and never build code that we’re not actually working on, allowing us to concentrate on the task at hand and leaving the heavy lifting to Maven.
This allows us to only ever build the code we’re actually changing – never code that’s already available in another library, reducing our developer cycle time significantly. It also means we’re spending more time “assembling” software from re-usable components than re-compiling (and probably re-testing) code that’s already verified and available in object form.
Maven: not just for breakfast anymore.
Why do we want to go fast in the first place? Well, if we’re not accumulating technical debt, our quality is still within the bounds we’ve set for ourselves, and we’re satisfying user stories, then we want to be able to accomplish as much as we sustainably can each sprint. This way we can deliver value faster, and people tend to like to pay for that kind of thing.
Well, first, we need a road: the basics of an agile environment need to exist before we can go very fast at all. If we’re working on the build system every sprint and trying to get the basics of a story understood, we’re still in road construction, and we shouldn’t expect much in the way of speed until we get some of this basic asphalt laid down and smooth. These include such basics as a good development setup for our team, a basic understanding of agile principles and a grasp of the technology stack we’re using, and the support of a continuous integration server, to mention a few. If we’re attempting to bounce along over the potholes without setting up a proper environment for rapid delivery of software value, we’ll reap what we sow.
Assuming we’ve built the road, then, what things tend to hold us back? Just like on a real road, the only things stopping us from going faster and faster (to the mechanical limit of our vehicle, or in our case, our keyboards and brains) are either externally-imposed limitations (e.g. a speed limit and cops to enforce it), or our own ability to control the pace without going off the road. In software construction, as in life, we can go off the road in many varied ways, but they all tend to be spectacular, destructive, and painful. Unlike the real road, we can be cruising along for some time before we discover we’ve left the asphalt behind and are sailing over a cliff.
We’ll start by assuming that our corporate environment has eliminated externally imposed speed limits and political roadblocks – not always a safe assumption, but let’s assume for the moment that we’re among the lucky developers who work in such a situation.
Our top-level speedometer, to overuse our analogy a bit, is our velocity, measured in features per iteration, or complexity points per iteration – in other words, how much business value are we adding per time period?
The most common way to go off the road is for quality to slip. This can be detected in one of a number of ways, including an ever-increasing defect rate. If most of your sprint is taken up by fixing defects or paying off code debt, then you’re probably trying to go too fast (or you’ve not finished laying the road after all). Of course, it’s possible you’ve just got a few bad drivers on your team, but we’ll assume that’s easier to see (if not necessarily easier to fix). What’s worse than seeing quality slip? Not seeing quality slip, even though it is. We can’t measure quality directly, per se, but we sure can measure a lot of other things. Once we know what the normal position of each gauge is (e.g. once we establish reasonable code standards that we can measure), then we can watch them to get early warning of things going awry.
Just like on the real road, we need two categories of things, it seems, to help us go as fast as safely possible: I’ll call them headlights and guardrails.
Headlights
The most basic tools here are user stories, acceptance criteria/tests (ideally executable ones), and metrics such as defect rate and velocity measurements. None of these are trivial or straightforward, and it’s easy to think you’ve got a good view and suddenly discover you’ve been accumulating code debt without realizing it. The only proper reaction at that point is to slow down and correct the problem, as we’ll discuss below.
A good business understanding of the goals and epics behind our user stories gives us more range to see further ahead, and going fast requires looking further ahead, while at the same time paying attention to where you are at the moment.
Just like when driving we must be aware of the road immediately ahead, our user stories give us the close-focus we need to be doing the immediately useful thing. We can’t discard these in favor of looking further ahead exclusively, or we’ll never get to where we want, but we can combine that with an awareness of both the near future and an understanding of the overall destination to make better decisions in our day-to-day work.
If we concentrate exclusively on the user stories in hand for each iteration we can find we’ve lost sight of the forest, and may have a hard time fitting together features that should blend into an overall product. If we concentrate only on the distant horizon and not on the user story we’re working on we’ll never get anything done. The proper balance lets us go fast.
We don’t want to be like the driver in the joke with the punch line that ends “we’re lost… but we’re making bloody good time”!
Looking a bit further ahead also allows us to anticipate curves and obstacles in the road, and be ready to handle them when they arrive. If we know, for instance, from our long-range planning that we intend to scale our application to thousands of users, we might make different decisions than if we’re aware that a single user on a desktop box is the intended audience – even though neither of these factors is really represented directly by each user story we work on.
Executable acceptance tests from a tool like GreenPepper, FitNesse, RSpec, or the like can be valuable headlights, freeing developer time from repetitive manual verification and allowing BA/Customer Proxies to have control over the acceptance process – again freeing up developers to develop, and maximizing team velocity. As was mentioned in a recent stand-up meeting: if you’ve manually tested once, you’ve probably already spent more time than it takes to set up an automated test to do the same thing repeatedly, not to mention you’ve probably enjoyed it a lot less.
Guardrails
There’s a big difference between guardrails and a stone wall built across the road ahead, however – it’s not hard to let a testing tool or technique turn into a straitjacket, with tons of brittle and hard-to-maintain tests that don’t help us at all. We need the right tool for the right job, used the right way.
If we have guardrails ensuring the basics of our code quality, we can go faster with the confidence that when we look back at the end of each sprint we will not have accumulated more code debt that needs to be paid back later. For example: if we establish a test coverage metric that ensures we have a breaking build if our code coverage goes below a certain minimum level (I propose this always be 100%, but that’s another post), we can move forward with the assurance that there’s no code that’s being left untested, so we won’t find ourselves in the distinctly non-TDD-like position of having to go back and write tests for existing code, burning time that should be able to be used for the next story.
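As a rough sketch of that guardrail in a Maven build, here is one way to break the build when coverage drops below the minimum. I’m assuming JaCoCo as the coverage tool purely for illustration (any tool that can fail the build would do), and the version shown is just one I know exists, so pick whichever is current:

```xml
<!-- Sketch: fail the build when line coverage falls below 100%.
     JaCoCo is an assumed tool choice; the version is illustrative. -->
<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <version>0.8.11</version>
  <executions>
    <!-- instrument the tests so coverage data gets recorded -->
    <execution>
      <id>prepare-agent</id>
      <goals>
        <goal>prepare-agent</goal>
      </goals>
    </execution>
    <!-- compare recorded coverage against the minimum -->
    <execution>
      <id>check-coverage</id>
      <phase>verify</phase>
      <goals>
        <goal>check</goal>
      </goals>
      <configuration>
        <rules>
          <rule>
            <element>BUNDLE</element>
            <limits>
              <limit>
                <counter>LINE</counter>
                <value>COVEREDRATIO</value>
                <minimum>1.00</minimum>
              </limit>
            </limits>
          </rule>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Lowering the minimum is a one-line change, which is exactly why it’s worth agreeing on the number as a team rather than letting it drift.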
We can also refactor with better confidence if we know for a fact there are tests watching over our shoulders, ready to break should our refactor not be true. Refactoring code that is, at least in part, untested should always be an unacceptable risk.
If we have Checkstyle, PMD, FindBugs, or other static analysis tools checking that our cyclomatic complexity is within bounds, that our class size and line length are readable, and that other critical maintainability and coding-standards factors are met, we can plunge forward without the fear of a huge cleanup being required just to make the code understandable down the road a ways.
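A minimal Checkstyle configuration covering those three checks might look like the sketch below. The thresholds are numbers I’ve picked for illustration, not recommendations, and the module placement assumes a reasonably recent Checkstyle release:

```xml
<?xml version="1.0"?>
<!DOCTYPE module PUBLIC
    "-//Checkstyle//DTD Checkstyle Configuration 1.3//EN"
    "https://checkstyle.org/dtds/configuration_1_3.dtd">
<!-- Sketch only: all thresholds below are hypothetical. -->
<module name="Checker">
  <!-- keep lines readable -->
  <module name="LineLength">
    <property name="max" value="120"/>
  </module>
  <!-- keep source files (and so classes) to a manageable size -->
  <module name="FileLength">
    <property name="max" value="500"/>
  </module>
  <module name="TreeWalker">
    <!-- flag methods with too many independent paths -->
    <module name="CyclomaticComplexity">
      <property name="max" value="10"/>
    </module>
  </module>
</module>
```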
Of course, just like guardrails and headlights are not infallible in the real world, all the tools and checks in the world don’t ensure good quality code. One area where it’s particularly hard to ensure quality via automatic mechanisms is design. You can have code that’s 100% covered, passes every Checkstyle rule known to man, and still represents a terrible design. This is where the human factor comes into play – the automation merely ensures that you’re spending valuable human attention span on the stuff that really requires a brain, as opposed to things that can be verified mechanically.
Discipline is the glue that makes all of this work together – oftentimes developers themselves will have the “smell” of something done not quite right, but not feel like they’ve got the latitude to dig into it and clean it up, so they save it until the mythical “later”, which sometimes never comes. Management and team leads must also be disciplined enough to have patience while that kind of refactor happens – with the firm knowledge that they’ll get paid back by better productivity and a lower defect rate over the mid to long term.
A final warning: It’s easy to let headlights become leashes and for guardrails to become cubicle walls. Many agile practitioners are concerned, and rightly so, that adding tools and techniques can turn into a new dogmatism and inflexible methodologies. It’s up to us in the trenches to make sure we don’t let this happen, while at the same time getting all the juice we can out of helpful techniques and tools.
Properly applied, though, headlights and guardrails can be valuable tools in letting us reach our maximum velocity, while still arriving safely at our destination.