»
S
I
D
E
B
A
R
«
ServiceMix 4: Getting on the Bus
Aug 18th, 2009 by Mike

The Apache Group have recently released version 4 of the ServiceMix ESB. (Enterprise Service Bus), and I had a chance to work with it a bit over the weekend.

As we develop more and more services, the plumbing is starting to get complicated. We’re starting to ask questions like what versions of what services do we have? What dependencies do we have between services? How can/should we deploy services? Should we use REST or JMS for this service? How can we easily manage and monitor all these services in a fully clustered, scalable and high-availability environment?

In short, how can we develop powerful scalable services fast and not worry about the plumbing?

We’re not the only people to have asked these questions – and one powerful answer is to use an ESB.

ESB’s have come a long way, and the JBI standard and OSGi finally get together in this version of ServiceMix, and some pretty powerful magic happens as a result.

Just about every organization that writes sophisticated applications, particularly Java applications, have run into some of the problems that an ESB provides solutions for – the same is true of OSGi.

As Anthony Juckel put it on his blog, “Any sufficiently complicated Java program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of OSGi.“. This is quite true, in my experience, and can even be extended to include ESB’s, in that if you’ve got a number of JVM services interacting on a network to solve a problem, you’ve got some kind of an ESB going on, whether you call it that or not.

A common practice recently is to write software that provides “services”, then to combine these services into a complete system – in other words, we write services, but we compose systems (from these services). The awkward part comes (well, one awkward part) when we try to write the service in such a way that it can be used in many different ways – in other words, to maximize the potential for re-use – while at the same time trying to do the simplest thing that can possibly work.

No matter what service mechanism we choose, we lock ourselves in to some degree. If we write our service to use JMS (Java Messaging System), so we can support nice events and queues and other asynchronous goodness, we can’t easily then use our service from, let’s say, a client that looks for a REST service. If we choose SOAP, we can’t easily talk to our service from a client (which could be another service) that wants to post an event, instead of call a SOAP interface.

Normalized Messages
The Java JBI (Java Business Integration) standard answers this problem: it provides a single message-oriented interface for services, called the Normalized Message. This Normalized Message contains some header info, some properties, and XML payload, and optionally, binary attachments. It is not specific to any one protocol – in other words, it’s not SOAP, it’s not REST, it’s not JMS, it’s not SMTP, it’s not any of those.

JBI is a JSR for defining Java Business Integration, a way to cut through the complexity of different types of communications mechanisms for component message-passing (e.g. things such as REST, JMS, SOAP, RPC, email, FTP, HTTP, file system, and so on).

ServiceMix is one of several choices – once we have services written to the JBI standard, we can deploy them in any JBI-compliant ESB. Glassfish’s OpenESB is another option, for example.

Instead of worrying about the plumbing of the specific protocol we want to use, we can concentrate on writing our business logic, then rely on the ESB to “wire up” the messages between our service-providing components – whether they’re local (e.g. running in the same ESB, in which case it’s a simple in-memory message), or distributed, on another box in our cluster, or halfway around the world. The ESB provides adapters for each of the different protocols – so we can take our one service and talk to it via REST, JMS, SOAP, Email, carrier pigeon (ok, maybe not the last one – but you could write an adapter!).

This means I can write my service without even knowing what kind of protocol I’m going to be talking to it with – maybe it receives some messages via JMS, some via REST calls, a couple of SOAP services, and one in a while via an email. All that is the problem of the ESB – I just write one simple method in one simple interface to swallow the appropriate NormalizedMessage, do my processing whatever it is, and spit back out another NormalizedMessage.

This can significantly speed up the development of new services, by taking decisions about the plumbing away from the process of writing the correct business logic. We don’t even need to decide up-front if our service will be used by other services or not – we can wire it up as required.

The other important thing about a Normalized Message is that it refers to services by an identifier, but not by a specific location – in other words, a message from a client might say “I need this service to process this message”, but it’s up to the bus to figure out how to get the message to the service – this provides the opportunity to decouple services from each other. Let’s say you need to write a service that takes some kind of business message and sends an email. You don’t have to worry about passing your service information to let it find the email service at all – you just send the message to a service name (maybe “email”), and the ESB figures out the rest.

This means you can swap the email service out for another service that provides the same mechanism without worrying about the details.

JBI and OSGi Together
ServiceMix is one example of such an ESB, which marries the modularization advantages of OSGi with the above-described benefits of an ESB. This means each of our components can run in it’s own classpath space, with no interference with other components – even other components that might use different versions of the same jars. How many times have you discovered you’ve got 3 different versions of log4j on your classpath? This can lead to all sorts of weird and wonderful problems that only occur in certain circumstances, and are all-but-impossible to fix. By providing true modularization, OSGi solves this problem properly. Other solutions that I’ve seen applied in the past lead directly to the quote about every complex system having a half-baked OSGi implementation inside them :)

ServiceMix also provides a whack of existing components in two different JBI flavors: Service Engines and Binding Components. Services Engines are the ones that do the actual business logic, and Binding Components are the adapters, the pieces that route messages to and from various protocols. Many common tasks can be achieved by configuring the existing pieces, without writing a line of code.

When it is necessary to write some code, ServiceMix includes adapters to make that even easier as well – for instance, it provides a Service Engine to allow any POJO (plain old java object) to be used as a service, without that object having to actually implements the interface for handling NormalizedMessages directly. It doesn’t get any easier than that to write a service – just write the bean, plug in a bit of configuration, and you’re done – instance REST (JMS, email, SOAP, etc etc) service!

Spring integration is built into ServiceMix, so SE’s can have their dependencies injected automagically on deployment.

SE’s can be hot-deployed and hot-removed. Multiple versions of a single SE can be running without conflict, so “hot” upgrades can be done easily. You can have one version of a service that requires version 1.1 of another service running at the same time as some other service that requires 1.2 of the same service. This is flexibility in deployment, as it means that there’s no need to “drag along” other services when upgrading if you don’t want/need to.

Services can easily be managed and monitored via the administrative capabilities of the bus.

Binding Components
Binding components are how we talk outside the backbone – they take messages from the backbone and transmit or receive via an external protocol to non-ESB services – e.g. via REST, JMS, HTTP, file system, etc etc. There are dozens of existing binding components to support just about every protocol we’d care about.

Of course, all this power comes with a price: you have to learn to manage and deploy your ESB of choice, and even the simplest ESB is still a fairly complex beast. ServiceMix is no exception. The good news, however, is that it’s possible to pretty much ignore much of the available complexity in order to get started small, then learn more as you need it to start to leverage the power available.

A caveat, however: It’s important to use an ESB the way it’s intended, and not try to shoehorn things that don’t fit into it’s structure. If you find yourself struggling to write services, or having to write a lot of Binding Components, chances are you’re making inappropriate design decisions, or that you haven’t separated the concerns of a Service Engine from a Binding Component fully. This is not uncommon, as in many environments these two concerns are handled by a single component. If you’re using to writing REST services, for instance, with things such as JRA or Jersey, you’ll find it very odd to separate the processing from the “presentation” (even when that presentation is simply into XML or JSON to the REST client).

Once this technique becomes second nature, however, the true power of an ESB becomes clearer.

A whole fleet of buses…
Multiple ESB instances can be deployed and clustered, providing highly scalable and fault-tolerant system. The communication between ESB instances is handled entirely by the ESB itself – nothing about our components needs to be aware that they’re working in a clustered environment.

From my latest look at ServiceMix and it’s new release, I suspect I’ll be taking a deeper dive in the near future.

(Thanks to Craig Walls for pointing out that excellent quotation – and for all his help in understanding OSGi!)

Specs and Selenium together
Aug 9th, 2009 by Mike

I recently had the chance to dive into a new project, this one with a rich web interface. In order to create acceptance test around the (large and mostly untested) existing code, we’ve started writing specs acceptance tests.

Once we have our specs written to express what the existing functionality is, we can refactor and work on the codebase in more safety, our tests acting as a “motion detector” to let us know if we’ve broken something, while we write more detailed low-level tests (unit tests) to allow easier refactoring of smaller pieces of the application.

What’s interesting about our latest batch of specs is that they are written to express behaviours as experienced through a web browser – e.g. “when a user goes to this link and clicks this button on the page, he sees something happen”. In order to make this work we’ve paired up specs with Selenium, a well-known web testing framework.

By abstracting out the connection to Selenium into a parent Scala object, we can build a DSL-ish testing language that lets us say things like this:

object AUserChangesLanguages extends BaseSpecification {

  "a public user who visits the site" should beAbleTo {
    "Change their language to French" in {
      open("/")
      select("languageSelect", "value=fr")
      waitForPage
      location must include("/fr/")
    }
    "Change their language to German" in {
      select("languageSelect", "value=de")
      waitForPage
      location must include("/de/")
    }
    "Change their language to Polish" in {
      select("languageSelect", "value=pl")
      waitForPage
      location must include("/pl/")
    }
  }
}

This code simply expresses that as a user selects a language from a drop-down of languages, the page should refresh (via some Javascript on the page) and redirect them to a new URL. The new URL contains the language code, so we can tell we’ve arrived at the right page by the “location must include…” line.

Simple and expressive, these tests can be run with any of your choice of browsers (e.g. Firefox, Safari, or, if you insist, Internet Explorer).

Of course, there’s lots more to testing web pages, and we’re fleshing out our DSL day by day as it needs to express more sophisticated interactions with the application.

We can get elements of the page (via Xpath), make assertions about their values, click on things, type things into fields and submit forms, basically all the operations a user might want to do with a web application.

There are some frustrations, of course. The Xpath implementation on different browsers works a bit differently – well, ok, to be fair, it works on all browsers except Internet Exploder, where it fails in various frustrating ways. We’re working on ways to overcome this that don’t involve having any “if browser == ” kind of logic.

It’s also necessary to start the Selenium RC server before running the specs, but a bit of Ant magic fixes this.

We’ve got these specs running on our TeamCity continuous integration server, using the TeamCity runner supplied with Specs, where we get nicely formatted reports as to what’s pending (e.g. not finished being written yet), what’s passing, and what’s failing.

The specs written with Selenium this way are also a bit slow, as they must actually wait in some cases for the browser (and the underlying app!) to catch up. When run with IE as the browser, they’re more than just a bit slow, in fact…

They are, however, gratifyingly black-box, as they don’t have any connection to the code of the running application at all. For that matter, the application under test can be written in any language at all, and in this case is a combination of J2EE/JSP and some .NET.

There’s a lot of promise in this type of testing, even with it’s occasional frustrations and limitations, and I suspect we’ll be doing a lot more of it.

Acceptance Testing
Jun 30th, 2009 by Mike

Recently I’ve had the opportunity to consider the right qualities of a tool and/or framework for acceptance testing.

Acceptance tests can be found at a number of different levels, depending on how the story criteria is expressed. (See my previous post on levels of testing) . Often they’re called “executable specifications” if they’re written in such as way as to describe the behavior of a system in a given scenario.

Often they are functional or integration tests, and generally they are best expressed as “black box” tests, that is, tests that have no awareness of the internals of the code or component being tested – all they see is what goes in and what comes out, and they assert their success or failure based on those elements alone.

An acceptance test must be comprehensible to the story author, or whatever domain expert is going to actually do the “accepting” that the story has been satisfied. If they simply take the word of a developer that this big ball of code they see in front of them is testing what they said it should, that’s not as good as if they can actually read and understand the executable specification (test) themselves. Ideally, they should even be able to author their own acceptance tests, without a developer involved – perhaps using existing tests as an example.

So, some of the criteria for a good tool might be:

  1. High Visibility: Something that’s easy to see by all stakeholders
  2. Easily Understood: Something that’s easy to at least read by any non-developer domain expert
  3. Easily Authored: Something that’s easy to learn to write
  4. “Black Box”: Something that has no knowledge of the internals of the code under test, that only interacts with it from “outside the box”
  5. Realistic: Something that tests a version of the application deployed in a realistic environment, as close to production as possible.

Given these criteria, should the tests be part of the build process for the piece of code under test? For acceptance tests that span modules, this is not practical – you can’t actually run the test when you build a module, you can only run it after a certain set of modules has been built (and perhaps even deployed).

I would propose that acceptance tests don’t even belong with the build they’re testing. I think they belong elsewhere, in an independent location where they can be updated as stories are written and verified. Ideally, there should be a shared space for stories for an entire system or suite of applications, as often a piece of criteria spans multiple components – so organizing the tests by component is artificial at best.

Another question that comes up when considering tools for these kinds of tests is whether or not they must be stored and versioned with the code they validate. As a developer, my immediate reaction is “yes” – until I think about it a bit. How does it help me to know what tests a previous version of my code passed or did not pass? All that tells me is where I was in the past, not where I am now. How much inconvenience am I willing to put up with on the part of test authors to get this capability? Dealing with any version control system is extra work, especially for a domain expert/BA who’s not also a developer. I’m now convinced that where I am now is more important than where I was, and being able to easily write and run and make visible tests is more important to me than knowing what happened back in history.

Let’s examine a few different tools and ways of doing acceptance tests and look at the pros and cons:

JUnit
Developers working with Java have a choice of a number of excellent test frameworks, including TestNG and JUnit. They are likely already familiar with them, and the tests fit nicely into the build cycle of Java applications built with any of the more popular project build and management tools, such as Ant, Buildr, Maven, and so forth.

Our acceptance tests are written just like a functional test, in that they fire up whatever context our application should run within, then provide input to it and verify the output with assertions.

This kind of test is not well-suited for acceptance tests for a number of reasons. First, it’s hard to separate the code from the test to ensure we’re truly doing black-box testing. If, for example, we’re within the same VM as the thing we’re testing, it’s very tempting to manipulate objects directly in our test, as opposed to going through, lets say, the publish REST api. This also makes it more difficult to initialize a new testable instance of the application – to be truly independent from it, our test should spawn a whole new JVM from the system under test – which is non-trivial from within Java, although of course re-usable fixtures can be written to take the sting out of it.

This doesn’t help us much when our test spans multiple modules or components, however – then we must either create stubs for each of the components we depend on, or work in a much more complex deployment process to create a “live” version of each of our dependencies.

Finally, a JUnit test is not particularly readable by a non-developer, and not particularly visible, other than as a green or red bar in an IDE or CI server somewhere, so it fails a couple of our most important criteria.

JBehave, easyb, Specs, RSpec
BDD frameworks such as those listed above take our testing power in a different direction. They mostly rely on sophisticated DSL’s, enabling us to write our tests in a much more english-like fashion, often quite readable to the non-developers who have taken the time to learn a bit of the DSL.

They still suffer from some of the other problems described above, however – although some of these tools do offer better output formats to make their results more visible (such as the excellent Forms capability in specs, which can show test results in HTML table formats).

They of course still have a learning curve, as the DSLs in each case are another whole language to be learned.

FitNesse
A different approach can be seen in tools like FitNesse, which allows tests to be created in a Wiki, by editing web pages and inserting special markup to call test “fixtures”. These fixtures still have to be customized to the situation, of course, but with FitNesse we have the potential of being de-coupled from a single module or system under test, and of developing our tests independently of the code altogether.

The table approach of FitNesse still requires the test author to understand the fixtures available to FitNesse – this is essentially FitNesse’s DSL – but of course only a fairly small number of fixtures need be learned to be able to write a wide variety of tests.

Some projects, however, in an attempt to retain the ability to use FitNesse tests to verify previous versions of their application, take the step of checking the FitNesse tests in to their version control system, thereby losing some of that independence.

FitNesse has it’s own ability to track changes to it’s Wiki pages (and thus it’s tests), but this is not tied to the checkins or releases of the software under test.

FitNesse makes it easy for a non-developer domain expert to author tests by using existing tests as templates, then changing the inputs to the fixtures being used to create new tests from the existing building blocks.

These tests and their results are highly visible, especially if the FitNesse wiki is hosted and available to anyone (including developers) to use at any time to verify against a specific deployed instance of the entire suite of applications.

Of course, you’ve still got the issue of deploying a testable instance of your system to deal with in FitNesse. One approach I’ve seen to solve this is to actually have a fixture that can deploy the testable instance as part of the set up for the whole test suite, using versions to indicate the revision of code to be tested – e.g. you have a fixture that says “deploy module X version 123 to test environment 1″, “deploy module Y version 345 to test environment 1″ and so forth, then executes its tests against those deployed instances.

Of course, any database cleanup/reset to get to known state can happen before or just after the deployment, so your tests always start from a known point.

An important part of making this fully automatable is a mutex service: a way to make a call to a known location and say “check out test environment 1″, basically saying that test environment 1 is now busy until it’s released by another call. This ensures that you don’t start another test run on the same environment while it’s still in a transition state.

The mutex can also of course report on who has what environment in use, and since when, to detect failures that might not release a lock.

FitNesse has it’s warts, however, even in a scenario like this, but overall I’ve seen it succeed more often that other approaches for the high-visibility acceptance tests that many projects need, while tools such as easyb, specs, and RSpec are better for describing behaviours within a single module, and executing as part of that module’s build process.

Levels of Testing
May 25th, 2009 by Mike

I’ve had reason recently to do some thinking on the various “levels” of software testing. I think there’s a rough hierarchy here, but there’s some debate about the naming and terminology in some cases. The general principals are pretty well accepted, however, and I’d like to list them here and expound on what I think each level is all about.

An important concern in each of these levels is to achieve as high a level of automation as possible, along with some mechanism to report to the developers (or other stakeholders, as required) when tests are failing, in a way that doesn’t require them to go somewhere and look at something. I’m a big fan of flashing red lights and loud sirens, myself :)

Unit
Unit testing is one of the most common, and yet in many ways, misunderstood levels of test. I’ve got a separate rant/discussion in the works about TDD (and BDD), but suffice it to say that unit-level testing is a fundamental of test-driven development.

A unit test should test one class (at most – perhaps only part of a class). All other dependencies should be either mocked or stubbed out. If you are using Spring to autowire classes into your test, it’s definitely not a unit test – it’s at least an integration test. There should be no databases or external storage involved – all of those are external and superfluous to a single class that you’re trying to verify is doing the right thing.

Another reason to write comprehensive unit tests is that it’s the easiest place to fix a bug: there are fewer moving parts, and when a simple unit tests breaks it should be entirely clear what’s wrong and what needs to be fixed and how to fix it.

As you go up the stack to more and more complex levels of testing, it becomes harder and harder to tell what broke and how to fix it.

Generally unit tests for any given module are executed as part of every build before a developer checks in code – sometimes this will also include some functional tests as well, but it’s generally a bad idea for any higher-level tests to be run before each and every check-in (due to the impact on developer cycle-time). Instead, you let your CI server handle that, often on a scheduled basis.

Functional
Some people suggest that functional and integration are not two separate types, but I’m separating them here. The key differentiation is that a functional test will likely span a number of classes in a single module, but not involve more than one executable unit. It likely will involve a few classes from within a single classpath space (e.g. from within a single jar or such). In the Java world (or other JVM-hosted languages), this means that a functional test is contained within a single VM instance.

This level might include tests that involve a database layer with an in-memory database, such as hypersonic – but they don’t use an *external* service, like MySQL – that would be an integration test, which we explore next.

Generally in a functional test we are not concerned with the low-level sequence of method of function calls, like we might be in a unit test. Instead, we’re doing more “black box” testing at this level, making sure that when we pour in the right inputs we get the right outputs out, and that when we supply invalid input that an appropriate level of error handling occurs, again, all within a single executable chunk.

Integration
As soon as you have a test that requires more than one executable to be running in order to test, it’s an integration test of some sort. This includes all tests that verify API contracts between REST or SOAP services, for instance, or anything that talks to an out-of-process database (as then you’re testing the integration between your app and the database server).

Ideally, this level of test should verify *just* the integration, not repeat the functionality of the unit tests exhaustively, otherwise they are redundant and not DRY.

In other words, you should be checking that the one service connects to the other, makes valid requests and gets valid responses, not comprehensively testing the content of the request or response – that’s what the unit and functional tests are for.

An example of an integration test is one where you fire up a copy of your application with an actual database engine and verify that the operation of your persistence layer is as expected, or where you start the client and server of your REST service and ensure that they exchange messages the way you wanted.

Acceptance
Acceptance tests often take the same form as a functional or integration test, but the author and audience are usually different: in this case an acceptance test should be authored by the story originator (the customer proxy, sometimes a business analyst), and should represent a narrative sequence of exercising various application functionality.

They are again not exhaustive in the way that unit tests attempt to be in that they don’t necessarily need to exercise all of the code, just the code required to support the narrative defined by a series of stories.

Fitnesse, RSpec and Green Pepper are all tools designed to assist with this kind of testing.

Concurrency
If your application or service is designed to be used by more than one client or user, then it should be tested for concurrency. This is a test that simulates simultaneous concurrent load over a short period of time, and ensures that the replies from the service remain successful under that load.

For a concurrency test, we might verify just that the response contains some valid information, and not an error, as opposed to validating every element of the response as being correct (as this would again be an overlap with other layers of testing, and hence be redundant).

Performance
Performance, not to be confused with load and scalability, is a timing-based test. This is where you load your application (either with concurrent or sequential requests, depending on it’s intended purpose) with requests, and ensure that the requests receive a response within a specified time frame (often for interactive apps a rule is the “two second rule”, as it’s thought that users will tolerate a delay up to that level).

It’s important that performance tests be run singly and on an isolated system under a known load, or you will never get consistency from them.

Performance can be measured at various levels, but is most commonly checked at the integration or functional levels.

Load/Scalability
A close relative of, but not identical with concurrency tests are load and/or scalability tests. This is where you (automatically) pound on the app under a simulated user (or client) load, ideally more than it will experience in production, and make sure that it does not break. At this point you’re not concerned with how slow it goes, only that it doesn’t break – e.g. that you *can* scale, not that you can scale linearly or on any other performance curve.

Quality Assurance
Many Agile and Lean teams eschew a formal quality assurance group, and the testing such a group does, in favor of the concept of “built in” QA. Quality assurance, however, goes far beyond determining if the software perform as expected. I have a detailed post in the works that talks about how else we can measure the quality of the software we produce, as it’s a topic unto itself.

Alpha/Beta deployments
Not strictly testing at all, the deployment of alpha or beta versions of an application nonetheless relates to testing, even though it is far less formalized and rigorous than mechanized testing.

This is a good place to collect more subjective measures such as usability and perceived responsiveness.

Manual Tests
The bane of every agile project, manual tests should be avoided like the undying plague, IMO. Even the most obscure user interface has an automated tool for scripting the testing of the actual user experience – if nothing else, you should be recording any manual tests with such a tool, so that when it’s done you can “reply” the test without further manual interaction.

At each level of testing here, remember, have confidence in your tests and keep it DRY. Don’t test the same thing over and over again on purpose, let the natural overlap between the layers catch problems at the appropriate level, and when you find a problem, drive the test for it down as low as possible on this stack – ideally right back to the unit test level.

If all of your unit test are perfect and passing, you’d never see a failing test at any of the other levels, theoretically. I’ve never seen that kind of testing nirvana achieved entirely, but I’ve seen projects come close – and those were projects with a defect rate so low it was considered practically unattainable by other teams, yet it was done at a cost that was entirely reasonable.

»  Substance: WordPress   »  Style: Ahren Ahimsa