<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Mike Nash's Two Cents Worth</title>
	<atom:link href="http://php.jglobal.com/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://php.jglobal.com/blog</link>
	<description>Explorations in Software Development</description>
	<lastBuildDate>Fri, 04 May 2012 18:41:54 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Why I don&#8217;t Mock much</title>
		<link>http://php.jglobal.com/blog/?p=1347</link>
		<comments>http://php.jglobal.com/blog/?p=1347#comments</comments>
		<pubDate>Fri, 04 May 2012 18:41:54 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Scala]]></category>
		<category><![CDATA[Software Testing]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1347</guid>
		<description><![CDATA[I promised some time back to provide some of my team members with parts of the background material I use to reason my way to my current aversion to mocking, and the reason I prefer stubs over mocks for testing applications.
It turned out this was a longer road than I had anticipated, but reviewing my reasoning [...]]]></description>
			<content:encoded><![CDATA[<p>I promised some time back to provide some of my team members with parts of the background material I use to reason my way to my current aversion to mocking, and the reason I prefer stubs over mocks for testing applications.</p>
<p>It turned out this was a longer road than I had anticipated, but reviewing my reasoning was helpful, so I&#8217;m finally providing some of the reading material related to it here, as well as my own conclusions for consideration.</p>
<p>First, it&#8217;s good to understand the differences between stubs and mocks (and other kinds of test fakes). Martin Fowler explains this quite precisely <a target="_new" href="http://martinfowler.com/articles/mocksArentStubs.html">in this article.</a></p>
<p>Sub-call verification (e.g. orchestration testing) is an anti-pattern and impedes refactoring. Mocks use behaviour verification at a low level, and encourage testing at that level.</p>
<p>Another description of the difference, with a slightly different take, is <a target="_new" href="http://blog.callistaenterprise.se/2010/11/12/stubs-n-mocks/">in this article.</a></p>
<p>Another analysis of the difference, and of the anti-patterns imposed by mocking, <a target="_new" href="http://www.disgruntledrats.com/?p=620">is given in this article.</a></p>
<p>A succinct description of the coupling issue <a target="_new" href="http://thought-tracker.blogspot.ca/2006/05/should-mock-objects-be-considered.html">is given here.</a></p>
<p>In addition to the issues described above, there are two pragmatic issues that affect my preference, and they depend more on my current language of choice: Scala. One issue is that the most capable mocking framework I know of, Mockito, doesn&#8217;t play well with Scala &#8211; it can&#8217;t, for example, properly mock methods defined in a trait. As a result, because I TDD, I&#8217;m tempted not to use some of the more powerful features of Scala just because my mocking framework of current choice doesn&#8217;t handle them well. Of course, new frameworks coming along, like ScalaMock, might make this argument moot. When I found myself creating behaviour-less traits just to provide easier mocking in my tests, I realized there was a problem.</p>
<p>The other issue comes from the fact that it&#8217;s so darned easy to do stubs in Scala &#8211; given the low-ceremony ability to override and create closures and dynamic instances, it&#8217;s been my experience that mocks are actually more code and less comprehensible in Scala. This is not at all the case in Java, however, where mocks can be massively more succinct than stubs.</p>
<p>I subscribe to the original intent of object-orientation, which is that objects are encapsulations of state and behaviour, not data structures, and that the way to communicate between them is messages, not method calls. As a result, I find the verification of behaviour at the low level counterproductive. I prefer to verify the behaviour of objects (services, domain objects, etc) at a high level &#8211; e.g. does this thing DO what I want it to do, as opposed to asking &#8220;does it do the thing I&#8217;m testing the WAY I think it does&#8221;. These are two very different questions.</p>
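<p>As a rough illustration of the difference between those two questions, here is a minimal sketch of a hand-written stub used with state verification. It is written in Python rather than Scala purely to keep it short, and every name in it (StubRateSource, PriceConverter, rate_for, convert) is hypothetical and invented for this example.</p>
<pre>
```python
class StubRateSource:
    """Hand-written stub: returns a canned rate and records nothing."""

    def __init__(self, rate):
        self._rate = rate

    def rate_for(self, currency):
        return self._rate


class PriceConverter:
    """Hypothetical service under test; it depends on some rate source."""

    def __init__(self, rate_source):
        self._rate_source = rate_source

    def convert(self, amount, currency):
        return round(amount * self._rate_source.rate_for(currency), 2)


# State verification: assert on WHAT the converter produced,
# not on which calls it made to its collaborator along the way.
converter = PriceConverter(StubRateSource(1.25))
result = converter.convert(10.0, "CAD")
assert result == 12.5
```
</pre>
<p>Nothing in the test mentions how many times rate_for was called or with what arguments, so the converter could cache rates or batch its lookups and the test would still pass.</p>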
<p>At the same time, I have a strong preference for functional programming, which also supports easier testing, making the issues of mocking less critical, as the stubs I need to test more fully functional code are trivial compared to what highly procedural code using mutable objects requires.</p>
<p>After all, why do we test? This is a larger question, and a topic for another blog post, but for me, the answer is two-fold: to give me some level of confidence that the system does what I specify it to do, and to allow me to refactor freely and frequently while maintaining that confidence. As a result, the last thing I want to do is to ensure the system does what it does <i>the same way</i> it did it a minute ago &#8211; I don&#8217;t care how it does the thing it does, only that it does. In fact, I must <i>not</i> care about the how, or I can&#8217;t refactor without re-writing large hunks of my tests.</p>
<p>This preference for thinking about the <i>what</i> instead of the <i>how</i> is also, incidentally, what causes my preference for using Guice when managing services and dependency injection: because Guice allows me to quickly add new dependencies and change constructor signatures without having to go back and re-write tests, it allows me to go faster than manual module creation does, so I prefer it.</p>
<p>Whether I have annotations telling me my code is injected or a module class somewhere doing the injection manually for me is less interesting, although at least the annotations tell me <i>in the class I&#8217;m looking at</i> that dependency injection is taking place.</p>
<p>In any event, that preference is a side issue to the mocking/stubbing preference.</p>
<p>I hope this material provides some food for thought.</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1347</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Software Craftsmanship: Principles and Practices</title>
		<link>http://php.jglobal.com/blog/?p=1342</link>
		<comments>http://php.jglobal.com/blog/?p=1342#comments</comments>
		<pubDate>Sun, 29 Apr 2012 19:21:40 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Software Craftsmanship]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1342</guid>
		<description><![CDATA[For a long while now I&#8217;ve been musing on this blog about some of the ideas surrounding what we call &#8220;software craftsmanship&#8221;. After all this musing, I&#8217;ve realized that the topic is large enough and important enough to explore more deeply.
As many in the industry have observed, software is getting (arguably, already is) incredibly important [...]]]></description>
			<content:encoded><![CDATA[<p>For a long while now I&#8217;ve been musing on this blog about some of the ideas surrounding what we call &#8220;software craftsmanship&#8221;. After all this musing, I&#8217;ve realized that the topic is large enough and important enough to explore more deeply.</p>
<p>As many in the industry have observed, software is getting (arguably, already is) incredibly important to the normal functioning of our daily lives, and it appears this trend is likely to continue for some time. This alone, even if not for the many other reasons, is sufficient to take software craftsmanship seriously, and to explore its meaning and its disciplines.</p>
<p>I&#8217;ve had the good fortune to make my living in the software development industry for almost three decades (it&#8217;ll be 30 years this June, if I remember correctly), and in that time I&#8217;ve seen what I would call good craftsmanship, and I&#8217;ve seen what I will charitably refer to as &#8211; well &#8211; &#8220;not&#8221; good craftsmanship, and I&#8217;ve tried to pay attention to the differences between them. I&#8217;ve seen some technologies and techniques come, then go, then in some cases come again, a couple even for the third time. As has been many times observed, if we do not study history, however brief, we are doomed to repeat it &#8211; for better or worse.</p>
<p>I&#8217;ve seen what works, and what doesn&#8217;t, and I&#8217;ve thought and talked about why, and I&#8217;ve decided to try to summarize those experiences into an exploration of the subject, concentrating on the underlying principles or beliefs of good software craftsmen and the practices that usually derive from those principles.</p>
<p>I&#8217;m going to summarize my musings here in the blog, but I&#8217;m also planning on writing a full-length book on the subject. Tentative title &#8220;Software Craftsmanship: Principles and Practices&#8221;. I&#8217;m not sure yet if it&#8217;ll be only an eBook or if dead trees will be involved, but that&#8217;s just the details of delivery.</p>
<p>Any work of this nature is going to be far more valuable to other developers if it includes as many perspectives as practical, so I welcome input, comments, collaboration, discussion, dissent, digression, debate, disbelief and all that other stuff as well.</p>
<p>My audience for this book will be the aspiring software craftsman. Software is in the unfortunate position of being such a new discipline that there&#8217;s not a large body of knowledge easily grasped by the apprentice that&#8217;s not specific to an area of technology. To use a medical analogy, we&#8217;ve got books on treating broken fingers aplenty, but not that many on the general experience of being a doctor. Of course, if we imagine a world where the idea of doctors has only been around for fifty years or so, this lack is more understandable. Still, how can we develop more good software craftsmen (and craftswomen, of course), if those of us that have been kicking around for a while don&#8217;t record and share our experiences?</p>
<p>In any event, please expect a torrent of thinking on this topic to be poured out here starting soon, and please drop me a line if you&#8217;d like to get involved in the process. You&#8217;ve been warned!</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1342</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Terrible Case of the Incrementing Numeric Key</title>
		<link>http://php.jglobal.com/blog/?p=1331</link>
		<comments>http://php.jglobal.com/blog/?p=1331#comments</comments>
		<pubDate>Tue, 24 Apr 2012 02:12:52 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1331</guid>
		<description><![CDATA[I&#8217;ve recently had a chance to get kicked again in a place I&#8217;d rather not get kicked by one of the many horrible side-effects of an incrementing numeric key.
This has made me want to once again rant and rave about the many negative attributes of this pattern. Instead, I&#8217;m going to slow down and try [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently had a chance to get kicked again in a place I&#8217;d rather not get kicked by one of the many horrible side-effects of an incrementing numeric key.</p>
<p>This has made me want to once again rant and rave about the many negative attributes of this pattern. Instead, I&#8217;m going to slow down and try to enumerate them more clearly <img src='http://php.jglobal.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>First, let&#8217;s examine what pattern we&#8217;re talking about: In its most common form, it is the pattern of assigning an incrementing numeric value as the synthetic key to a persistent entity of some sort, often in a relational database. One of the reasons it&#8217;s often found in relational databases is that it&#8217;s easy: many databases provide an automatic facility for assigning incrementing numeric keys to a table.</p>
<p>I won&#8217;t even go near the debate of synthetic keys vs. natural keys for persistence, so I won&#8217;t count that for or against incrementing numeric keys.</p>
<p><b>Can&#8217;t be easily distributed</b><br />
A huge flaw in incrementing numeric keys is that their assignment is, by definition, not distributable without taking special steps. Even if the database in question is assigning the keys for you, all new records must be inserted into the same instance of the database to ensure duplicates are not assigned.</p>
<p>Incrementing numeric keys also don&#8217;t shard well, as their distribution is completely predictable, causing some problems with large datasets that are avoided with other more easily hashable types of keys.</p>
<p><b>Single point of contention</b><br />
Whether it be a database or the infamous &#8220;next number table&#8221;, the place where the key is assigned with an incrementing key is a point of contention, something any scalable system wants to avoid at all costs.</p>
<p><b>Implies order, but doesn&#8217;t make it explicit</b><br />
An incrementing key implies sequence or order, but does not explicitly define order. That is, if you sort the entries by the key, they are (supposedly) in order of insertion. The key isn&#8217;t meant for this, however, and it is far better to use an actual meaningful order, such as a created date/time stamp or other sequence that actually has a purpose in the domain model.</p>
<p><b>Temptation to assign meaning</b><br />
The most insidious problem with an incrementing numeric key is that it is often too tempting to assign meaning to the key, where none actually exists. If the key is supposed to be a meaningless unique identifier, then use a meaningless unique identifier, not an easily recognized incrementing value that will tempt users of the data (if you expose the key &#8211; arguably a second problem) to infer &#8220;ah, this number is greater than 1000, therefore this record was inserted after June the 12th&#8221;, or, worse, setting a new starting point in the number for a certain class of data &#8220;keys greater than 1000000 are created by customers, less than 1000000 are created by the system&#8221;. I&#8217;ve seen every kind of crazy variation of this kind of thing, and they&#8217;re all trouble, and none of them could have happened with a nice incomprehensible GUID instead of an incrementing numeric id.</p>
<p><b>Fixed Size</b><br />
While it is entirely possible to set an incrementing numeric key to a reasonable size, allowing for all possible expansion, often this is not done. This will frequently lead to that horrible moment in the future where you realize the data set is going to scale much larger than was anticipated, and the key must be resized. Often this happens as a result of a bizarre tendency to want to &#8220;save space&#8221; by trading off a couple of bytes.</p>
<p>To my mind, any <i>one</i> of these disadvantages is enough to avoid the trap of the incrementing numeric key, but all of them put together add up to a pattern I&#8217;d most certainly avoid unless there were highly compelling reasons for it, and &#8220;because it&#8217;s easy&#8221; is not a compelling enough reason.</p>
<p><b>Alternatives</b><br />
Not one to complain of a problem without presenting some possible solutions, I thought I&#8217;d quickly mention some of the alternatives to the numeric incrementing key:</p>
<p><b>Natural Key</b><br />
One way to avoid a synthetic key using a numeric incrementing value is to avoid a synthetic key altogether. If you use a naturally unique value from the data in question as the key, the entire problem goes away &#8211; although, arguably, other problems can arise, but these are outside the scope of this post.</p>
<p><b>GUID or ObjectId</b><br />
My favorite solution to the key issue, if a natural key is for some reason not appropriate, is to use a randomly-generated GUID. In the bson libraries supplied with MongoDB&#8217;s driver, the ObjectId value plays a similar role: it is a 12-byte identifier, usually shown as a 24-character hexadecimal string, built from a timestamp, a per-process component and a counter rather than from purely random bits. The chance of a collision is vanishingly small, and either kind of value makes an excellent key for both key-value and relational databases.</p>
<p>Java&#8217;s own libraries also provide a GUID generator (java.util.UUID), suitable for the same purpose.</p>
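<p>To show how little is involved, here is a minimal sketch using the uuid module from Python&#8217;s standard library in place of the Java and MongoDB generators mentioned above. The helper name new_key is hypothetical; the point is only that the key is produced inside the application, with no database round trip and no shared counter.</p>
<pre>
```python
import uuid


def new_key():
    """Return a random (version 4) UUID as a 32-character hex string."""
    return uuid.uuid4().hex


# Any number of processes can do this at once without coordinating.
keys = {new_key() for _ in range(1000)}
assert len(keys) == 1000  # no collisions observed in this small sample
```
</pre>
<p>Because the value carries no sequence, it also gives users nothing to read meaning into.</p>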
<p><b>Conclusion</b><br />
I hope this article gives you a bit of food for thought about numeric incrementing keys, and I encourage you to consider the available alternatives next time this issue comes up in your design work.</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1331</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Ad-Hoc Reporting Problem</title>
		<link>http://php.jglobal.com/blog/?p=1309</link>
		<comments>http://php.jglobal.com/blog/?p=1309#comments</comments>
		<pubDate>Sun, 11 Mar 2012 03:57:52 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Agile]]></category>
		<category><![CDATA[Rant]]></category>
		<category><![CDATA[Software Craftsmanship]]></category>
		<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1309</guid>
		<description><![CDATA[I&#8217;ve recently had the opportunity to consider the impact on a service-oriented/event-driven architecture of the requirement for &#8220;ad-hoc reporting&#8221;, and I&#8217;d like to share some of my reasoning and conclusions here.
In the last many years, I&#8217;ve built systems that consist of a set of services. These services are used by a web UI and various [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently had the opportunity to consider the impact on a service-oriented/event-driven architecture of the requirement for &#8220;ad-hoc reporting&#8221;, and I&#8217;d like to share some of my reasoning and conclusions here.</p>
<p>In the last many years, I&#8217;ve built systems that consist of a set of services. These services are used by a web UI and various other client applications to interact with users, load data, process that data in various ways, and export some of that data to other systems.</p>
<p>A fundamental goal of using a service-oriented architecture is to keep the cost of new changes as low as possible. That&#8217;s not the same as going as fast as possible, but it does often result in a high and sustainable pace of change as well. The advantage of this is that it allows the whole organization to respond to changes in the market quickly and without excessive cost. </p>
<p>That cost is broken into two parts &#8211; the apparent costs, which are the developer and operations time costs to implement and deploy a new feature, and the non-apparent costs, which are the accumulating technical debt of each change. The apparent costs are easy to measure: how many hours, days, weeks did this change take, and how many developers and other staff did I pay to do the change.</p>
<p>The non-apparent costs, relating to technical debt, are much less obvious. This cost is not directly measured in dollars at first &#8211; it is measured in dollars later. The best way to see it is to consider the cost difference per feature over time. Let me use an example to illustrate: Let&#8217;s say that we&#8217;re good at breaking up feature requests into approximately equal-sized pieces, and that the average cost of a feature with a given team is, say $100. If we&#8217;re accumulating no technical debt at all in a given sub-system, then our cost per new feature remains about $100 (not taking into consideration any changes in the cost of talent over time for the moment). </p>
<p>If, on the other hand, we are accumulating technical debt, then the cost per feature over a few months might rise to, say, $110. Then in another couple of months it is $120, then $140, and so on. Each new feature is only worth so much to our business, to look at the other side of the equation. Let&#8217;s say, for the sake of argument, that a given new feature is worth $150 per month more in revenue. Simple economics tells us then that paying $150 for that feature allows it to pay for itself in a single month &#8211; so it makes sense to write it, even if the cost for that feature has risen to $150.</p>
<p>Unfortunately, it&#8217;s not this simple: the revenue from a feature is not easy to measure or estimate, and it does not remain constant over time &#8211; a shiny new feature might net us $150 more the first month, then $140, then $120… until the shine is gone and it&#8217;s not contributing to our product at all (arguably, that&#8217;s the time to take it out again, but that also costs money &#8211; and it&#8217;s hard to know).</p>
<p>It also matters, often a lot, WHEN a new feature is released. We might gain $150 profit from a feature if we release the month before our competition, but it might be worth much less after everyone else has it. In fact, we might still need our new feature just to keep up with the Joneses &#8211; we might have to pay to develop it despite no increase in profit, just to avoid losing customers to the competition.</p>
<p>We can make a number of observations here. It is clearly desirable to get each new feature for as low a cost as is sustainable &#8211; that is, we&#8217;d like to keep paying about the same per feature over time. At the same time, we have to carefully select where we spend our money per feature so we have the right feature at the right time in order to maximize our revenue &#8211; this means the time to implement a feature is as important as its cost, sometimes more so. We might be willing to pay more to have a feature sooner, and we must be able to make that decision.</p>
<p>On any given system, it is common for the cost per feature to rise over time. This is due to a number of factors, but a critical one is the underlying design of the system, its level of coupling and cohesion, its independence from other systems, etc. The RATE of the increase over time is the most useful measure of the effectiveness of the design &#8211; if it&#8217;s increasing sharply, we may hit what I&#8217;ve called previously a point of &#8220;technical debt bankruptcy&#8221;, where it&#8217;s not financially responsible to add new features, as we can&#8217;t possibly make enough money from them to make it worth the ever-increasing costs.</p>
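<p>The bankruptcy point can be put into rough numbers. The sketch below makes two simplifying assumptions that are mine, not measurements: each feature costs a fixed percentage more than the one before it, and every feature is worth the same fixed amount to the business. The function name and parameters are hypothetical.</p>
<pre>
```python
def features_until_bankruptcy(base_cost, growth_rate, value_per_feature):
    """Count the features we can build before one costs more than it is worth.

    base_cost: cost of the first feature (hypothetical dollars)
    growth_rate: fractional cost increase per feature, e.g. 0.10 for 10%
    value_per_feature: what each feature is worth to the business
    """
    if not growth_rate > 0:
        raise ValueError("with no cost growth there is no bankruptcy point")
    cost = base_cost
    built = 0
    while value_per_feature >= cost:
        built += 1
        cost *= 1 + growth_rate  # technical debt compounds on each feature
    return built


# Costs run 100, 110, 121, 133.1, 146.41, then 161.05, which exceeds 150.
assert features_until_bankruptcy(100, 0.10, 150) == 5
```
</pre>
<p>Lowering the growth rate, which is what good design and periodic refactoring buy, pushes that point further out; it does not change the cost of the first feature at all.</p>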
<p>Increasing costs also make the order of features more critical &#8211; if we get a less important feature done before a more important one, we&#8217;ve increased the cost for the more important one, hastening our collision with the bankruptcy point all the more.</p>
<p>I go through all of this in order to highlight the importance of being able to produce new features sustainably, at a predictable pace and at either a constant or a slowly-increasing cost. The delta over time of the average cost per feature is one good metric of the technical debt &#8211; often, some non-feature work and refactoring can in fact reduce that delta, at least for a while, so we sometimes see a pattern of bursts of new features, then a slower period where we &#8220;pay down&#8221; some of that debt, then go again, without letting it get out of hand. This is also a sustainable pattern.</p>
<p>Creating new features more rapidly, with less thought to design and architecture, looks good at first &#8211; we get features more rapidly, and for an apparently low cost. The effect on the delta, however, is exponential in this situation &#8211; the next feature doesn&#8217;t cost a bit more, it costs a LOT more, as we&#8217;re having to be constrained by the decisions we made in a hurry last time. That&#8217;s not to say that it&#8217;s not sometimes the right decision to go fast, and to acknowledge that we&#8217;re accumulating technical debt.</p>
<p>The mistake that I&#8217;ve seen over and over again, however, is to disavow the existence of the delta, of the cost of technical debt entirely. To not take it into account when making decisions is to make decisions without all the facts.</p>
<p>One of these decisions is the difference in importance between various design and architecture decisions, such as service-oriented architectures and event-driven architectures. One of the primary features of a well-crafted SOA/EDA is the ability to decouple systems &#8211; by this I mean that changes and new features in one system do not necessitate changes and impact on another. You can change your customer-relation-management system and its features without breaking (or even affecting) your shopping-cart system, for instance. This means that the overall cost of the change to your CRM is lower, as you don&#8217;t need to take into account the fact that you&#8217;ll break the shopping-cart &#8211; or any other system.</p>
<p>One of the ways that SOA/EDA creates this advantage is by specifying the communication between systems &#8211; the CRM, for instance, might respond to events from the shopping cart when customers buy things. It has no idea how the shopping cart works, or what its process for selling is, it only knows the well-defined API between it and the cart, whether this be events, REST calls, or some other mechanism &#8211; the main thing is that one system does not &#8211; in fact, must not &#8211; know the &#8220;internals&#8221; of the other. </p>
<p>This is true in good system design in general, not just in a SOA architecture. When designing with objects, for instance, there is the concept of &#8220;information hiding&#8221;, where the internal details of the implementation of even a single object are not exposed carelessly to the outside world. They are, instead, exposed only through carefully crafted interfaces, e.g. an API. The idea scales up to entire services just the same.</p>
<p>So, you might by now ask, what does this have to do with ad-hoc reporting? </p>
<p>Ad-hoc reporting has traditionally relied on access directly to a persistent store of some kind, typically a relational database, to be able to produce reports and replies to queries that were not anticipated when the system was built (hence the ad-hoc part). In order to do this, and to formulate a meaningful report, the report creator has to understand the meaning of the stored data, especially how it relates to other stored data, so that sets of data can be joined. This means that the domain design must be represented not in the logic of the application, but in the schema of the database. Not just the logic of one domain, or one service, either, but the logic of any that you wish to join together for ad-hoc purposes.</p>
<p>Good service-oriented architecture that relies on an API treats persistence as an internal implementation detail of the service &#8211; e.g. it&#8217;s the service&#8217;s business, and its alone, how it manages to store its data. Whether it&#8217;s in a database, a file, or in some kind of magical cloud storage doesn&#8217;t matter at all &#8211; and MUST not matter at all &#8211; to the clients of the service, or we&#8217;ve lost a key advantage.</p>
<p>But what is to stop a service from simply storing its data in a manner that can in fact later be used by ad-hoc tools? Where&#8217;s the harm, you might ask?</p>
<p>The harm is what brings us back to the discussion of technical debt again. </p>
<p>Let&#8217;s say we design our CRM service to store its data in a nice relational database, so we can support ad-hoc queries later. Now we leave the CRM service and go to the shopping cart service &#8211; same thing, we should store our data in a nice normalized schema, so we can later come along with SQL and do our queries. </p>
<p>Right away we&#8217;ve made some fatal assumptions: the domain model must now be entity-based, in order for ad-hoc queries to be meaningful. What I mean is this: Let&#8217;s say in our shopping cart we decided to create our customer domain object from a series of &#8220;customer-edit&#8221; events. Our service stores the events, and uses event sourcing (often paired with the CQRS pattern) to create the customer domain object from the event entities. How do we get a list of customers from our ad-hoc reporting tool? Well, we&#8217;ve got two choices &#8211; either create the customer in some kind of view, essentially duplicating the logic that creates the customer domain object in the service, or don&#8217;t store the customer that way &#8211; store the customer as the completed domain object, or a relational representation of that object (as a customer is rarely &#8220;flat&#8221; in practice).</p>
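<p>To make the problem concrete, here is a minimal sketch of a service that derives customers from stored events. The event shape (CustomerEdited with a customer id, a field name and a value) and the function names are assumptions made up for this illustration, not taken from any real system.</p>
<pre>
```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class CustomerEdited:
    """Hypothetical 'customer-edit' event: one field changed for one customer."""
    customer_id: str
    field: str
    value: str


def apply_event(customers, event):
    """Return a new customer map with one event applied."""
    current = customers.get(event.customer_id, {})
    updated = {**current, event.field: event.value}
    return {**customers, event.customer_id: updated}


def rebuild_customers(events):
    """Replay the stored events, oldest first, to derive current customers."""
    return reduce(apply_event, events, {})


events = [
    CustomerEdited("c1", "name", "Ada"),
    CustomerEdited("c1", "email", "ada@example.com"),
    CustomerEdited("c1", "name", "Ada L."),
]
customers = rebuild_customers(events)
assert customers == {"c1": {"name": "Ada L.", "email": "ada@example.com"}}
```
</pre>
<p>The only thing persisted here is the list of events. A reporting tool pointed at that store sees three rows and no customer, so it would have to re-implement apply_event in a view to answer even the simplest question.</p>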
<p>Now we&#8217;re seeing where the decision to design for ad-hoc reporting is skewing our service design &#8211; it&#8217;s making us consider doing things in a non-optimal manner in order to keep the world in a state where ad-hoc reporting can operate, and causing us to &#8220;bleed&#8221; domain knowledge from our service into a database schema.</p>
<p>Let&#8217;s say we&#8217;re OK with that decision, that we don&#8217;t mind losing a bit of design freedom in order to have our reporting choices. </p>
<p>Now let&#8217;s consider the CRM service. It also deals with customers, and might also have used some kind of event architecture to do its work. Let&#8217;s put it in a relational database as well, and structure its schema so we can do our reporting. It is very probable that the customer we&#8217;re dealing with in the shopping cart is the same customer (conceptually) as we&#8217;re dealing with in the CRM. OK, then we&#8217;d like to be able to join between these two elements, and produce ad-hoc reports. This means that the CRM system must share a data definition of the key of the customer table with the shopping-cart system &#8211; i.e. they must agree on data-type, and on length, and on the way the key is produced, etc.</p>
<p>Now we&#8217;ve started coupling these applications: any change to the customer key in CRM means a corresponding change to the customer key in the cart. Well, at that point, why don&#8217;t we just merge the two tables and have just a single definition of customer? Wouldn&#8217;t that be simpler? At first blush, it certainly seems so. OK, now we&#8217;ve coupled a bit further, essentially making CRM and shopping cart into a single system. Where&#8217;s the harm? This brings us back to technical debt. A change to the CRM now means we must consider the impact of that change on the shopping cart, increasing the cost of that change. How much? It&#8217;s hard to say, it depends on the change, but this is a simple example &#8211; this pattern leads the way to the &#8220;big ball of mud&#8221; design, where you essentially have a single database for all services, at which point the temptation to simply use the database as a means of communication between applications becomes too great &#8211; and once you&#8217;ve done this, you don&#8217;t have lots of small applications, you just have a single application, one giant monolith that is nearly impossible to maintain, with exponentially increasing costs per change.</p>
<p>Another cost must be taken into consideration as well &#8211; when the ad-hoc reports are created, they are of course created at a point in time, and the schema of our mega-app is at a certain point. As the schema changes (and it will, albeit more slowly over time, as it becomes too expensive to consider easy schema changes), the ad-hoc reports, for which we went to all this trouble, stop working. I&#8217;ve never seen a &#8220;one-time&#8221; report that didn&#8217;t get saved and run over and over again, and every time the schema changes, this report breaks, and the user who used the &#8220;easy&#8221; reporting tool comes screaming down to find a developer to help him fix his report.</p>
<p>What we&#8217;ve essentially done is created another application: ad-hoc reporting. It is every bit as bound to the schema as all the other apps, and requires as much, perhaps more, maintenance over time. The perceived &#8220;savings&#8221; of having users do their own reports is eventually lost altogether, and is usually a net loss, costing more than if the users simply requested reports via the APIs of the services in the first place (even though that takes developer time to create).</p>
<p>This doesn&#8217;t happen all at once, of course. An expert team of developers that moves in this direction can manage it quite well, at least for a while. When the pressure to develop more features more quickly increases, however, it&#8217;s common that less experienced developers will start to introduce subtle couplings that go unnoticed until you try to go fast; then suddenly something breaks.</p>
<p>The logical end of this pattern is to put more and more of the &#8220;logic&#8221; of the mega-app into the database, first in the schema, by having richer and richer &#8220;domain&#8221; objects stored (for instance, adding an &#8220;account total&#8221; field to the customer because it&#8217;s easier than adding it up every time &#8211; then when the logic to add it up changes in the app, it must change in the database as well). This then leads to the madness of stored procedures and triggers, as we try to &#8220;program&#8221; the anemic domain model of the database. At this point, costs per new feature are astronomical, the bug rate is through the roof, and I usually get called as a consultant <img src='http://php.jglobal.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>There are actually other losses that I&#8217;ve not even detailed when we move towards a shared schema: The rapid and ever accelerating cost increases of a shared schema are only the most visible. The subtle costs of making your developers have to think more about each and every schema change are perhaps, in the long run, worse. For instance, if I&#8217;m working on a service and I know that the only thing that uses its persistence mechanism is that service, I can tune the persistence for maximum advantage &#8211; I don&#8217;t have to worry about all the other use-cases of that store, because there aren&#8217;t any. If I want to change the schema, I don&#8217;t have to form a committee: as long as the service continues to pass its integration tests, I can change it all I want, now, right away, and in as radical a manner as I need. This is one reason that developers love post-relational database systems such as MongoDB and Cassandra. It&#8217;s almost effortless to change the schema, as it&#8217;s defined in your application code, where it ought to be. If the domain design is only the business of the application, then where else would it be? If the domain, on the other hand, is shared between many applications, or even two, then I immediately have artificial constraints on change, which increases cost and decreases velocity. Key-value/post-relational databases also avoid that horrible impedance mismatch we call object-relational mapping layers, which are again an impediment to design freedom and easy change. Schemas that are shared and are outside the application have a tendency to fossilize &#8211; they become so hard to change that in the end we don&#8217;t change them at all, and make poor choices in our applications to &#8220;work around&#8221; their inadequacies, further reducing quality and increasing cost and time.</p>
<p>What are the alternatives to this scenario? How can we build systems to support the need for reporting without dissolving into a ball of coupled goo? Are these goals mutually incompatible?</p>
<p>First, it&#8217;s a false economy to say that we&#8217;ll design for easy ad-hoc reporting to save the time for developers to write reports: As we&#8217;ve seen, if we make the system coupled and schema-transparent enough to make it easy for power users to write reports, we&#8217;ll waste far, far more developer time than if we just admit we have a requirement for reporting and put it in the backlog. To think there&#8217;s a &#8220;cheap&#8221; way to get decent reporting is to defy reality &#8211; there just plain isn&#8217;t, and we should be professional enough to admit it. </p>
<p>If we&#8217;re ok with a tightly coupled system that costs a lot to change, but that supports easy ad-hoc reporting, that&#8217;s fine; let&#8217;s make that tradeoff and be upfront about it.</p>
<p>What if we want to have our cake and eat it as well? Is there any middle ground? Sure there is, and it&#8217;s the same one that&#8217;s been around for decades: it&#8217;s called a data warehouse.</p>
<p>Given enough access to the APIs of the services that make up our decoupled SOA/EDA-driven system, it&#8217;s straightforward to build a simple data warehouse optimized for ad-hoc queries. What we&#8217;re really doing is building a new set of clients to our services that have a different read model &#8211; this time, the read model is optimized for reporting. We build this warehouse one of two ways: where events exist that have the data we need, we listen to those events and build from them; where no events exist, we execute timed batch jobs that query APIs &#8211; no, not the underlying database, the API &#8211; and write the results into our warehouse.</p>
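<p>To make the two loading paths concrete, here is a minimal sketch in plain Scala. Everything in it is an assumption made for illustration: the <code>OrderPlaced</code> event, the <code>CrmApi</code> trait, the <code>CustomerSalesRow</code> shape and the in-memory <code>Warehouse</code> stand in for whatever events, service APIs and reporting tables a real system would have.</p>
<pre class="brush:scala">
// Hypothetical event published by the shopping-cart service.
case class OrderPlaced(customerId: String, orderId: String, totalCents: Long)

// Hypothetical reporting row, shaped for ad-hoc queries rather than for any service.
case class CustomerSalesRow(customerId: String, orderCount: Int, totalCents: Long)

// Stand-in for the CRM service's public API (never its database).
trait CrmApi {
  def customerIds: Seq[String]
}

// In-memory stand-in for the warehouse tables.
class Warehouse {
  private var rows = Map.empty[String, CustomerSalesRow]

  // Event path: fold each event into the reporting row for its customer.
  def onEvent(e: OrderPlaced): Unit = {
    val old = rows.getOrElse(e.customerId, CustomerSalesRow(e.customerId, 0, 0L))
    rows += (e.customerId -> old.copy(
      orderCount = old.orderCount + 1,
      totalCents = old.totalCents + e.totalCents))
  }

  // Batch path: a timed job asks the API for data that no event carries.
  def syncCustomers(api: CrmApi): Unit =
    api.customerIds.foreach { id =>
      if (!rows.contains(id)) rows += (id -> CustomerSalesRow(id, 0, 0L))
    }

  // What the ad-hoc reporting tool would read.
  def report: Seq[CustomerSalesRow] = rows.values.toSeq.sortBy(_.customerId)
}
</pre>
<p>The services themselves never see the <code>CustomerSalesRow</code> shape, so it can change as reporting needs change without touching production code.</p>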
<p>The warehouse is NOT the authoritative source of the original data, and that&#8217;s a good thing. It means we can produce a consistent and meaningful view of the data, which is not possible when reading directly from our production database (as it&#8217;s always changing and may be inconsistent at any given time). We also avoid any potential performance impact on the production system or, worse, locking tables with a read lock to get consistency and thereby blocking readers and writers in the production system. You can index the data warehouse to optimize reads, which might be exactly the opposite of what you want in your production data stores.</p>
<p>Often the data warehouse is done with a relational database, even if the production data stores are not. For one, this gives the freedom to choose the best persistence technology for the job when building services, which contributes to cost savings, but it also speaks to the relative maturity of tools for querying in an ad-hoc manner available for relational databases. Although some tools exist for doing the same in the post-relational world, they&#8217;re not as mature as their relational cousins.</p>
<p>Is this data warehouse technique some work? &#8211; yes, of course. Is it worth it? I believe so quite firmly, yes. This work can be done without impacting the services and features being built for production at all.</p>
<p>You get the advantage of exactly the schema you want for ad-hoc purposes, even if that schema is only loosely related to the entity designs that the source data services use. You don&#8217;t have any impact at all on the services that comprise your production system, either in design or in performance, at any time. You control and have visibility as to the amount of time, money and resources that are going in to supporting ad-hoc reporting, so you can make informed decisions on how much to spend, instead of hiding those costs where they&#8217;re just as real, but much harder to see.</p>
<p>Conclusion</p>
<p>SOA and EDA are not just the new shiny thing, the latest fad. They are a fundamentally different and better way of building systems.</p>
<p>They are techniques which are exactly opposed to the idea of shared databases, anemic domains, and externalized fossilized schemas. Attempting to shoehorn those attributes into a SOA/EDA architecture results in many, if not most, of the advantages of SOA/EDA being lost, and the resulting Frankenstein is simply bad software, costing more to develop and maintain than the advantages are worth.</p>
<p>If what you want is ad-hoc reporting capability, be honest about it, call it a feature, and build it into your system without compromising your design integrity and costing far more in the medium to long term. Anything else is simply unprofessional, in my opinion.</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1309</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tagging ScalaTest Tests and SBT</title>
		<link>http://php.jglobal.com/blog/?p=1305</link>
		<comments>http://php.jglobal.com/blog/?p=1305#comments</comments>
		<pubDate>Sun, 29 Jan 2012 19:57:52 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Scala]]></category>
		<category><![CDATA[Software Testing]]></category>
		<category><![CDATA[sbt]]></category>
		<category><![CDATA[scalatest]]></category>
		<category><![CDATA[xsbt]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1305</guid>
		<description><![CDATA[I want to share a trick for a combination I suspect a number of people are using: SBT and ScalaTest
When you&#8217;re writing tests, you may have occasion to segregate some tests from the others &#8211; perhaps your functional tests, integration tests, or ui tests, or just tests that run more slowly than the rest of [...]]]></description>
			<content:encoded><![CDATA[<p>I want to share a trick for a combination I suspect a number of people are using: SBT and ScalaTest.</p>
<p>When you&#8217;re writing tests, you may have occasion to segregate some tests from the others &#8211; perhaps your functional tests, integration tests, or UI tests, or just tests that run more slowly than the rest of your suite. </p>
<p>There are a number of ways to tackle this (one good way is with SBT sub-projects, which I&#8217;ll cover in another post), but another option is to &#8220;tag&#8221; your tests. The ScalaTest doc covers <a href="http://www.scalatest.org/user_guide/tagging_your_tests">tagging here</a>, but what&#8217;s not as clear is how you do this and still run your tests with SBT.</p>
<p>Let&#8217;s say you have tagged some tests with the tag &#8220;Ui&#8221;, so you&#8217;ve got test declarations that look like this:</p>
<p><code>describe("some thing I want to test") {<br />
  it ("should do a thing", Ui) {<br />
    ....<br />
  }<br />
}</code></p>
<p>I only want to run this test (and all other tests tagged &#8220;Ui&#8221;) in one particular job on my CI server.</p>
<p>What I need to do is pass the -l and -n options to ScalaTest, but I want to do this only when a certain system property is passed to my SBT instance. E.g. if I have a property &#8220;ui&#8221;, I can pass it to SBT with &#8220;-Dui=true&#8221;. How do I get this property to set the right option for ScalaTest so that when ui is true, ONLY my Ui-tagged tests run, and when ui is false, all the other tests EXCEPT the Ui-tagged tests run?</p>
<p>Here&#8217;s what I did in my build.sbt file:</p>
<p><code>testOptions in Test ++= (if (System.getProperty("ui", "false") == "false") Seq(Tests.Argument("-oDF"), Tests.Argument("-l"), Tests.Argument("Ui")) else Seq(Tests.Argument("-oDF"), Tests.Argument("-n"), Tests.Argument("Ui")))</code></p>
<p>What this says is that I always want the arguments &#8220;-oDF&#8221; passed to ScalaTest, but that when ui is false, I want &#8220;-l Ui&#8221; and when ui is true, I want &#8220;-n Ui&#8221;. I default ui to false, so if it&#8217;s not specified at all, that&#8217;s the same as false.</p>
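<p>The same decision can be written as a plain Scala function, which makes the two cases easier to see. This is only a restatement for illustration: <code>UiTestArgs</code> and <code>scalaTestArgs</code> are made-up names, not part of SBT or ScalaTest, and in a real build each string would still be wrapped in <code>Tests.Argument</code> as shown above.</p>
<pre class="brush:scala">
object UiTestArgs {
  // Hypothetical helper: maps the value of the "ui" system property to the
  // argument list handed to ScalaTest.
  def scalaTestArgs(ui: String): Seq[String] = {
    // -l excludes tests carrying the tag, -n runs only tests carrying the tag.
    val selector = if (ui == "false") "-l" else "-n"
    Seq("-oDF", selector, "Ui")
  }

  // Mirrors System.getProperty("ui", "false"): a missing property counts as "false".
  def fromSystemProperty: Seq[String] =
    scalaTestArgs(System.getProperty("ui", "false"))
}
</pre>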
<p>Now I can simply set a system property for my Jenkins builds, and the right tests are run at the right time.</p>
<p>I thought this might be helpful to others, as it took a bit of digging to figure out the exact syntax.</p>
<p>Note that this appears to work only with ScalaTest 1.7RC1 (or later, probably), and SBT 0.11.2 or later.</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1305</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What&#8217;s in your Pipeline?</title>
		<link>http://php.jglobal.com/blog/?p=1303</link>
		<comments>http://php.jglobal.com/blog/?p=1303#comments</comments>
		<pubDate>Mon, 21 Nov 2011 21:37:36 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1303</guid>
		<description><![CDATA[The teams I&#8217;m working with have recently had the chance to move towards a continuous delivery model, where deployments to production of our applications happen many times a day, potentially.
To do this safely, you of course need the confidence that your changes are not going to break production, so you need a set of comprehensive, [...]]]></description>
			<content:encoded><![CDATA[<p>The teams I&#8217;m working with have recently had the chance to move towards a continuous delivery model, where deployments to production of our applications happen many times a day, potentially.</p>
<p>To do this safely, you of course need the confidence that your changes are not going to break production, so you need a set of comprehensive, reliable and performant tests.</p>
<p>Organizing these tests into the proper &#8220;gauntlet&#8221; for the change to pass through has produced what we&#8217;re now calling the &#8220;development pipeline&#8221;, as it does kind of resemble a pipe with valves in it.</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1303</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Look Ma, No Container!</title>
		<link>http://php.jglobal.com/blog/?p=1299</link>
		<comments>http://php.jglobal.com/blog/?p=1299#comments</comments>
		<pubDate>Fri, 21 Oct 2011 19:12:38 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1299</guid>
		<description><![CDATA[Recently I&#8217;ve been working on a number of projects that deploy Scala web applications to Tomcat, our container of choice. 
Now, I&#8217;ve been using Tomcat for a long time, and it&#8217;s always done a pretty good job for me. One quirk that has always existed, in one way or another, was that it is occasionally [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I&#8217;ve been working on a number of projects that deploy Scala web applications to Tomcat, our container of choice. </p>
<p>Now, I&#8217;ve been using Tomcat for a long time, and it&#8217;s always done a pretty good job for me. One quirk that has always existed, in one way or another, was that it is occasionally necessary to restart Tomcat, no matter how well-behaved and non-leaking my applications are. You can make this quite rare with carefully tuned memory arguments to your JVM, especially including the options that allow garbage-collection and re-use of permgen space, but I&#8217;ve never been able to simply make it unnecessary to eventually restart Tomcat.</p>
<p>Sometimes, Tomcat will even go down &#8220;hard&#8221;, becoming unresponsive and unable to be killed through its normal admin process, which is even worse, as this makes automated redeployment unreliable and slow.</p>
<p>I don&#8217;t think any of this is particular to Tomcat, either &#8211; I&#8217;ve seen variations on the same kind of issues with a great many Java servlet containers over the years, to a lesser (Jetty) and greater (Weblogic) degree.</p>
<p>In the last while, I&#8217;ve been developing web service applications based on the excellent <a href="http://akka.io">Akka</a> framework, and <a href="http://spray.cc">Spray</a> on top of it. Spray has recently added a very thin container replacement called &#8220;Spray-Can&#8221;, which you can read about in detail on the Spray site. </p>
<p>In combination with the <a href="https://github.com/eed3si9n/sbt-assembly">SBT &#8220;assembly&#8221; plugin</a>, I&#8217;ve found it&#8217;s possible to deploy my apps in an entirely different way. Although there is, technically, still a container, it feels quite containerless in practice.</p>
<p>I build my application into a single executable JAR file, as opposed to the .war format so familiar to Java developers.</p>
<p>When this jar is run, my application becomes available on the configured port (8080 by default, but whatever I set it to, even at runtime).</p>
<p>This allows me to create a very simple script to keep my app running at all times, and to easily facilitate downtime-free upgrades, even on a single machine. Let me detail this&#8230;</p>
<p>My /etc/init.d script simply runs my executable jar file in a loop, something like this:</p>
<pre>
while true
do
  java -jar /apps/dir/myapp*.jar
done
</pre>
<p>Then I place the initial version, say 1.0, in the /apps/dir and run my script. Up pops my app.</p>
<p>Now I have another /etc/init.d script start a second copy of the app, either on the same machine on a different port or on a second machine entirely. (I&#8217;ve found that the lower memory requirements of this approach, compared with a full container, allow at least 3 copies on a single machine.) </p>
<p>Now I use something like Apache with mod_rewrite/mod_proxy, or balance or, my personal favorite, pound on the front end, so the concern of where my app is visible as far as URLs is entirely removed from the app itself. The application itself comes up on the context root &#8220;/&#8221;, on, say, port 8080 and 8081. Then my pound config simply redirects URLs of the form /url/to/app to one of these two ports, with failover/load balancing. My app neither knows nor cares it&#8217;s on the URL /url/to/app, that&#8217;s pound&#8217;s concern. It can even be behind HTTPS for all it cares &#8211; pound (or apache) takes care of that too.</p>
<p>Now when I want to deploy, I simply copy a new jar into the proper location in /apps/dir, e.g. myapp-2.0.jar.</p>
<p>Now I send a &#8220;poison pill&#8221; to my app in the form of a URL that causes it to shut down. My pound router filters out this call from outside the local network, so clients can&#8217;t send it, and I send it directly to the running app (e.g. on port 8080 or 8081). This causes the app to terminate and exit completely, cleaning up any connections it needs to on the way. The VM terminates, and the shell script immediately starts up the new version on the same port.</p>
<p>All this time the second node on 8081 is still going strong with the old version, of course, so the clients never notice the bump.</p>
<p>Now I do the same to node two. When I say &#8220;I&#8221; here, I&#8217;m of course referring to my trusty Jenkins server, which is doing all this for me automatically.</p>
<p>Now I have all of the advantages of my app, in a highly cluster-able lightweight fashion, without using a servlet container at all.</p>
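<p>For anyone who wants to see the &#8220;poison pill&#8221; idea in code, here is a rough sketch. It is not the Spray-Can setup described above: to keep it self-contained it uses the small HTTP server that ships with the JDK (<code>com.sun.net.httpserver</code>), and the <code>/admin/shutdown</code> path, the <code>PoisonPillApp</code> object and the <code>onShutdown</code> callback are all names invented for this example.</p>
<pre class="brush:scala">
import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}
import java.net.InetSocketAddress

object PoisonPillApp {
  // Starts an HTTP server exposing one hypothetical shutdown route.
  // onShutdown runs after the reply has been sent and the server stopped.
  def start(port: Int, onShutdown: () => Unit): HttpServer = {
    val server = HttpServer.create(new InetSocketAddress(port), 0)
    server.createContext("/admin/shutdown", new HttpHandler {
      def handle(exchange: HttpExchange): Unit = {
        val body = "shutting down".getBytes("UTF-8")
        exchange.sendResponseHeaders(200, body.length)
        exchange.getResponseBody.write(body)
        exchange.close()
        // Stop on a separate thread so the handler itself can return first.
        new Thread(new Runnable {
          def run(): Unit = {
            server.stop(0)
            onShutdown()
          }
        }).start()
      }
    })
    server.start()
    server
  }

  def main(args: Array[String]): Unit = {
    val port = if (args.nonEmpty) args(0).toInt else 8080
    // Exiting the JVM hands control back to the init.d loop, which starts the jar again.
    start(port, () => System.exit(0))
  }
}
</pre>
<p>Any cleanup the app needs (closing connections and so on) would go in the callback before the exit. The front-end router still has to block this path from outside the local network, exactly as described above.</p>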
<p>I used to do something similar with the lightweight Winstone server (which Jenkins itself uses, incidentally), but for Spray apps, the servlet API is something I don&#8217;t need, and I can run a much higher performance multi-core Akka stack instead, with all the same &#8220;containerless&#8221; advantages.</p>
<p>I recommend you have a look!</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1299</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Principles and Practices: Solving the Right Problem</title>
		<link>http://php.jglobal.com/blog/?p=1198</link>
		<comments>http://php.jglobal.com/blog/?p=1198#comments</comments>
		<pubDate>Fri, 01 Jul 2011 15:54:49 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1198</guid>
		<description><![CDATA[This is a further post in the series I started about the Principles and Practices of Software Craftsmen.
In this post I&#8217;d like to explore a specific principle that flows pretty readily into a practice that I think is very common among software craftsmen: 
Solving the Right Problem
Craftsmen often have a somewhat different perspective on what [...]]]></description>
			<content:encoded><![CDATA[<p>This is a further post in <a href="http://php.jglobal.com/blog/?p=1037">the series I started</a> about the Principles and Practices of Software Craftsmen.</p>
<p>In this post I&#8217;d like to explore a specific principle that flows pretty readily into a practice that I think is very common among software craftsmen: </p>
<p><b>Solving the Right Problem</b><br />
Craftsmen often have a somewhat different perspective on what they do as developers. They understand enough of how IT fits into the business picture to be aware of what is really needed when given a project or even a single story. </p>
<p>Part of the value of an experienced professional in software development is their ability to not only perform the exact task they&#8217;re set, but also to help determine if the problem is being solved in the right way. Feature definition is not just a business analyst or customer proxy rattling off a list of acceptance criteria, it&#8217;s a give-and-take negotiation. Part of that negotiation is cost-based &#8211; the developer can give detailed estimates on specific parts of a story, and the customer proxy can decide to change parts of the story to maximize the value they get for the effort expended.</p>
<p>In addition, however, the experienced craftsman has to have a context of the overall project, and the business advantage that is expected from the development effort. He can then blend this with the technical understanding of *how* it&#8217;s going to be done to help figure out if there is a better (e.g. faster, easier, higher quality, more maintainable) way to accomplish the same thing.</p>
<p>Even beyond finding a better way, the true craftsman can determine if the problem being addressed is even the right problem.</p>
<p>Generally, the user has a good idea as to what they want to accomplish, but not necessarily how they want to accomplish it, or how the features for accomplishing it might be turned into actual working software. This is where your craft comes into play, helping to “translate” the user&#8217;s desires into actual requirements that make the best sense given the context of the existing system and of what&#8217;s possible.</p>
<p>Sufficient experience often allows a craftsman to look at the requirements a user is putting forward and to understand the underlying need for functionality. This may be very different from what you&#8217;re actually being told, and again experience is the guide that lets the craftsman steer the user from what they think they want to what they actually want. This is not at all the same as telling the user that he&#8217;s “wrong”. It&#8217;s more a matter of technique: being able to communicate with the user at their own level, and helping them explore the problem at hand more thoroughly in the context of what&#8217;s possible with software and automation, to arrive at a superior definition of the problem to be solved.</p>
<p>This is what I like to call “solving the right problems”, and it is generally a core principle of software craftsmanship.</p>
<p><b>Don&#8217;t Grab the Wheel</b></p>
<p>One way to block this process, and one which I have seen happen frequently in actual practice, is for the customer, customer proxy, or other management, to oversteer technical choices that are not entirely within their domain.</p>
<p>Sometimes this is a trust issue, in the sense that management may not trust the developers, in this case the craftsman, to make the correct choices. Sometimes it is out of a misguided sense of fear, which manifests itself in a desire to have influence over choices that management feels would be “safer”. Frequently, this means selecting and &#8220;standardizing&#8221; on obsolete, unsuitable, and unproductive technologies and techniques. </p>
<p>The irony, of course, is that in the process of specifying these unsuitable technologies and techniques, the overall risk in the project goes up dramatically.</p>
<p>Software is unique and a bit strange in that it&#8217;s the one industry where the customer generally tells the vendor not only what problem they want solved, but how to solve it, and frequently even what tools to solve it with.</p>
<p>It is somewhat like going to your doctor with a tummyache and telling him &#8220;I want my arm amputated, and I want you to use this spoon to do it with&#8221;. Oh, and I want it immediately and I want to pay 4c for the service, and it better not hurt. Or going to a master carpenter and telling him &#8220;here&#8217;s a chunk of pine, make me a mahogany bannister for my staircase, and here&#8217;s a blunt butterknife to do the carving with&#8221;. While both of these sound absurd, they are no less absurd than hiring a software craftsman and telling him he must develop on a laptop using PHP and MS-Access and deploy on a five-year-old Windows 95 box with insufficient memory. (Not knocking any specific technology here, just pointing out that it&#8217;s bad to not make use of your developers&#8217; expertise).</p>
<p>To get the best out of your craftsman, let him (or her, as I keep pointing out) do the driving. There may be necessary constraints of course, so explain them &#8211; but don&#8217;t grab the wheel out of his hands.</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1198</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Extending ScalatraFunSuite</title>
		<link>http://php.jglobal.com/blog/?p=1233</link>
		<comments>http://php.jglobal.com/blog/?p=1233#comments</comments>
		<pubDate>Thu, 28 Apr 2011 23:36:32 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1233</guid>
		<description><![CDATA[I&#8217;ll keep this one very short and to the point: I&#8217;ve been working extensively with Scalatra and Scalate recently, and it&#8217;s been great. My only minor problem has been with the bundles testing capabilities with Scalatra, namely the ScalatraFunSuite.
It lets you very easily test your Scalatra filters or servlets, but the number of things you [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ll keep this one very short and to the point: I&#8217;ve been working extensively with Scalatra and Scalate recently, and it&#8217;s been great. My only minor problem has been with the bundled testing capabilities of Scalatra, namely the ScalatraFunSuite.</p>
<p>It lets you very easily test your Scalatra filters or servlets, but the number of things you can assert on is a bit limited. You can check status (to ensure you get a 200 instead of a 404 or a 500, for instance), and you can assert things about the contents of the response body, such as if it contains an expected string. All well and good, but not enough for my test-driven proclivities.</p>
<p>So I did a bit of tinkering, and came up with a nice clean way to grab things such as cookies, response values being passed to the template, the name of the Scalate template being called, and so forth.</p>
<p>I want to be able to write a test like this:</p>
<pre class="brush:scala">
test("edit the organization data") {
  get(ORGANIZATION_URL) {
    status should equal(200)
    assert(template.endsWith(OrganizationController.ORGANIZATION_TEMPLATE))
  }
}
</pre>
<p>For instance, where I assert that the template that got called was a certain template that I expect.</p>
<p>It turns out this isn&#8217;t very hard &#8211; just stick another Filter in the chain when you set up your test, like so:</p>
<pre class="brush:scala">
   addFilter(classOf[TestFilter], "/*")
   addFilter(new CommonFilter, "/*")
</pre>
<p>And yes, I am using a different method invocation on the second line &#8211; I do this intentionally, so I can control creation of the servlet class under test, as I use Guice-servlet to inject all its service dependencies, and at test-time, I do this with an implicit parameter &#8211; but that&#8217;s a whole &#8216;nother post <img src='http://php.jglobal.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Anyway, TestFilter, as you might imagine, gives me &#8220;hooks&#8221; that I can assert on later. It looks like this:</p>
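<pre class="brush:scala">
import javax.servlet.{Filter, FilterChain, FilterConfig, ServletRequest, ServletResponse}
import javax.servlet.http.{Cookie, HttpServletRequest, HttpServletResponse, HttpServletResponseWrapper}
import scala.collection.mutable.ListBuffer

// Shared, mutable "hooks" that a test can assert on after each request.
object TestFilter {
  var lastURI: String = null
  var values: Map[String, AnyRef] = Map()
  // Name of the first Scalate template recorded in the request attributes.
  def template = values("scalateTemplates").asInstanceOf[List[String]].head
  val cookies = ListBuffer[Cookie]()
}

class TestFilter extends Filter {

  def destroy() {}

  def init(conf: FilterConfig) {}

  override def doFilter(request: ServletRequest, response: ServletResponse, filterChain: FilterChain) {
    // Wrap the response so cookies added further down the chain are recorded.
    val wrappedResponse = new ResponseWrapper(response.asInstanceOf[HttpServletResponse])

    val httpRequest = request.asInstanceOf[HttpServletRequest]
    TestFilter.lastURI = httpRequest.getRequestURI
    filterChain.doFilter(request, wrappedResponse)
    // Once the code under test has run, copy every request attribute.
    val names = httpRequest.getAttributeNames
    while(names.hasMoreElements) {
      val name = names.nextElement.asInstanceOf[String]
      val value = httpRequest.getAttribute(name)
      println(name + ":" + value) // debug output while the tests run
      TestFilter.values += (name -> value)
    }
  }

  class ResponseWrapper(val response: HttpServletResponse) extends HttpServletResponseWrapper(response) {
    override def addCookie(cookie: Cookie) {
      super.addCookie(cookie)
      TestFilter.cookies += cookie
    }
  }
}
</pre>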
<p>That&#8217;s all there is to it &#8211; now I can call my servlet under test just like in the first code snippet, and assert on values being sent to the template, on the name of the template, on cookies, and so forth.</p>
<p>That&#8217;s it! Hope this turns out to be useful to some other Scalatra aficionados out there!</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1233</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What&#8217;s in your Stack?</title>
		<link>http://php.jglobal.com/blog/?p=1215</link>
		<comments>http://php.jglobal.com/blog/?p=1215#comments</comments>
		<pubDate>Thu, 31 Mar 2011 14:15:19 +0000</pubDate>
		<dc:creator>Mike</dc:creator>
				<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://php.jglobal.com/blog/?p=1215</guid>
		<description><![CDATA[As most of the readers here probably already know, the term &#8220;stack&#8221; when it comes to app development is typically meant to describe the entire software structure on which the application is built. For instance, LAMP is a common example, standing for Linux, Apache, MySQL and PHP.
Developers are often tweaking and re-selecting pieces of their [...]]]></description>
			<content:encoded><![CDATA[<p>As most of the readers here probably already know, the term &#8220;stack&#8221; when it comes to app development is typically meant to describe the entire software structure on which the application is built. For instance, LAMP is a common example, standing for Linux, Apache, MySQL and PHP.</p>
<p>Developers are often tweaking and re-selecting pieces of their stack &#8211; the newest thing comes along, we like to try it out. But usually we&#8217;ve got a &#8220;go-to&#8221; stack for getting real work done, and constrain our experiments to a separate area until we&#8217;re sure something has what it takes to get included in our favorite stack.</p>
<p>There are usually two major types of components in our stack &#8211; the bits we actually deploy, and the bits we use to actually develop with &#8211; the latter includes pieces we usually don&#8217;t actually ship with the finished product.</p>
<p>For instance, a classic J2EE stack might consist of the &#8220;deployable&#8221; pieces: Oracle&#8217;s standard Java EE 6, Hibernate, JSPs (technically a part of J2EE in any case), and Tomcat as a servlet container.</p>
<p>And the development stack might include Eclipse, Maven, and various plugins to make Maven do our bidding.</p>
<p>There are a number of attributes I wanted in my own personal favored stack. These included:</p>
<p><strong>Easy management of dependencies:</strong> I like knowing and controlling exactly what bits go into my finished application, and unless I can control all of the dependencies accurately, I can&#8217;t achieve acceptable quality.</p>
<p><strong>Fast cycle-time:</strong> I want a stack that provides me with a rapid cycle time during development. That is, I want the time between making an edit and seeing the results, in either an application or a test, to be as short as possible.</p>
<p><strong>Easy testability:</strong> I need my stack to facilitate testing, as my normal mode of development is BDD/TDD, where I write code because a test tells me to, by and large. I want my testing tools to support my choice of test frameworks as well.</p>
<p><strong>Modern language support:</strong> My choice of language for my own development is <a href="http://www.scala-lang.org/">Scala</a>, so I need a stack that supports Scala well. I won&#8217;t go into my reasons for choosing Scala; that&#8217;s a topic I&#8217;ve covered <a href="http://php.jglobal.com/blog/?p=1100">in another blog post.</a></p>
<p>So, without further delay, here&#8217;s my favored stack. First, the deployable bits:</p>
<p><strong>Scala:</strong> I&#8217;m currently using Scala 2.8.1. Scala allows me to use a REPL for easy experimentation, which helps with my fast cycle-time requirement as well. The 2.9 release of Scala makes the REPL even more capable, but I haven&#8217;t tinkered with it much yet. I&#8217;ve talked at length about why I chose Scala in other posts, but it&#8217;s not for lack of knowing, and having seriously tried, other languages and environments. I can&#8217;t say I&#8217;ve tried them all, but I&#8217;ve tried quite a lot, and Scala remains my top choice.</p>
<p><strong>Scalatra: </strong><a href="https://github.com/scalatra/scalatra">Scalatra</a> is a lightweight non-intrusive web app framework that lets me create web applications quickly, easily, and with maximum flexibility. I can write both RIA-style applications with REST/AJAX/COMET functionality or non-RIA request/response style applications, while maintaining statelessness and supporting scalability.</p>
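<p>To give a feel for how lightweight this is, here&#8217;s a minimal sketch of a Scalatra servlet. The class name and the route are made up for illustration, and it assumes the Scalatra jar and a servlet container are on the classpath:</p>
<pre><code>import org.scalatra.ScalatraServlet

// Hypothetical example servlet: the name and route are illustrative only.
class HelloServlet extends ScalatraServlet {

  // Handles GET /hello/&lt;name&gt;; the value of the block becomes the response body.
  get("/hello/:name") {
    "Hello, " + params("name")
  }
}
</code></pre>
<p>Nothing here keeps any state between requests, which is a big part of why this style scales out so easily.</p>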
<p><strong>Scalate:</strong> With strong integration with Scalatra, <a href="http://scalate.fusesource.org/">Scalate</a> is a powerful template system, perfect for generating XHTML for web applications in an extremely DRY and expressive manner. It&#8217;s like an up-to-date version of Velocity, but with Scala goodness and much more powerful.</p>
<p><strong>SBT:</strong> A long-time Maven aficionado, I was hard-pressed to consider a different build system. I was, however, honest about Maven&#8217;s shortcomings, of which there are quite a few. I tried SBT and went back to Maven several times. I was intrigued enough with the advantages to come back again for another try, though, and I&#8217;ve now mastered it enough, and am getting enough benefits, that I won&#8217;t switch back.</p>
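<p>As a rough sketch of what dependency management looks like, here&#8217;s the kind of project definition SBT 0.7.x uses for a web project. I&#8217;m assuming the 0.7-style build here, and the project name and version numbers are placeholders rather than recommendations:</p>
<pre><code>import sbt._

// Hypothetical project definition (lives under project/build/ in SBT 0.7.x).
class StackProject(info: ProjectInfo) extends DefaultWebProject(info) {

  // %% appends the Scala version to the artifact name; the version is a placeholder.
  val scalatra = "org.scalatra" %% "scalatra" % "2.0.0.M3"

  // Jetty is only needed in the "test" scope, for running the webapp during development.
  val jetty = "org.mortbay.jetty" % "jetty" % "6.1.22" % "test"
}
</code></pre>
<p>Every dependency is an ordinary Scala value, so what goes into the build is stated explicitly in one place.</p>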
<p><strong>Casbah:</strong> Casbah is the &#8220;missing link&#8221; between the Java MongoDB drivers and Scala. It allows database objects for Mongo to be readily constructed with Scala&#8217;s expressive map syntax, and provides type-safe retrieval of fields, not to mention its advanced query support.</p>
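<p>Here&#8217;s a small sketch of what that looks like in practice. The field names and values are invented for illustration, and it assumes Casbah is on the classpath (for example, pasted into the REPL):</p>
<pre><code>import com.mongodb.casbah.Imports._

// Build a Mongo document using Scala's key -&gt; value map syntax.
// The fields here are illustrative only.
val post = MongoDBObject("title" -&gt; "What's in your Stack?", "author" -&gt; "Mike")

// Type-safe retrieval: an Option[String] rather than a bare Object.
val title: Option[String] = post.getAs[String]("title")

// A query object built with the query DSL: documents with more than zero comments.
val hasComments = "comments" $gt 0
</code></pre>
<p>A field that is missing comes back as None instead of a null, which fits nicely with the rest of Scala.</p>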
<p><strong>Squeryl:</strong> On the occasion I need to interact with relational databases, <a href="http://squeryl.org/">Squeryl</a> has become my tool of choice &#8211; but I&#8217;ve also dabbled with a few alternatives. One alternative I won&#8217;t go back to under any circumstances is Hibernate, or even worse, Entity EJBs.</p>
<p><strong>OpenJDK:</strong> OpenJDK is a capable and independent JVM that supports Scala well.</p>
<p><strong>MongoDB:</strong> I&#8217;ve used relational databases for several decades, but now that I&#8217;ve worked with key/value systems such as Mongo, I seldom find a job that an RDBMS is better suited for.</p>
<p><strong>Winstone:</strong> The <a href="http://winstone.sourceforge.net/">Winstone</a> servlet container allows me to create a single simple executable jar that incorporates both my app and the servlet container necessary for it to run in a single super-lightweight package that is easy to deploy and manage. Given its light footprint, it allows me to run multiple servers on a single system and provide a cluster &#8220;in a box&#8221;.</p>
<p><strong>Linux:</strong> For deployment, I choose Linux. I&#8217;ve never found a business situation that wasn&#8217;t better served by Linux than by Windows, ever. Enough said.</p>
<p>Now the development stack: </p>
<p><strong>IntelliJ IDEA:</strong> IDE support for Scala is not fantastic yet, but it&#8217;s quite good and getting better. The best of the lot is IntelliJ IDEA with its Scala plugin. It&#8217;s not perfect, but it&#8217;s quite capable. I will admit I frequently find myself tempted to use TextMate, Kod, or, more recently, Sublime instead, but the completion and refactoring tools keep me sticking with IntelliJ.</p>
<p><strong>SBT &#038; JRebel:</strong> <a href="http://code.google.com/p/simple-build-tool/">SBT</a>&#8217;s own support for continuous deployment and testing is excellent, but when combined with the (free for Scala) JRebel tool, it&#8217;s downright unbeatable. I can work on a webapp, make code or UI changes, flip to my browser and hit refresh and see the result. I can tell SBT to keep running my tests every time they need to run, and immediately see any breaks as they occur.</p>
<p><strong>Jetty:</strong> SBT uses a built-in Jetty servlet container to auto-deploy webapps for continuous development, so Jetty is part of my stack as well.</p>
<p><strong>Mac OS X:</strong> For development, I work on a Mac system much of the time. I do have a desktop Linux system, but the impressive portability and power of my MacBook Air is hard to compete with.</p>
<p>The above are my current weapons of choice for web application development. So, what&#8217;s in YOUR stack, and why?</p>
]]></content:encoded>
			<wfw:commentRss>http://php.jglobal.com/blog/?feed=rss2&amp;p=1215</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

