Jay A. Patel

This is me thinking out loud. Kindly pardon the noise.

Exploring Property-based Testing

Prerequisite: If you haven’t watched a presentation on property-based testing by John Hughes, please watch one first – it does a great job of explaining both (i) what property-based testing is, and (ii) the power (nay, magic!) of property-based testing in finding hard-to-discover bugs.

Reid Draper, the primary author of Clojure’s test.check, describes the core idea behind property-based testing:

Instead of enumerating expected input and output for unit tests, you write properties about your function that should hold true for all inputs. This lets you write concise, powerful tests.

If I had to summarize my understanding of property-based testing frameworks, it would be as follows:

  • Input data is randomly generated using composable and extensible generators.
  • Output data is validated against invariant properties of the target function.
  • A user-controlled knob can limit how extensive the testing needs to be, i.e., how many times to test the function with randomly generated data – hundreds of times, thousands of times, millions of times, etc.
  • The framework is intelligent enough to generate simple test cases first, and gradually grow the complexity of input data as early tests pass.
  • When an output violates an invariant property, the framework tries to shrink the failing input to a minimal input set that still violates the property. Often, the shrunk input directly pinpoints the root cause of the violation (a short sketch of shrinking in action follows the quote below). When this works, one may be reminded of Arthur C. Clarke’s famous “law”:

Any sufficiently advanced technology is indistinguishable from magic.
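To make the bullets above concrete, here is a minimal, self-contained test.check sketch (not from the kata repository; the property is deliberately false so that shrinking has something to do):

(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; A deliberately false property: "every vector of integers sums to less than 100."
(def sums-below-100
  (prop/for-all [v (gen/vector gen/int)]
                (< (reduce + v) 100)))

;; quick-check generates inputs, finds a counterexample, and shrinks it --
;; typically down to something as small as [100].
(tc/quick-check 1000 sums-below-100)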

My Initial Take

I have never used a formal property-based testing framework before, though I recall authoring unit tests that would validate results using randomly constructed input. The idea of using generated input seems fairly natural in the realm of testing and is likely used in many places in an ad hoc manner, i.e., without the use of composable generators. However, I have not authored any tests that attempt to shrink the input.

As a person just learning about property-based testing, I have some doubts about (i) the efficacy vs. efficiency (“bang for the buck”) and (ii) the wide applicability of property-based (only) testing.

Firstly, on efficacy vs. efficiency: from a developer’s perspective, classical unit testing (i.e., manually encoding the expected output for a given input) is easy to formulate (and grasp), as it directly validates the execution of a function. From a computational perspective, classical unit testing can also hit key corner cases in a minimal number of tests, whereas test cases based on randomly generated input data will reach corner cases in a computationally inefficient manner.

Secondly, and more importantly, it is not obvious to me that one can provide tightly-bound invariant properties for every function needing test coverage. Tightly-bound invariants are required because loosely-bound invariants allow a faulty function to pass the tests. If we are unable to provide a set of tightly-bound invariants, property-based testing needs to be augmented with output-expectation unit tests.

Defining Tightly-bound Invariants

Connor Mendenhall presents an excellent introduction to Clojure’s property-based testing framework test.check in his blog article Check Your Work. If you have not heard of or used test.check, you may gain a lot of value by reading that article before continuing on.

In the article, Connor also expresses his concerns about the wide applicability of property-based testing. He describes his experience with using the test.check framework to validate his solution to the Making Change “kata”:

After writing a few Simple-check [now renamed test.check] tests, I was stuck. I knew the specific case I was trying to describe—changing 80 cents with denominations of 25, 20, and 1 should return four 20 cent coins instead of three quarters and five pennies—but I was left scratching my head when I tried to generalize it.

I attempt to provide a solution to the problem of writing generalized test specs for the Making Change kata next.

Synopsis

A solution to the “Making Change” kata along with tightly-bound invariants for property-based tests is available on my GitHub.

Details

I describe the key parts of the test coverage next. However, please refer to core_test.clj for complete implementation details.

Firstly, let’s cover the test spec to validate correctness of change, i.e., no loss of value:

;; Change must add up -- or else, *gasp!* -- the function is committing fraud!
(defspec correct-change
         (prop/for-all [coinset gen-coinset
                        amount gen/s-pos-int]
                       (let [change (make-change coinset amount)]
                         (= amount (apply + change)))))

In the correct-change test spec above, gen-coinset is a generator that yields a random, non-empty set of integers (“coins”) each time it is invoked. The coinset is one such instance. Similarly, amount is an instance of a randomly generated positive integer. The invariant states that the change supplied by the make-change function must add up to the original amount.
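The real generator lives in core_test.clj; for a feel of how such a generator is composed, here is one way gen-coinset could be written (treat this as a sketch rather than the exact definition):

;; A non-empty set of distinct positive integers ("coins").
(def gen-coinset
  (gen/not-empty (gen/set gen/s-pos-int)))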

Note that the correct-change spec only makes sure the change adds up to the right amount; it does not check minimality. Omitting either of the test specs below would leave the invariants only loosely bound, allowing a faulty function (for instance, one that simply returns n 1-cent coins) to pass the test.

Next, let’s discuss the test spec to validate the minimality of the simplest base case:

;; If the amount is one of the coins, that change must be a single coin
(defspec smallest-change
         (prop/for-all [[coinset amount] gen-amount-from-coinset]
                       (let [change (make-change coinset amount)]
                         (= 1 (count change)))))

In the smallest-change test spec, gen-amount-from-coinset is a generator that yields a random 2-tuple consisting of: (i) a non-empty set of coins, and (ii) an amount equal in value to some coin in the set. For this type of input, the invariant states that the minimal change should be exactly one coin.
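Again, the actual definition is in core_test.clj; a sketch of such a 2-tuple generator, built by composing the gen-coinset sketch above with gen/bind, might look like this:

;; Pick a coinset, then pick an amount equal to one of its coins.
(def gen-amount-from-coinset
  (gen/bind gen-coinset
            (fn [coinset]
              (gen/tuple (gen/return coinset)
                         (gen/elements coinset)))))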

Next, we build a more “sophisticated” spec that exploits the base case along with an additional property:

;; The number of coins to change amount1+amount2 together must be no more than
;; the number of coins combined from changing amount1 and amount2 separately.
(defspec combined-change
         (prop/for-all [coinset gen-coinset
                        amount1 gen/s-pos-int
                        amount2 gen/s-pos-int]
                       ;; NOTE: changing dp-change to greedy-change fails spec
                       (let [f (partial make-change dp-change coinset)
                             change-separate (concat (f amount1) (f amount2))
                             change-together (f (+ amount1 amount2))]
                         (<= (count change-together) (count change-separate)))))

Combining the last two specs, using a coinset of 25-, 20-, and 1-cent coins (the same one used by Connor in his article), we can provide change for 40 cents using two 20-cent coins. Similarly, we can provide change for 80 cents using four 20-cent coins. A greedy solution that uses three 25-cent coins and five 1-cent coins (eight coins in total) would fail this requirement, since the specs force the target function to provide change for 80 cents using no more than four coins.
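To spell out that chain of reasoning for the #{25 20 1} coinset (writing change as shorthand for calling make-change with that coinset):

;; smallest-change:  20 is a coin, so (change 20) must be exactly one coin, e.g. [20]
;; combined-change:  (count (change 40)) <= (count (change 20)) + (count (change 20)) = 2
;; combined-change:  (count (change 80)) <= (count (change 40)) + (count (change 40)) <= 4
;; So any implementation passing both specs must change 80 cents with at most four
;; coins, ruling out the greedy answer of three quarters and five pennies (8 coins).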

Conclusion

Many functions can be tightly bound using a set of invariants. However, the key is to apply higher-order thinking than we are used to with classical unit testing.

My concerns about the wide applicability of property-based tests are not entirely eliminated. Nevertheless, with a single short exercise, the universe of non-applicability has shrunk dramatically. I still need to use property-based tests in production code to get a better feel for their applicability and efficiency.

For anyone reading this: I would love to know if there is a reason that prevents the adoption of property-based (only) tests for a project. Please don’t hesitate to get in touch to discuss.

Don’t Drop the Ring

If you are a Clojure newbie (“noob”) like myself, you may run into this issue. I am only writing this since it took me a lot longer to resolve this “issue” than it should have. My hope in writing this article is to help future noobs Googling for a solution to the same problem.

As it happens, I am currently developing a web app using compojure on top of ring. As I had done a couple times before, I made an uberjar to deploy the web app as a standalone server:

$ lein uberjar

When I ran the jar, I was expecting the webapp to start. Instead, I got a Clojure REPL prompt. Yes, a Clojure REPL prompt instead of my webapp deploying. No Jetty. No web server. Nope. Just a REPL prompt, as follows:

$ java -jar webapp-0.1.0-standalone.jar
Clojure 1.6.0
user=>

I was perplexed. I tried to investigate the changes I made. Using git, I even went back to a prior commit which I knew worked. But I was unable to get the webapp to start. What the &$#!@ was going on?

So, I unzipped the jar file and took a look at the META-INF/MANIFEST.MF file. The Main-Class value was clojure.main. Grr!

Long story short: I traced back my bash history and noticed that I had dropped the ring! Of course, the command to generate a deployable webapp is the following:

$ lein ring uberjar
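For reference, the lein-ring plugin reads the handler from project.clj and generates the appropriate main class for the uberjar. A minimal sketch of such a project.clj (the names and versions here are illustrative assumptions, not my actual project):

(defproject webapp "0.1.0"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [compojure "1.1.8"]]
  ;; The lein-ring plugin provides `lein ring uberjar`, which wires up
  ;; the :handler below as the entry point of the standalone jar.
  :plugins [[lein-ring "0.8.11"]]
  :ring {:handler webapp.handler/app})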

So, fellow Clojure noobs, the lesson is: don’t drop the ring!

On Emotional Fluctuations

Over the weekend, I realized I was far too happy due to the outcome of certain NFL games. In the grand scheme of things, the outcome of NFL games should not affect me. Why was I so happy?

Over the weekend, I found myself getting excited about the prospects of my fantasy football team. And why not? Antonio Brown of the Pittsburgh Steelers was having a breakout game. As the game progressed, I was catching up to my opponent. And by the time the game was over, Brown had racked up nearly two hundred yards and caught two touchdowns. I had also managed to grab a small 5-point lead in my head-to-head match-up. I was a bit giddy with excitement, as my fantasy season had started 0-2. This was my chance to get my first win.

However, we still had a player each to go during Monday night’s game between the Denver Broncos and the Oakland Raiders. I had Knowshon Moreno, the Broncos running back and my opponent had Matt Prater, the Broncos kicker. I was feeling confident about a victory. Moreno had a big game the week before and was establishing himself as the primary running back for the Broncos. A single touchdown from him would seal the victory.

So happy.

What actually happened Monday night was not according to plan. Moreno didn’t get a touchdown. In fact, he was only used lightly, with a majority of the touches going to the other running backs. He was not getting the yards he was supposed to. Further, the Broncos offense was potent enough to get into kicking range, but failed to convert the progress into touchdowns. As a result, Matt Prater had a huge game.

I lost. Oh, so sad.

After the euphoria of seeing my fantasy team heading towards its first victory of the year, I realized how emotionally charged I was over something that should not really impact me.

I need to learn to better handle the outcome of things that are not under my control.

On Delivering Customer Value

I am presently working on a pre-1.0 software product at VMware. It’s an exciting time: rapid product development based on customer feedback, no support requests from customer issues in production, tremendous freedom as a standalone R&D organization, no legacy and/or compatibility issues to worry about, hopes as high as the Burj Khalifa, etc.

However, as a business, VMware obviously has plans to make revenue from our efforts. As we get close to the release milestone, some of our team members are curious about upper management’s revenue expectations for our 1.0 product.

On being asked this question, the manager responsible for our product responded with the following quip:

Focus on creating the best possible customer experience and customer value, and all the money issues will just fall into place.

I think it rings true, whether you are a startup or an established player trying to grow in a new area. It’s that simple.

On the Importance of Shipping

Earlier today, our team was discussing a newly implemented feature during our bi-weekly sprint review meeting. The feature introduced a configuration knob that end users may adjust frequently. This knob was built atop our existing configuration management system, which required the software to be restarted for any change to take effect. This had been a known problem; however, since prior configuration knobs were adjusted infrequently, we had avoided fixing the root cause. Unfortunately, the new configuration knob brought this limitation to the forefront.

Our team manager proclaimed that the restart limitation was “embarrassingly bad.” And, indeed, it was. However, before anyone could delve any further on this, a colleague aptly responded with the following lean development mantra:

If you are not embarrassed by the first version of your product, you’ve launched too late.

Reid Hoffman, LinkedIn founder

Without any further discussion, everyone realized that it was acceptable – maybe even expected – to be embarrassed by our first release.

On this occasion, at least, we avoided feature creep.

Hello, World!

Finally set up a Markdown-based blog (powered by Octopress).

I have never been a prolific blogger, so I don’t have high hopes of being one now. Nevertheless, one can dream.