Comments on: Random acts of testing
http://amazing-development.com/archives/2007/08/22/random-acts-of-testing/
ruby, java and the rest

By: jkonglat (Wed, 22 Aug 2007 16:19:39 +0000)
Or, at the very least, log the random generator seed so that if a test fails you can recreate the failure.
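A minimal sketch of the seed-logging idea (the `TEST_SEED` variable name is made up for illustration): pick a seed, or take one from the environment when replaying a failure, log it, and use it to initialize the generator.

```ruby
# Use a supplied seed to replay a failure, otherwise generate a fresh one.
# TEST_SEED is a hypothetical environment variable name.
seed = (ENV["TEST_SEED"] || Random.new_seed).to_i
puts "Using random seed: #{seed}"

rng = Random.new(seed)

# Every value drawn from rng is now reproducible by re-running the
# suite with TEST_SEED=<logged seed>.
values = Array.new(5) { rng.rand(100) }
puts values.inspect
```

Re-running with the same `TEST_SEED` yields the same sequence of values, so a failing random test can be replayed exactly.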

By: Payton Quackenbush (Wed, 22 Aug 2007 13:13:53 +0000)
Yep, that's the first requirement for reproducibility.
I haven't read the book to know what is covered, but here are a few points from my experience:
1. Changing any ordering in a random test will change your randomness. The only way to guard against this is to set a functional coverage point, so that you know the random stimulus hit the scenario you are trying to repeat. The corollary is that threaded random tests are virtually impossible to repeat unless you are using non-interruptible (i.e. cooperative) threads.
2. Random testing needs to be correlated with code coverage or functional coverage, so it can be measured. In other words, you need to know that your random stimulus is hitting all the potential cases. To be even more rigorous, start with zero coverage, run your random (or saved-off random) test suite, and then see what percentage of coverage you hit (hopefully 100%).
2.1. In utopia, your random stimulus generator would get feedback based on the current coverage percentage and dynamically tweak its randomization to target the missing coverage pieces. I've never seen this done, but you can always wish.
3. Depending on how your random stimulus works, you may be creating stimulus that is impossible in the real world, so getting your code to interpret it correctly may be a waste of time. As a corollary, debugging random stimulus failures is usually much harder than debugging directed tests, because randomization generates very bizarre cases. However, this is often a good thing: human-created directed tests tend to be boring and repetitive, and usually do not yield many bugs.
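The feedback loop described in 2.1 can be sketched as a toy generator (all class and method names here are made up for illustration): each coverage bucket counts how often random stimulus hit it, and the generator re-weights its choices toward buckets that are still empty.

```ruby
# Toy coverage-feedback generator: prefers stimulus categories
# ("buckets") that have not been hit yet, then falls back to uniform
# random choice once everything has been covered at least once.
class FeedbackGenerator
  def initialize(buckets, rng: Random.new)
    @hits = buckets.map { |b| [b, 0] }.to_h
    @rng  = rng
  end

  # Pick the next stimulus category, biased toward uncovered buckets.
  def next_bucket
    missing = @hits.select { |_, n| n.zero? }.keys
    pool    = missing.empty? ? @hits.keys : missing
    choice  = pool[@rng.rand(pool.size)]
    @hits[choice] += 1
    choice
  end

  # Fraction of buckets hit at least once.
  def coverage
    @hits.count { |_, n| n > 0 } / @hits.size.to_f
  end
end

gen = FeedbackGenerator.new([:small, :large, :empty, :unicode])
4.times { gen.next_bucket }
puts gen.coverage  # all four buckets hit after four draws
```

Because uncovered buckets are always preferred, this sketch reaches full bucket coverage in exactly as many draws as there are buckets; a real implementation would feed actual code or functional coverage data back into the stimulus constraints instead of a fixed bucket list.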
