bartj;12586 wrote:
Is it only important that the test case sources return the same number of tests between executions, or do the test names also need to be identical? How about the ordering of the results from the test case source?
The ordering of the tests is the critical point here. If, for example, a test case appeared only intermittently but was always the last one in the suite, the scope of the problem would be greatly reduced.
When NCrunch invokes NUnit to discover tests and your log verbosity is set to Detailed, it will dump a large block of XML into the log containing the full output of NUnit's discovery step.
If you search through several instances of this XML, you'll likely find two dumps containing different data. You can then run them through a comparison tool (e.g. KDiff3) to see how they differ. This should highlight the unstable test cases and make the problem much easier to narrow down. The key problem here is that the tests' NUnit IDs differ between discovery steps.
We're planning to implement something to detect and report this problem. Unfortunately, it will be a fairly simple check that can only tell you that the problem exists, not which tests are causing it. Examining the discovery XML is definitely the way to go.