Hi, thanks for posting!
DThorn;9139 wrote:
The way I understand ncrunch, using the cache for long running tests (over a minute or so), nodes should be grabbing a single test as opposed to groupings of 6/7/8, in order to make best use of the nodes available and distribute the long running tests.
Your understanding here is correct. If NCrunch has knowledge of the normal execution times of these long running tests, it won't group them together.
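To illustrate the idea, here is a minimal sketch of timing-aware batching. This is not NCrunch's actual algorithm; the function name, the 60 second threshold, the 10 second batch target and the 1 second default estimate are all assumptions made up for the example.

```python
def build_batches(tests, known_times, long_threshold=60.0, target_batch_time=10.0):
    """Group tests into batches; tests known to be long-running run alone.

    tests: ordered list of test names.
    known_times: dict of test name -> seconds recorded on a previous run.
    All thresholds are hypothetical values chosen for illustration.
    """
    batches = []
    current, current_time = [], 0.0
    for name in tests:
        seconds = known_times.get(name)
        if seconds is not None and seconds >= long_threshold:
            # A test with a known long execution time gets its own batch.
            batches.append([name])
            continue
        # With no recorded time, the test is assumed to be quick.
        estimate = seconds if seconds is not None else 1.0
        if current and current_time + estimate > target_batch_time:
            batches.append(current)
            current, current_time = [], 0.0
        current.append(name)
        current_time += estimate
    if current:
        batches.append(current)
    return batches
```

With recorded times such as `{"a": 120, "b": 90}`, tests `a` and `b` each land in their own batch. With no recorded times at all, the same tests are grouped together, which is the kind of batching you would expect to see if the cached data is not being picked up.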
DThorn;9139 wrote:
However, even after removing the ncrunch.cach, running the test suite once to generate a new one (thus removing any chance of corruption or unnecessary data), and then doing a second run to see proper distribution of tests using the previous run's generated data, we still see ncrunch using only a fraction of the nodes, distributing batches of 8/8/7/7/7/6/6/4 when there are 20 nodes available and the other 12 have no work. Running the suite with this distribution usually takes approximately an hour. Is there a way to get ncrunch to prefer grabbing smaller batches and using more nodes? A few versions back, shortly after the ability to specify a cache location was added, we were able to use it as above and see batches of 2 and 1 being grabbed and completed instead of these large batches. Then at some point it stopped distributing like that and began preferring larger groups and not utilizing all the nodes effectively. How can we adjust this?
This doesn't seem like intentional behaviour. To my memory, there haven't been any changes in this area of NCrunch since the cache file redirection setting was introduced. Is there anything that might have changed in your environment or setup since it was last working?
DThorn;9139 wrote:
Also how is the cache file read/written to? Will there be concurrency issues caused by potentially having 4 instances of ncrunch.exe using the same cache file at the same time? Is there an issue of clobbering data? For example if one instance of ncrunch.exe starts and reads the cache file, then a 2nd one starts and reads, followed by the first one finishing & writing the data back, then the 2nd exe finishing and writing back; is there an issue with the data from the first exe being clobbered by the 2nd?
NCrunch reads from the cache file when it initialises the engine (usually after the solution is opened), then writes it back to disk when the engine is shutting down. Reading and writing the cache file can be a time-consuming process on very large solutions with high coverage density, but there are fallbacks and retry mechanisms that allow it to push through any concurrency issues. NCrunch locks the file while it works on it, so there isn't really any risk of the file being corrupted by concurrent use. If the file is so large that the retry mechanisms fail, it should still contain enough relevant data to salvage sensible execution times for the next run.
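As a rough picture of what "lock the file and retry" means in general, here is a small sketch. It is not NCrunch's implementation: the sidecar `.lock` file, the function names, and the attempt count and delay are all assumptions for illustration only.

```python
import os
import time


class LockFile:
    """Crude exclusive lock: a sidecar file that only one process can create."""

    def __init__(self, target_path):
        self.lock_path = target_path + ".lock"  # hypothetical naming scheme

    def __enter__(self):
        # O_EXCL makes creation fail with FileExistsError if the lock is held.
        fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return self

    def __exit__(self, exc_type, exc, tb):
        os.remove(self.lock_path)


def with_retries(action, attempts=5, delay=0.05):
    """Run action(), retrying on OSError (which includes a held lock)."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)  # back off a little more each time


def save_cache(path, data, attempts=5, delay=0.05):
    """Write the whole cache while holding the lock, retrying if it is busy."""
    def write():
        with LockFile(path):
            with open(path, "wb") as handle:
                handle.write(data)
    with_retries(write, attempts, delay)


def load_cache(path, attempts=5, delay=0.05):
    """Read the whole cache while holding the lock, retrying if it is busy."""
    def read():
        with LockFile(path):
            with open(path, "rb") as handle:
                return handle.read()
    return with_retries(read, attempts, delay)
```

In this sketch a second process that finds the lock held simply waits and tries again, so a read or write never sees a half-written file.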
Problems are possible if you have two widely different versions of the same solution storing data inside the same cache file. For example, you may have a branch of your solution containing a different root namespace for a number of your longer running tests. If this branch shares the same cache file as your trunk or another branch, NCrunch would load cached data written by the other branch and lose the execution times of any tests that are named differently. The result would be inefficient batching every time the tool ran a session after switching between branches. If you have alternative branches or versions of your solution, it may be wise to rename your solution file to prevent any clashing.
DThorn;9139 wrote:
I read somewhere previously but don't remember where, that execution data is categorized by test/env thus if the cache file has data for test 1 running against env 1, that data won't be considered when running test 1 against env 2. Does this env distinction also take into account solution/ncrunch/tested app version thus causing test data between ncrunch versions, or deployed tested application version being considered different data? If so is there a way to make ncrunch consider testing data based on the test alone irrespective of which environment it is running against (assuming the above information is still or ever was true)?
Test execution data is stored in the NCrunch model and cache file according to the 'Test Name'. A test name consists of the path of its parent project file made relative to its solution file (e.g. 'MyProject.Tests\MyProject.Tests.csproj') combined with an identifier generated from the test namespace, fixture, method, and parameters (in the case of a test case). This means that it should be possible to make any environmental change to a solution without invalidating test data, provided the project is not renamed or moved relative to the solution file, and the test structure itself is not changed. This situation can become more complex if you are using NUnit's TestCaseSource attribute and generating your test parameters programmatically, as the parameters of a test must be consistent to create a consistent test name.
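To make the composition of a test name more concrete, here is a small sketch. The exact format NCrunch uses internally is not shown here; the function name, the `|` separator and the way parameters are rendered are assumptions for illustration, as are the example paths.

```python
from pathlib import PureWindowsPath


def make_test_name(solution_path, project_path, namespace, fixture, method, parameters=()):
    """Build a lookup key from the project path (relative to the solution)
    and the test's namespace, fixture, method and parameters.

    The key format is hypothetical; only the ingredients come from the
    description above.
    """
    solution_dir = PureWindowsPath(solution_path).parent
    relative_project = PureWindowsPath(project_path).relative_to(solution_dir)
    identifier = f"{namespace}.{fixture}.{method}"
    if parameters:
        identifier += "(" + ", ".join(repr(p) for p in parameters) + ")"
    return f"{relative_project}|{identifier}"


# Same solution checked out in two different folders: the key is identical.
trunk = make_test_name(
    r"C:\src\trunk\MySolution.sln",
    r"C:\src\trunk\MyProject.Tests\MyProject.Tests.csproj",
    "MyProject.Tests", "OrderFixture", "ShipsOrder", (1, "express"),
)
branch = make_test_name(
    r"C:\src\branch\MySolution.sln",
    r"C:\src\branch\MyProject.Tests\MyProject.Tests.csproj",
    "MyProject.Tests", "OrderFixture", "ShipsOrder", (1, "express"),
)
```

Because the key contains nothing about the machine or the environment, `trunk` and `branch` match. Changing the namespace, or generating different parameter values on each run, produces a different key, so the previously recorded execution time would not be found.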
A potential issue is that the NCrunch cache file structure is tied to the version of NCrunch that created it. If you have different versions of NCrunch using the same cache file, you'll almost certainly have problems. The cache file contains a version number encoded in one of its early bytes. If NCrunch reads a different version to what it is expecting, it will discard the cache file. Make sure you don't share a cache file between versions of NCrunch.
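The version check works along these lines. This is only a sketch: the position of the version byte, the value `3`, and the function name are all made up for the example, and the real file layout is not described here.

```python
EXPECTED_VERSION = 3  # hypothetical value, for illustration only


def read_versioned_cache(raw, expected_version=EXPECTED_VERSION):
    """Return the cache payload, or None if the version does not match.

    Assumes (for this sketch) that the version is stored in the first byte.
    """
    if not raw or raw[0] != expected_version:
        # A mismatched or empty file is discarded rather than parsed.
        return None
    return raw[1:]
```

If two versions of the tool share one file, each one discards what the other wrote, so neither ever starts a run with usable execution times.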
You've mentioned that sometimes the test run takes just a few seconds, and sometimes it takes over 10 minutes. Have you managed to find any way to correlate these timings with your execution patterns? Note that if NCrunch fails to load data from its cache file due to an unexpected error, it will usually report this in the log.