Cache File & Test Distribution To Nodes
DThorn
#1 Posted : Thursday, August 25, 2016 7:42:16 PM(UTC)
Rank: Newbie

Groups: Registered
Joined: 8/25/2016(UTC)
Posts: 2
Location: United States of America

Was thanked: 1 time(s) in 1 post(s)
Hi. We are using NCrunch to distribute our functional tests across up to 20 nodes, with test runs kicked off by one of 4 Jenkins slaves. Each Jenkins slave has roughly 4 execution slots. Execution time for a single test ranges from a few seconds to over 10 minutes. We currently point our solution at an ncrunch.cache file outside our workspace so it persists between test runs and is targeted by any run kicked off on that Jenkins slave. Because each slave has 4 job threads, we can potentially have 4 instances of ncrunch.exe running at the same time, all targeting the same ncrunch.cache file.

The way I understand NCrunch, when the cache is used for long-running tests (over a minute or so), nodes should be grabbing a single test as opposed to groupings of 6/7/8, in order to make best use of the available nodes and distribute the long-running tests. However, even after removing the ncrunch.cache file, running the test suite once to generate a new one (thus removing any chance of corruption or unnecessary data), and then doing a second run to check distribution using the previous run's generated data, we still see NCrunch using only a fraction of the nodes, distributing batches of 8/8/7/7/7/6/6/4 when 20 nodes are available and the other 12 have no work. Running the suite with this distribution usually takes approximately an hour. Is there a way to get NCrunch to prefer grabbing smaller batches and using more nodes? A few versions back, shortly after the ability to specify a cache location was added, we used it as above and saw batches of 2 and 1 being grabbed and completed instead of these large batches. Then at some point it stopped distributing like that and began preferring larger groups, not utilizing all the nodes effectively. How can we adjust this?

Also, how is the cache file read from and written to? Will there be concurrency issues from potentially having 4 instances of ncrunch.exe using the same cache file at the same time? Is there a risk of clobbering data? For example: one instance of ncrunch.exe starts and reads the cache file, then a second one starts and reads it, the first one finishes and writes its data back, and then the second finishes and writes back. Is there an issue with the data from the first instance being clobbered by the second?

I read somewhere previously, though I don't remember where, that execution data is categorized by test and environment, so if the cache file has data for test 1 running against environment 1, that data won't be considered when running test 1 against environment 2. Does this environment distinction also take into account the solution, NCrunch, or tested application version, causing test data to be considered different between NCrunch versions or deployed application versions? If so, is there a way to make NCrunch consider test data based on the test alone, irrespective of which environment it is running against (assuming the above information is still, or ever was, true)?

Thank you for your time.
Remco
#2 Posted : Thursday, August 25, 2016 11:26:13 PM(UTC)
Rank: NCrunch Developer

Groups: Administrators
Joined: 4/16/2011(UTC)
Posts: 7,145

Thanks: 959 times
Was thanked: 1290 time(s) in 1196 post(s)
Hi, thanks for posting!

DThorn;9139 wrote:

The way I understand NCrunch, when the cache is used for long-running tests (over a minute or so), nodes should be grabbing a single test as opposed to groupings of 6/7/8, in order to make best use of the available nodes and distribute the long-running tests.


Your understanding here is correct. If NCrunch has knowledge of the normal execution times of these long running tests, it won't group them together.

DThorn;9139 wrote:

However, even after removing the ncrunch.cache file, running the test suite once to generate a new one (thus removing any chance of corruption or unnecessary data), and then doing a second run to check distribution using the previous run's generated data, we still see NCrunch using only a fraction of the nodes, distributing batches of 8/8/7/7/7/6/6/4 when 20 nodes are available and the other 12 have no work. Running the suite with this distribution usually takes approximately an hour. Is there a way to get NCrunch to prefer grabbing smaller batches and using more nodes? A few versions back, shortly after the ability to specify a cache location was added, we used it as above and saw batches of 2 and 1 being grabbed and completed instead of these large batches. Then at some point it stopped distributing like that and began preferring larger groups, not utilizing all the nodes effectively. How can we adjust this?


This doesn't seem like intentional behaviour. To my memory, there haven't been any changes in this area of NCrunch since the cache file redirection setting was introduced. Is there anything that might have changed in your environment or setup since it was last working?

DThorn;9139 wrote:

Also, how is the cache file read from and written to? Will there be concurrency issues from potentially having 4 instances of ncrunch.exe using the same cache file at the same time? Is there a risk of clobbering data? For example: one instance of ncrunch.exe starts and reads the cache file, then a second one starts and reads it, the first one finishes and writes its data back, and then the second finishes and writes back. Is there an issue with the data from the first instance being clobbered by the second?


NCrunch reads from the cache file when it initialises the engine (usually after the solution is opened), then writes it back to disk when the engine is shutting down. Reading and writing the cache file can be a time consuming process on very large solutions with high coverage density, but there are fallbacks and retry mechanisms to allow it to push through any concurrency issues. NCrunch does lock the file when it works on it, so there isn't really any risk here of the file being corrupted by concurrent use. If the file is so large that the retry mechanisms fail, then it should still contain enough relevant data to salvage sensible execution times for the next run.

Where problems are possible is if you have two widely different versions of the same solution storing data inside the same cache file. For example, you may have a branch of your solution containing a different root namespace for a number of your longer running tests. If this branch shares the same cache file as your trunk or another branch, NCrunch would carry cached data between the branches and lose the execution times of any tests that are named differently. The result would be inefficient batching every time the tool switched between branches. If you have alternative branches or versions of your solution, it may be wise to rename the solution file in each to prevent any clashing.

DThorn;9139 wrote:

I read somewhere previously, though I don't remember where, that execution data is categorized by test and environment, so if the cache file has data for test 1 running against environment 1, that data won't be considered when running test 1 against environment 2. Does this environment distinction also take into account the solution, NCrunch, or tested application version, causing test data to be considered different between NCrunch versions or deployed application versions? If so, is there a way to make NCrunch consider test data based on the test alone, irrespective of which environment it is running against (assuming the above information is still, or ever was, true)?


Test execution data is stored in the NCrunch model and cache file according to the 'Test Name'. A test name consists of the path of its parent project file made relative to its solution file (i.e. 'MyProject.Tests\MyProject.Tests.csproj') combined with an identifier generated from the test namespace, fixture, method, and parameters (in the event of a testcase). This means that it should be possible to make any environmental change to a solution without invalidating test data, provided the project is not renamed or moved relative to the solution file, and the test structure itself is not changed. This situation can become more complex if you are using NUnit's TestCaseSource attribute and generating your test parameters programmatically, as the parameters of a test must be consistent to create a consistent test name.

A potential issue is that the NCrunch cache file structure is tied to the version of NCrunch that created it. If you have different versions of NCrunch using the same cache file, you'll almost certainly have problems. The cache file contains a version number encoded in one of its early bytes. If NCrunch reads a different version to what it is expecting, it will discard the cache file. Make sure you don't share a cache file between versions of NCrunch.
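The version check described above amounts to something like the following sketch. The single-byte header layout and the version constant are assumptions based on the description ("a version number encoded in one of its early bytes"), not the real file format.

```python
CACHE_FORMAT_VERSION = 3  # hypothetical current format version

def load_cache(raw: bytes):
    """Return the cache payload if the leading version byte matches the
    expected format, otherwise discard the whole cache, mirroring the
    behaviour described above. The one-byte header is an assumption."""
    if not raw or raw[0] != CACHE_FORMAT_VERSION:
        return None  # unrecognised format: start over with an empty cache
    return raw[1:]   # payload written by a matching NCrunch version
```

This is why sharing one cache file between NCrunch versions is wasteful at best: the newer (or older) version sees an unexpected version byte and throws the file away, so all accumulated execution times are lost.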

You've mentioned that sometimes the test run takes just a few seconds, and sometimes it takes over 10 minutes. Have you managed to find any way to correlate these timings with your execution patterns? Note that if NCrunch fails to load data from its cache file due to an unexpected error, it will usually report this in the log.
DThorn
#3 Posted : Friday, August 26, 2016 12:13:59 AM(UTC)
Rank: Newbie

Groups: Registered
Joined: 8/25/2016(UTC)
Posts: 2
Location: United States of America

Was thanked: 1 time(s) in 1 post(s)
Thank you for the quick reply.

Remco;9140 wrote:

This doesn't seem like intentional behaviour. To my memory, there haven't been any changes in this area of NCrunch since the cache file redirection setting was introduced. Is there anything that might have changed in your environment or setup since it was last working?


After deleting the cache file, on the third run it seemed to distribute better, so I believe it may have been an issue with the cache file. As you mention, the NCrunch version changed without changing the cache file it referenced, so perhaps there was an issue associated with that.

Remco;9140 wrote:

NCrunch reads from the cache file when it initialises the engine (usually after the solution is opened), then writes it back to disk when the engine is shutting down. Reading and writing the cache file can be a time consuming process on very large solutions with high coverage density, but there are fallbacks and retry mechanisms to allow it to push through any concurrency issues. NCrunch does lock the file when it works on it, so there isn't really any risk here of the file being corrupted by concurrent use. If the file is so large that the retry mechanisms fail, then it should still contain enough relevant data to salvage sensible execution times for the next run.

Where problems are possible is if you have two widely different versions of the same solution storing data inside the same cache file. For example, you may have a branch of your solution containing a different root namespace for a number of your longer running tests. If this branch shares the same cache file as your trunk or another branch, this would result in NCrunch loading cached data between the branches and losing out on the execution times of tests that are named differently. The result would be inefficient batching every time the tool ran a session between branches. If you have alternative branches or versions of your solution, it may be wise to rename your solution file to prevent any clashing.


OK, so when it writes to the cache file it is only writing the new run data, so there is no chance of clobbering previous data, just adding to it. We aren't running vastly different versions with the same cache file, so that should be fine.

Remco;9140 wrote:

Test execution data is stored in the NCrunch model and cache file according to the 'Test Name'. A test name consists of the path of its parent project file made relative to its solution file (i.e. 'MyProject.Tests\MyProject.Tests.csproj') combined with an identifier generated from the test namespace, fixture, method, and parameters (in the event of a testcase). This means that it should be possible to make any environmental change to a solution without invalidating test data, provided the project is not renamed or moved relative to the solution file, and the test structure itself is not changed. This situation can become more complex if you are using NUnit's TestCaseSource attribute and generating your test parameters programmatically, as the parameters of a test must be consistent to create a consistent test name.

A potential issue is that the NCrunch cache file structure is tied to the version of NCrunch that created it. If you have different versions of NCrunch using the same cache file, you'll almost certainly have problems. The cache file contains a version number encoded in one of its early bytes. If NCrunch reads a different version to what it is expecting, it will discard the cache file. Make sure you don't share a cache file between versions of NCrunch.

You've mentioned that sometimes the test run takes just a few seconds, and sometimes it takes over 10 minutes. Have you managed to find any way to correlate these timings with your execution patterns? Note that if NCrunch fails to load data from its cache file due to an unexpected error, it will usually report this in the log.


It seems like the issue may have originated from upgrading NCrunch versions while continuing to use the old cache. Thank you for all the info and for helping me to better understand how the cache is used. Regarding run times differing from seconds to minutes, I meant that different tests have wildly different execution times. Thank you again.
1 user thanked DThorn for this useful post.
Remco on 8/26/2016(UTC)