Runs distribution among NCrunch grid nodes
sheryl
#1 Posted : Saturday, March 19, 2016 4:34:02 PM(UTC)
Rank: Member

Groups: Registered
Joined: 5/15/2015(UTC)
Posts: 18
Location: United States of America

Thanks: 12 times
Was thanked: 3 time(s) in 3 post(s)
Hi Remco, our current setup has 4 NCrunch grid nodes. When a test run is triggered, NCrunch creates snapshots on all 4 grid nodes, but at any one time it executes tests on only 2 of them. Which 2 nodes are used varies from run to run, depending on availability.

For example, if I have 100 tests, any 2 grid nodes *at a time* would show:
Test node 1: Processing Tasks: Executing 8 tests in XXXXXXXXX
Test node 2: Processing Tasks: Executing 8 tests in XXXXXXXXX

Once this is done, it might show the following (on the same nodes or on different nodes):

Test node 3: Processing Tasks: Executing 8 tests in XXXXXXXXX
Test node 4: Processing Tasks: Executing 8 tests in XXXXXXXXX

Why doesn't it pick up, say, all 32 test cases and distribute them as:

Test node 1: Processing Tasks: Executing 8 tests in XXXXXXXXX
Test node 2: Processing Tasks: Executing 8 tests in XXXXXXXXX
Test node 3: Processing Tasks: Executing 8 tests in XXXXXXXXX
Test node 4: Processing Tasks: Executing 8 tests in XXXXXXXXX

What settings can I change to make NCrunch distribute the test runs across all available grid nodes and so optimize the run time?
Remco
#2 Posted : Saturday, March 19, 2016 10:02:30 PM(UTC)
Rank: NCrunch Developer

Groups: Administrators
Joined: 4/16/2011(UTC)
Posts: 7,123

Thanks: 957 times
Was thanked: 1287 time(s) in 1194 post(s)
Hi Sheryl,

NCrunch uses a pull-based system for distributing work across the grid. Grid nodes are responsible for requesting work from the client machine (in bulk) when they are notified that it is available.

If you have enough capacity across two grid nodes to fit the entire processing queue, then the other two nodes will likely sit idle. NCrunch won't try to 'balance' the load across the entire grid as the client always provides as much work as it can when this is requested from any node.

For example, let's say you have 4 nodes each with 8 task processors, and 100 tests, and the engine is splitting these tests into groups of 8 for execution.

This means you'll have 100/8 = 12.5, rounded up to 13 processing tasks.

Because each node can fit up to 8 tasks, the first node to request work will get 8 of these processing tasks. The second will get 5. The remaining two nodes will likely request work and receive nothing.
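
The arithmetic above can be sketched in a few lines of Python. This is only an illustrative model of a pull-based queue, not NCrunch's actual code: the function name `distribute` and the assumption that nodes request work one after another, each taking as many tasks as it has task processors, are hypothetical simplifications.

```python
import math


def distribute(total_tests, batch_size, node_capacities):
    """Toy pull model (an assumption, not NCrunch's implementation).

    Each node, in the order it requests work, takes as many processing
    tasks as it has free task processors. Returns the number of tasks
    given to each node and the number still waiting in the queue.
    """
    # Tests are grouped into processing tasks; a partial group still counts.
    tasks_remaining = math.ceil(total_tests / batch_size)
    assignment = []
    for capacity in node_capacities:
        taken = min(capacity, tasks_remaining)
        assignment.append(taken)
        tasks_remaining -= taken
    return assignment, tasks_remaining


# 100 tests in groups of 8, four nodes with 8 task processors each.
assignment, queued = distribute(100, 8, [8, 8, 8, 8])
print(assignment, queued)  # [8, 5, 0, 0] 0
```

With a much larger queue (for example 1000 tests), the same model gives every node a full load and leaves the rest queued, which is the case where all four nodes stay busy.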

If the above example fits with your use case, then what you are experiencing is by design. It is, unfortunately, necessary for the grid to work in this way in order for the grid nodes to be shared between multiple clients (in which case one client cannot orchestrate the work for all clients).

If the above example does not fit with your use case, and you have nodes sitting idle while there is additional work in the processing queue, then we should investigate this further as the engine shouldn't be doing this.
sheryl
#3 Posted : Monday, March 21, 2016 5:45:01 PM(UTC)
Hi Remco, thank you for the detailed reply. Just to confirm I understood it correctly: say we have 8 build sanity test cases. Although there are 4 grid nodes, only 1 of them picks up all 8 test cases and runs them sequentially. This results in a test execution time of around 12-14 minutes.

If the distributed run could use all available grid nodes to run the test cases concurrently, we could reduce the execution time to roughly 1/4 of what it is now. So, going by your comment above, is there no setting that would let me run the test cases concurrently in this scenario?
Remco
#4 Posted : Monday, March 21, 2016 9:41:40 PM(UTC)
Hi Sheryl -

If the grid nodes have their capacity configured correctly, there shouldn't be a significant increase in time for the test execution, unless we're measuring very short periods of time (ms).

You'll notice that in the Processing Queue, NCrunch will tend to group tests together in batches. This is done because there is overhead involved in calling into the test framework (e.g. NUnit), so NCrunch will create small groups of tests that can run sequentially in a batch based on their execution time. If the 8 tests you've mentioned each take 50ms to execute, they'll likely be placed in the same batch and run sequentially, as this is much more efficient than trying to separate them and call into the test framework 8 times. The situation changes when the tests have a longer execution time (e.g. 4 seconds). In this case, you'll likely see more batches and less grouping, allowing more concurrency. There are other variables here, such as the duration of any test fixture setup routine, but this is the general idea.
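
As a rough illustration of the general idea, here is a small Python sketch of duration-based batching. It is not NCrunch's algorithm: the function `batch_tests`, the greedy grouping rule, and the 1000ms `target_batch_ms` budget are all assumptions chosen only to show why fast tests end up together and slow tests end up apart.

```python
def batch_tests(durations_ms, target_batch_ms=1000):
    """Hypothetical greedy batching (an assumption, not NCrunch's logic).

    Tests are added to the current batch until the next test would push
    the batch's expected run time past the target budget.
    """
    batches, current, current_ms = [], [], 0
    for duration in durations_ms:
        if current and current_ms + duration > target_batch_ms:
            batches.append(current)
            current, current_ms = [], 0
        current.append(duration)
        current_ms += duration
    if current:
        batches.append(current)
    return batches


fast = batch_tests([50] * 8)    # eight 50ms tests
slow = batch_tests([4000] * 8)  # eight 4-second tests
print(len(fast), len(slow))     # 1 8
```

Under these assumptions the eight fast tests form a single batch, so only one node can run them, while the eight slow tests become eight separate batches that can run concurrently.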

So if you're looking at one batch of 8 fast tests in the Processing Queue, then this will always be executed on a single grid node, as it's more efficient to just run the tests together.

If you're looking at, say, 4 batches of 2 slower tests each in the Processing Queue, and the first grid node has a capacity of 4 processing threads, then the first node to request work from the client will receive all 4 batches and will run them concurrently. Because the node runs the tests concurrently, there should be little overall difference when compared with other nodes picking up the other batches.

The only time you may see a significant increase in execution time with this setup is where the grid nodes have been configured for a maximum capacity that is well beyond their actual capability. For example, if you have a low-spec single-CPU VM configured with 8 max processing threads, then this node will try to concurrently execute more tasks than it has the CPU to efficiently process. This will result in a significantly higher overall execution time because the node does not have enough power to meet the demand, and the other nodes are sitting idle.
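
The effect of an over-configured node can be shown with a back-of-the-envelope model in Python. Everything here is an assumption made for illustration: tasks are treated as purely CPU-bound, a node's cores are shared evenly with no switching cost, and nodes pull work in fixed rounds. The names `node_wall_time` and `grid_wall_time` are hypothetical.

```python
def node_wall_time(num_tasks, task_seconds, cores):
    """Idealised time for one node to finish its tasks when run concurrently."""
    if num_tasks == 0:
        return 0.0
    # With more tasks than cores, each task slows down proportionally.
    return task_seconds * max(1.0, num_tasks / cores)


def grid_wall_time(num_tasks, task_seconds, capacities, cores):
    """Total time when nodes pull up to their configured capacity each round."""
    if not any(c > 0 for c in capacities):
        raise ValueError("at least one node needs a positive capacity")
    total, remaining = 0.0, num_tasks
    while remaining > 0:
        round_times = []
        for capacity, node_cores in zip(capacities, cores):
            taken = min(capacity, remaining)
            remaining -= taken
            round_times.append(node_wall_time(taken, task_seconds, node_cores))
        total += max(round_times)
    return total


# Eight 60-second tasks on four single-CPU nodes.
overcommitted = grid_wall_time(8, 60, [8, 8, 8, 8], [1, 1, 1, 1])
matched = grid_wall_time(8, 60, [1, 1, 1, 1], [1, 1, 1, 1])
print(overcommitted, matched)  # 480.0 120.0
```

In this model the over-configured node takes all eight tasks and needs 480 seconds while the other three nodes sit idle. With capacity matched to the single CPU, the work spreads across the grid and finishes in 120 seconds.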
sheryl
#5 Posted : Thursday, March 24, 2016 4:55:01 AM(UTC)
Thanks Remco.
sheryl
#6 Posted : Monday, May 9, 2016 8:14:03 PM(UTC)
Hi Remco, thank you for the support and suggestions for optimizing the execution time. We upgraded to the latest version of NCrunch and made sure the cache file is saved on the CI server, which resulted in the test cases being batched appropriately.