Performance enhancement for grid nodes
samholder
#1 Posted : Wednesday, January 22, 2020 1:44:36 PM(UTC)
Rank: Advanced Member

Groups: Registered
Joined: 5/11/2012(UTC)
Posts: 94

Thanks: 28 times
Was thanked: 12 time(s) in 12 post(s)
We have seen an issue which I think could be optimised.

Our grid nodes are unbalanced in how much concurrent processing they allow: some nodes allow 5 threads and some allow only 1. Some of our builds can only use the single-thread nodes (mainly UI tests) and some can use any node (unit tests).

The observed behaviour is that when we have many builds running concurrently, they often all seem to sync up after all tests have passed and sit waiting for the NCrunch processing step to finish. When I look at the timeline afterwards, I can see that the 5-thread nodes are generally running all the tests, but the test run does not complete until all builds have finished on all compatible agents. So even after all tests have finished, the initiating Console tool is waiting for the single-thread nodes to complete all builds. As all builds are doing this, we can often be waiting for 20 minutes while the single-thread machines simply build all the queued projects, even though all the tests finished long ago.

Would it be possible to tell all nodes to remove any unstarted builds for this test run once all tests are reported as successful? This seems like it would help with the problem. Obviously, in the short term we have stopped these nodes from running all builds, but this seems like an optimisation that would be beneficial and without consequence.
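To make the request concrete, here is a rough sketch of the behaviour being asked for. It is purely illustrative: `Node`, `queued_builds`, `running_builds` and `prune_unstarted_builds` are hypothetical names invented for this example and do not correspond to anything in NCrunch's actual API or internals.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical model of a grid node (not an NCrunch type)."""
    name: str
    max_threads: int
    queued_builds: List[str] = field(default_factory=list)   # builds not started yet
    running_builds: List[str] = field(default_factory=list)  # builds already in progress


def prune_unstarted_builds(nodes: List[Node], all_tests_passed: bool) -> Dict[str, List[str]]:
    """Drop queued (unstarted) builds once every test in the run has passed.

    Builds that are already running are left alone.
    Returns a mapping of node name -> builds that were removed.
    """
    removed: Dict[str, List[str]] = {}
    if not all_tests_passed:
        return removed
    for node in nodes:
        if node.queued_builds:
            removed[node.name] = list(node.queued_builds)
            node.queued_builds.clear()
    return removed


# Example: a 5-thread node with nothing queued, and a 1-thread node with a backlog.
fast = Node("fast-node", max_threads=5)
slow = Node("slow-node", max_threads=1,
            queued_builds=["ProjB", "ProjC"], running_builds=["ProjA"])
print(prune_unstarted_builds([fast, slow], all_tests_passed=True))
# prints {'slow-node': ['ProjB', 'ProjC']}
```

The sketch assumes that "all tests passed" is a reliable signal that the run is finished, which is exactly the part that may not hold in practice.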

Thoughts?
Remco
#2 Posted : Wednesday, January 22, 2020 11:18:47 PM(UTC)
Rank: NCrunch Developer

Groups: Administrators
Joined: 4/16/2011(UTC)
Posts: 6,974

Thanks: 929 times
Was thanked: 1256 time(s) in 1169 post(s)
Hi, thanks for sharing your thoughts on this.

This is something that we've looked into before, but decided not to change at this time for two reasons:

1. It's surprisingly complex to do. In situations where solutions contain tests that target specific nodes (i.e. using 'capabilities'), it isn't always easy to know whether a full cycle has been completed just because tests are no longer being run. The engine is extremely configurable and even under the current circumstances it's not easy to know when a test run is fully complete. This was the hardest thing to do when engineering the console tool, because it's basically an adapted continuous runner. For example, there may be a particular test in the suite that is set to run on a specific grid node under specific parameters. Although it could be possible for us to check for such a test when attempting to terminate the run, such a check could involve quite a large amount of data and would be difficult to engineer without introducing a potential performance issue. It's possible, it's just hard to do and very hard to get right.

2. There are valid scenarios (admittedly rare) where one of the slow nodes may actually fail the run due to differences in its environment. In such a case, it would be useful if the result could be reported without truncating the run. For example, a user may have a grid node running on a different version of Windows that encounters a build issue or test discovery issue not experienced on the other nodes. Some people use grid nodes for cross platform testing.
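As a loose illustration of the first point, the sketch below shows the kind of check that would be needed before dropping a node that is still building. It is not how the NCrunch engine works; `SuiteTest`, `GridNode`, `required_capabilities` and `nodes_still_needed` are hypothetical names, and the model is heavily simplified. The idea is that a node can only be dropped if no test that has yet to run depends on capabilities that only a still-building node offers.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class SuiteTest:
    """Hypothetical test record (not an NCrunch type)."""
    name: str
    required_capabilities: FrozenSet[str]
    has_run: bool


@dataclass(frozen=True)
class GridNode:
    """Hypothetical grid node record (not an NCrunch type)."""
    name: str
    capabilities: FrozenSet[str]
    build_finished: bool


def nodes_still_needed(tests: List[SuiteTest], nodes: List[GridNode]) -> List[str]:
    """Names of still-building nodes that some pending test can only run on."""
    ready = [n for n in nodes if n.build_finished]
    needed = set()
    for test in tests:
        if test.has_run:
            continue
        # If a node that has finished building can take the test, nothing is blocked.
        if any(test.required_capabilities <= n.capabilities for n in ready):
            continue
        # Otherwise every still-building node able to run it must be kept.
        for n in nodes:
            if not n.build_finished and test.required_capabilities <= n.capabilities:
                needed.add(n.name)
    return sorted(needed)


tests = [
    SuiteTest("unit_test", frozenset(), has_run=True),
    SuiteTest("ui_test", frozenset({"UI"}), has_run=False),
]
nodes = [
    GridNode("fast-node", frozenset(), build_finished=True),
    GridNode("slow-node", frozenset({"UI"}), build_finished=False),
]
print(nodes_still_needed(tests, nodes))  # prints ['slow-node']
```

Even in this toy form the check walks every test against every node, which hints at the data volume concern. It also says nothing about the second point: a node that is safe to drop by this rule could still have reported a build or test discovery failure of its own.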

For the time being, the rational thing is to examine your NCrunch timeline and identify whether nodes are adding value to a CI run. If not, take them out. Including them in the run when they never get to report anything useful would technically be a waste of resources anyway.
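If it helps, the timeline review can be reduced to a simple tally. The snippet below is only a sketch: it assumes you have transcribed the timeline by hand into `(node, task_kind)` pairs, and the function name and the `"build"`/`"test"` labels are made up for the example rather than taken from any NCrunch export format.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def nodes_without_tests(timeline: Iterable[Tuple[str, str]]) -> List[str]:
    """Return nodes that appear in the timeline but never ran a test."""
    tests_run = defaultdict(int)
    for node, task_kind in timeline:
        # Register every node seen, but only count test tasks.
        tests_run[node] += 1 if task_kind == "test" else 0
    return sorted(node for node, count in tests_run.items() if count == 0)


# Hand-entered, hypothetical timeline for one CI run.
timeline = [
    ("fast-node", "build"),
    ("fast-node", "test"),
    ("fast-node", "test"),
    ("slow-node", "build"),
    ("slow-node", "build"),
]
print(nodes_without_tests(timeline))  # prints ['slow-node']
```

A node that shows up in this list across several runs is a candidate for removal from that CI run's grid.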
samholder
#3 Posted : Thursday, January 23, 2020 2:11:25 PM(UTC)
Thanks for the reply. Yeah, now that we are aware of the issue we can mitigate it. Glad to know it's been considered and rejected for valid reasons.