Console farms out an excessive number of tests to a single processing thread
robjac
#1 Posted : Friday, October 11, 2024 8:03:13 AM(UTC)
Rank: Newbie

Groups: Registered
Joined: 10/10/2024(UTC)
Posts: 3
Location: United Kingdom

Thanks: 1 times
Was thanked: 1 time(s) in 1 post(s)
Hi,

We run a pack of around 6000 SpecFlow .NET 6 regression tests daily, with a console farming out tasks to 4 grid nodes, all on Windows Server 2022 VMs.

On the first node that finishes building the test project, the console farms out 8 tests on 5 processing threads and then 5793 tests on the remaining thread.
Until a couple of days ago, we might have seen a thread getting a maximum of around 30 tests, but mostly they would get up to 8 tests each.

We're on NCrunch v5.10.
Nodes are configured for a maximum of 6 processing threads and 8 CPU cores.
The console client and the 4 nodes all have .NET SDK 6.0.427 and runtime 6.0.35.

Many thanks
Remco
#2 Posted : Friday, October 11, 2024 11:37:39 AM(UTC)
Rank: NCrunch Developer

Groups: Administrators
Joined: 4/16/2011(UTC)
Posts: 7,161

Thanks: 964 times
Was thanked: 1296 time(s) in 1202 post(s)
Hi, thanks for sharing this issue.

Can you share any more details about how you're observing this? Is this using the timeline report?

Does the running of these tests work the same way when you run it from the NCrunch VS/Rider clients? Or do you see something different?
robjac
#3 Posted : Friday, October 11, 2024 2:28:28 PM(UTC)
Hi, thanks for getting back to me.

Can you share any more details about how you're observing this? Is this using the timeline report?
-- The full test pack is executed in an Azure DevOps pipeline, so while the job is running we can observe what the nodes are doing in the Distributed Processing window in VS (2022 v17.10).

Does the running of these tests work the same way when you run it from the NCrunch VS/Rider clients? Or do you see something different?
-- If we execute on the grid from VS, it looks like the tests get shared out correctly. Pretty much all the tasks we can see in the processing queue are 8 tests each, or fewer, across all nodes.

Please don't hesitate to ask for further details. I'm not sure what other info would be relevant or useful at this point.

Many thanks
Remco
#4 Posted : Friday, October 11, 2024 10:29:17 PM(UTC)
Thanks for these extra details.

For some reason, the engine running in the console tool seems to think that the 5793 tests have a very low execution time, and that it must therefore be optimal to run them in one batch.
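
To illustrate why very low recorded timings could lead to one huge batch, here is a toy sketch in Python. It is not NCrunch's actual batching logic, which isn't described in this thread; the function name, the greedy strategy, and the 16-second budget are all assumptions made up for the example.

```python
from typing import Dict, List


def batch_by_estimated_time(estimates: Dict[str, float],
                            target_seconds: float) -> List[List[str]]:
    """Greedily group tests so each batch's estimated runtime stays within
    target_seconds. Hypothetical heuristic, for illustration only."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_total = 0.0
    for name, seconds in estimates.items():
        # Close the current batch once adding this test would exceed the budget.
        if current and current_total + seconds > target_seconds:
            batches.append(current)
            current, current_total = [], 0.0
        current.append(name)
        current_total += seconds
    if current:
        batches.append(current)
    return batches


# Plausible timings: 2 s per test against a 16 s budget.
healthy = {f"test_{i}": 2.0 for i in range(40)}
# Stale timings, e.g. recorded when every test failed almost immediately.
stale = {f"test_{i}": 0.001 for i in range(40)}

healthy_batches = batch_by_estimated_time(healthy, target_seconds=16.0)
stale_batches = batch_by_estimated_time(stale, target_seconds=16.0)

print([len(b) for b in healthy_batches])
print([len(b) for b in stale_batches])
```

With the plausible timings the tests split into small batches; with the near-zero timings everything fits inside one budget and lands in a single batch, which resembles the behaviour reported above.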

I wonder if this may be caused by cached data being used by the console tool that is somehow incorrect and not being updated. Test execution times are stored in the NCrunch cache file (the path of which should usually be provided to the tool). If the cache file cannot be read (e.g. due to version conflicts), the execution times are pulled from a secondary .executiontimes.cache file that should be stored in the same directory.

To confirm whether timings are the cause of this problem, the easiest thing to do is to remove/move/rename both of these files. In this way, the engine won't have any existing information on test timings and it should split all tests as though they have at least a moderate execution time.
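
If it helps, the renaming step could be scripted along these lines. This is only a sketch: the function name and the ".bak" suffix are made up for the example, and it assumes the secondary file's name ends with ".executiontimes.cache" and sits next to the main cache file, as described above. Files are renamed rather than deleted so they can be restored.

```python
from pathlib import Path
from typing import List, Union


def set_aside_cache_files(cache_file: Union[str, Path],
                          suffix: str = ".bak") -> List[Path]:
    """Rename the given cache file, plus any sibling whose name ends with
    '.executiontimes.cache', by appending `suffix`. Returns the new paths."""
    cache_file = Path(cache_file)
    candidates = {cache_file}
    if cache_file.parent.is_dir():
        for path in cache_file.parent.iterdir():
            if path.name.endswith(".executiontimes.cache"):
                candidates.add(path)

    moved: List[Path] = []
    for path in sorted(candidates):
        if path.is_file():
            target = path.with_name(path.name + suffix)
            path.replace(target)  # overwrites an older backup if one exists
            moved.append(target)
    return moved
```

Call it with the same cache file path that is passed to the console tool, run the pipeline once, and compare how the tests get split.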

It's also worth checking that these .cache files are not stored in your VCS/git. It's possible the console tool is using a 'fresh' copy of these files when pulling down the code, and that this version of the files happens to have strange timing data (for example, you may have had a test run where almost all the tests finished immediately due to a now-fixed bug in the code).
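
A quick way to check for tracked cache files is to filter the output of `git ls-files`. The sketch below assumes git is on the PATH and that the files of interest end in ".cache"; the helper names are hypothetical.

```python
import subprocess
from typing import Iterable, List


def find_cache_files(tracked_paths: Iterable[str]) -> List[str]:
    """Return the paths that end in '.cache' (case-insensitive)."""
    return [p for p in tracked_paths if p.lower().endswith(".cache")]


def tracked_cache_files(repo_dir: str = ".") -> List[str]:
    """List .cache files that git tracks in repo_dir.
    Uses `git ls-files -z` so unusual path names are not quoted."""
    result = subprocess.run(
        ["git", "ls-files", "-z"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    paths = [p for p in result.stdout.split("\0") if p]
    return find_cache_files(paths)
```

If `tracked_cache_files()` returns anything, those files are being checked out with the code and may carry old timing data into each pipeline run.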
1 user thanked Remco for this useful post.
robjac on 10/14/2024(UTC)
robjac
#5 Posted : Monday, October 14, 2024 10:50:20 AM(UTC)

That's great, removing the cache files did the trick.

It makes sense, as the issue started directly after some dotnet version updates on the node machines, which caused nearly all the tests to fail immediately.

Thanks for your help!
1 user thanked robjac for this useful post.
Remco on 10/14/2024(UTC)