NCrunch console never exiting
ecargo
#1 Posted : Tuesday, November 17, 2020 11:23:38 PM(UTC)
Rank: Newbie

Groups: Registered
Joined: 8/17/2015(UTC)
Posts: 3
Location: New Zealand

Hi,

I'm running an NCrunch console test run, but it never completes. The following output was the tail of the logs copied from the terminal at 12:18:

Code:
[10:24:39.5298-Core-117] Checking for tasks to launch (1 out of 5 concurrent tasks are active)
[10:24:39.5298-Core-117] Stretch targets only = False
[10:24:39.5298-Core-117] Time critical tasks only = False
[10:24:39.5298-Core-117] Resources already in use = [Resources:X=, I=Debugger;Test Runner]
[10:24:39.5298-Core-117] Test filter = True
[10:24:39.5298-Core-117] Local processing allowed = True
[10:24:39.5298-Core-117] Capabilities =
[10:24:39.5298-Core-117] Publishing Event: [ProcessingQueueCapacityAvailableEvent]
[10:24:39.5298-Core-117] Event [ProcessingQueueCapacityAvailableEvent] is being published on thread CoreThread to subscriber: TestPipelineManager.☻
[10:24:39.5298-Core-117] Event [ProcessingQueueCapacityAvailableEvent] is being processed on Core thread with subscriber: TestPipelineManager.☻
[10:24:39.5338-Core-82] Event [TaskProcessedEvent] is being processed on Core thread with subscriber: StatusIndicatorSynchroniser.☻
[10:24:39.5338-Core-82] Event [NullEvent] is being processed on Core thread with subscriber: BufferedEventSubscription`1.engageTimer
[10:24:39.5528-?-80] Process with id 2424 has exited
[10:24:40.0453-Core-72] Event [♣] is being processed on Core thread with subscriber: EngineStateNotifier.☻
[10:24:40.0533-Core-72] Publishing Event: [EngineStateNotificationEvent:4 tests are queued for execution.  Monitoring 14117 tests (154 failing), with 11 tests ignored]
[10:24:40.0533-Core-72] Publishing Event: [StaticEngineEventsClearedEvent]
[10:24:40.0533-Core-72] Event [NullEvent] is being processed on Core thread with subscriber: BufferedEventSubscription`1.engageTimer
[10:24:40.4704-Core-84] Event [TestDataUpdatedEvent] is being processed on Core thread with subscriber: MetricsTreeSynchroniser.☻
[10:24:40.4704-Core-84] Event [NullEvent] is being processed on Core thread with subscriber: BufferedEventSubscription`1.engageTimer
[10:26:39.5382-?-70] Publishing Event: [☼:nCrunch.Client.Model.LineMapDeallocator.♥]
[10:26:39.5382-?-70] Event [☼:nCrunch.Client.Model.LineMapDeallocator.♥] is being published on thread CoreThread to subscriber: ☻.☻
[10:26:39.5392-Core-56] Event [☼:nCrunch.Client.Model.LineMapDeallocator.♥] is being processed on Core thread with subscriber: ☻.☻


It appears that there are workers free, but 4 tests aren't being executed. I have an NCrunch timeout of 10 seconds configured, so it can't be a long-running test that's the problem. How can I figure out what's preventing these tests from running?

The timeout was output here:

Code:
[09:26:04.6333-Core-7] Default test timeout = '10000'
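
One way to narrow down where a run stalled is to scan the saved log for the scheduler's "concurrent tasks are active" lines and for long silences between timestamps. The sketch below assumes only the line prefix visible in the excerpt above (`[HH:MM:SS.ffff-Thread-Id]`); the function names and the 60-second threshold are illustrative choices, not part of NCrunch.

```python
import re
from datetime import datetime, timedelta

# Prefix as seen in the excerpt, e.g. "[10:24:39.5298-Core-117] message"
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2}\.\d+)-[^\]]*\]\s(.*)$")
ACTIVE_RE = re.compile(r"\((\d+) out of (\d+) concurrent tasks are active\)")


def summarise(lines, gap=timedelta(seconds=60)):
    """Return (active, capacity, gaps) for an iterable of log lines.

    active/capacity come from the last scheduler line seen; gaps lists
    (before, after) times where the log went quiet for at least `gap`.
    Runs that cross midnight are not handled.
    """
    active = capacity = None
    gaps = []
    previous = None
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines without the timestamp prefix
        now = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if previous is not None and now - previous >= gap:
            gaps.append((previous.time(), now.time()))
        previous = now
        counts = ACTIVE_RE.search(match.group(2))
        if counts:
            active, capacity = int(counts.group(1)), int(counts.group(2))
    return active, capacity, gaps
```

A result such as one active task out of five, followed by a long gap, suggests the engine is waiting on a single task rather than starved of workers.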
Remco
#2 Posted : Tuesday, November 17, 2020 11:30:45 PM(UTC)
Rank: NCrunch Developer

Groups: Administrators
Joined: 4/16/2011(UTC)
Posts: 6,201

Thanks: 807 times
Was thanked: 1065 time(s) in 1012 post(s)
Hi, thanks for sharing this problem.

Quote:
[10:24:39.5298-Core-117] Checking for tasks to launch (1 out of 5 concurrent tasks are active)


This particular message indicates that there is a task hanging or running well over time. It might be possible to identify the specific task by increasing your log verbosity, but the easiest option may simply be to take out the hammer: wait for the hang, then open Task Manager and kill every NCrunch BuildHost or TestHost process running on the machine. When you've killed the task, the engine should in theory kick up an error and complete the run. You'll then have more detailed reporting to help identify which task is hanging.
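
On a build server without a desktop session, the same "kill every host process" step could be scripted. This is a rough sketch only: the exact process names are an assumption (the markers below simply look for names starting with `ncrunch` that contain `buildhost` or `testhost`), so check the real names in Task Manager first. It relies on the standard Windows `tasklist` and `taskkill` commands.

```python
import csv
import subprocess

# ASSUMPTION: name fragments for NCrunch host processes; verify on your machine.
HOST_MARKERS = ("buildhost", "testhost")


def list_processes():
    """Return (pid, image_name) pairs from Windows `tasklist` CSV output."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [(int(row[1]), row[0]) for row in csv.reader(out.splitlines()) if len(row) >= 2]


def select_hosts(processes, prefix="ncrunch"):
    """Keep only processes whose name looks like an NCrunch build/test host."""
    picked = []
    for pid, name in processes:
        lowered = name.lower()
        if lowered.startswith(prefix) and any(m in lowered for m in HOST_MARKERS):
            picked.append((pid, name))
    return picked


def kill_commands(hosts):
    """Build one `taskkill` command per host: force (/F), including children (/T)."""
    return [["taskkill", "/PID", str(pid), "/T", "/F"] for pid, _ in hosts]


def kill_all_hosts():
    """Windows only: find and kill every matching host process."""
    for command in kill_commands(select_hosts(list_processes())):
        subprocess.run(command, check=False)
```

Printing the result of `select_hosts(list_processes())` before calling `kill_all_hosts()` is a cheap way to confirm that only the intended processes match.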

The test timeout handling in NCrunch is pretty resilient these days, but it's still possible for test environments to circumvent it if they are particularly unstable. It's quite possible you have a test hanging in a manner that the engine can't interrupt with timeout enforcement.
ecargo
#3 Posted : Wednesday, November 18, 2020 1:31:25 AM(UTC)

Hi Remco,

I killed the child processes and inspected the timeline, which made it quite obvious which tests were holding up the engine. Do you have documentation on how NCrunch timeouts work so that I can try to figure out why they didn't take effect in this case?

Thanks,
Bart
Remco
#4 Posted : Wednesday, November 18, 2020 8:34:58 AM(UTC)
Hi Bart,

NCrunch enforces timeouts in two different ways:

1. ThreadAbortException: NCrunch hits a hanging test thread with ThreadAbortException, causing it to fail its execution and report a stack trace. This is not performed under .NET Core (which does not support ThreadAbortException).
2. Process Kill: If the above doesn't work within the expected timeframe or it isn't supported, the test process will send an IPC message to the NCrunch engine requesting that the engine kill the test process.

That's the theory at least. In practice, it's not so simple:

- Because of the way that async blocks are used heavily through the more modern versions of NUnit and xUnit, it can be difficult to effectively pin down all the threads that are involved in the execution of tests within the process. If the test framework is executing threads that we cannot effectively manage, a safe abort may be unknowable and impossible. The process can end up in an inconsistent or unstable state.

- At a system level, ThreadAbortException is dangerous. Threads can be interrupted doing all kinds of unexpected things. MS tried to move .NET away from ThreadAbortException for good reason. However, we still prefer it as this approach increases the likelihood that we can obtain useful information about why the test was timing out (e.g. stack trace, trace output).

- Due to the mechanics of integration with the test frameworks, there are race conditions that exist with timeout enforcement that are simply impossible for us to completely eliminate.

- Regardless of which of our two approaches is being used, the test process must have a certain level of structural stability to be able to enforce a timeout. For example, a process suffering from heap corruption may be unable to make the necessary allocations to be able to send a message to the hosting engine to trigger the kill command.

In the vast majority of cases, we can safely abort a test run and return a useful result indicating where the timeout happened. But this is not guaranteed. Test code is capable of calling into unmanaged APIs which could be doing any number of unknowable things where a ThreadAbortException or Process Kill could leave things in an inconsistent state. We do our best here but perfection is impossible. I recommend making sure you still have processes in place where you can recover your systems in the event of a failed timeout enforcement, and where possible try to write your code in a way that reduces the risk of timeouts/hanging in general.
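
As one possible shape for such a recovery process, a build script could wrap the whole console run in an outer watchdog that escalates from a polite stop request to a hard kill, loosely mirroring the two-stage idea described above. This is a generic sketch, not NCrunch's own mechanism: `run_with_watchdog` and its parameters are hypothetical, and the command to wrap is whatever already launches the run.

```python
import subprocess
import sys


def run_with_watchdog(command, timeout, grace=5.0):
    """Run `command` and return (outcome, exit_code).

    outcome is "completed", "terminated" (stopped after a stop request)
    or "killed" (had to be force-killed after the grace period).
    """
    proc = subprocess.Popen(command)
    try:
        return "completed", proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        pass

    # Stage 1: request a stop. On POSIX this sends SIGTERM, which a process
    # can ignore; on Windows it is already a hard TerminateProcess call.
    proc.terminate()
    try:
        return "terminated", proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        pass

    # Stage 2: force-kill and reap, so the caller is never left waiting.
    proc.kill()
    return "killed", proc.wait()


if __name__ == "__main__":
    # Demo with a stand-in child that would otherwise sleep for 30 seconds.
    child = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_watchdog(child, timeout=1.0, grace=2.0))
```

Note that this only stops the direct child. Any host processes it spawned may survive and need to be cleaned up separately.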