NCrunch does not honor the max timeout and runs tests forever
Remco
#21 Posted : Tuesday, October 10, 2017 5:49:13 AM(UTC)
Rank: NCrunch Developer

Groups: Administrators
Joined: 4/16/2011(UTC)
Posts: 7,177

Thanks: 968 times
Was thanked: 1298 time(s) in 1203 post(s)
Thanks for all the detailed information you've provided here, and for your effort in building the sample solution to reproduce the issue.

It's going to take a while to figure this one out, so I'm going to track this internally and work on it as time allows. I hope that the 3.12.0.3 build improves some of the performance for you in general. There are more performance improvements that I'd like to introduce in 3.12. I think we'll just have to see how much time allows.
Remco
#22 Posted : Wednesday, October 11, 2017 1:24:05 AM(UTC)
I've figured this one out. The lost thread is caused by an OutOfMemoryException. I'm not sure how it's possible that this exception takes out only a thread and not the whole process, but it's the reason for the hang.

I've found a performance issue in the NCrunch runtime code that allocates more objects than it's supposed to. This means that the runner is allocating around 5 times as much memory as it should be in this use case. After fixing this issue, the runner uses a bit over 2GB for the whole 4k test run, which is still too much memory for an x86 process but it looks like the allocations are now entirely intentional and cannot be easily eliminated. I'll get you a fixed build as soon as I can. I also recommend using an x64 test process for this project.
1 user thanked Remco for this useful post.
abelb on 10/20/2017(UTC)
abelb
#23 Posted : Friday, October 20, 2017 10:17:26 AM(UTC)
Rank: Advanced Member

Groups: Registered
Joined: 9/12/2014(UTC)
Posts: 155
Location: Netherlands

Thanks: 19 times
Was thanked: 11 time(s) in 11 post(s)
Quote:
I also recommend using an x64 test process for this project.

Yes, but as this thread has shown, I have tested with both x64 and x86 processes. Ideally, this project is tested on both x64 and x86 (some tests have specific bitness), but that can be covered on the build server. The thing is (was) that it was impossible to run them in either mode: x64 took too much memory (more than all of the 48GB available, starving other applications), and x86 hung. But if I understand you correctly, you have managed to resolve it! Great!

Quote:
the runner uses a bit over 2GB for the whole 4k test run, which is still too much memory for an x86 process but it looks like the allocations are now entirely intentional and cannot be easily eliminated

This raises an interesting question: if 2GB is needed for about 4000 tests, then even nearly empty tests require about 0.5MB each, which seems like a lot. Some overhead is to be expected, but this much seems out of the ordinary.
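As a quick check of that arithmetic, here is a minimal sketch in Python. The figures (roughly 2GB and roughly 4000 tests) come from the posts in this thread; the function name is only illustrative.

```python
def per_test_overhead_mb(total_gb: float, test_count: int) -> float:
    """Average memory per test in MB, treating 1GB as 1024MB."""
    return total_gb * 1024 / test_count


# Figures quoted in this thread: a bit over 2GB for roughly 4000 tests.
print(round(per_test_overhead_mb(2, 4000), 2))  # 0.51
```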

I have not seen this behavior when the tests are spread over multiple classes. In fact, over the years, I have used your product successfully with projects that have 15k or more tests, without running into this issue. It's a bit simplistic to jump to conclusions based on these observations, but it seems to me that when a large set of tests is inside a single class (static or not), the overhead becomes disproportionately bigger. The obvious fix on my side would be to split the tests over multiple classes so that this problem doesn't occur in practice.

Do you have a new build that I can try, or should I use the one linked earlier? (I'm back from holiday and eager to test.)
Remco
#24 Posted : Friday, October 20, 2017 12:23:26 PM(UTC)
The 0.5MB per test in this case comes from the code coverage maps, which at the moment seem to be allocated regardless of whether assemblies are instrumented. These are basically chunks of memory that need to exist on a one-to-one basis with each test in the execution run. As the amount of code under test increases, so does the size of the coverage map. Because each map is allocated as a contiguous chunk, it seems to generally land on the large object heap, where the memory isn't managed quite as efficiently.
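To make the memory picture concrete, here is a rough sketch in Python of the situation described above: one contiguous coverage map per test. The 85,000-byte figure is the documented default large object heap threshold in .NET; the map size is just the 0.5MB estimate from this thread, and the function names are made up for illustration. This is not NCrunch's actual code.

```python
LOH_THRESHOLD_BYTES = 85_000  # default .NET large object heap threshold


def lands_on_loh(map_size_bytes: int) -> bool:
    """True if one contiguous allocation of this size goes to the large object heap."""
    return map_size_bytes >= LOH_THRESHOLD_BYTES


def total_map_memory_mb(map_size_bytes: int, test_count: int) -> float:
    """Total memory in MB if every test in the run gets its own coverage map."""
    return map_size_bytes * test_count / (1024 * 1024)


map_size = 512 * 1024  # assumption: roughly 0.5MB per map, as estimated in this thread
print(lands_on_loh(map_size))               # True
print(total_map_memory_mb(map_size, 4000))  # 2000.0
```

With these assumed numbers, every map is well above the threshold, and 4000 of them add up to about 2GB, which is in line with the figure reported for the fixed runner.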

I have some ideas for how the maps might be compressed down during the execution run so that they take less space. This will require a bit more looking into.

Sorry, I had hoped to have a build ready for you earlier, but there have been a few destabilising elements over the last week. I'll try to get you one as soon as I can.
abelb
#25 Posted : Friday, October 20, 2017 1:06:15 PM(UTC)
Thanks for the quick update. I understand this can take time.

So I tried installing the MSI for 2017 that you linked earlier (http://downloads.ncrunch.net/NCrunch_VS2017_3.12.0.3.msi), but it gives me an "another installation is already in progress" error. Possibly this is caused by me having three instances of VS2017 installed (Community, Enterprise, Ent-Preview). I will attempt the manual installation, though the message is surprising: in earlier cases, it just installed into the first-installed non-nicknamed VS version (in my case, Preview).
Remco
#26 Posted : Friday, October 20, 2017 10:36:44 PM(UTC)
abelb;11386 wrote:

So I tried installing the MSI for 2017 that you linked earlier (http://downloads.ncrunch.net/NCrunch_VS2017_3.12.0.3.msi), but it gives me an "another installation is already in progress" error. Possibly this is caused by me having three instances of VS2017 installed (Community, Enterprise, Ent-Preview). I will attempt the manual installation, though the message is surprising: in earlier cases, it just installed into the first-installed non-nicknamed VS version (in my case, Preview).


That is an interesting issue. This is actually all coming from Windows Installer. Your OS has probably got its state in a bind somehow. A reboot will usually fix something like this.
abelb
#27 Posted : Saturday, October 21, 2017 5:24:42 PM(UTC)
While this was after a restart, it was also after an update to VS 2017 Preview 15.5 P1. After another restart, everything went fine again, so it was probably a one-time issue.
Remco
#28 Posted : Sunday, October 22, 2017 8:51:48 AM(UTC)
abelb
#29 Posted : Monday, October 23, 2017 4:15:03 PM(UTC)
I've tried installing using the MSI, but that gave me the following error after starting VS 2017 again (in a messagebox):

Code:
---------------------------
Microsoft Visual Studio
---------------------------
The 'nCrunch.VSIntegration2010.CrunchPackage,
nCrunch.VSIntegration2017, Version=3.11.0.9, Culture=neutral,
PublicKeyToken=01d101bf6f3e0aea' package did not load correctly.

The problem may have been caused by a configuration change or by
the installation of another extension. You can get more information by
examining the file 
'C:\Users\XXX\AppData\Roaming\Microsoft\VisualStudio\15.0_5d0bcece\ActivityLog.xml'.

Restarting Visual Studio could help resolve this issue.

Continue to show this error message?
---------------------------
Yes   No   
---------------------------


After the same error appeared several times, I clicked "No" and it disappeared, but then in the NCrunch Tests window, the following appeared:

Code:

An exception was encountered while constructing the content of this frame.  
This information is also logged in 
"C:\Users\Abel\AppData\Roaming\Microsoft\VisualStudio\15.0_5d0bcece\ActivityLog.xml".

Exception details:
System.Runtime.InteropServices.COMException (0x80004005): Error HRESULT E_FAIL has been returned from a call to a COM component.
   at Microsoft.VisualStudio.Shell.Interop.IVsShell5.LoadPackageWithContext(Guid& packageGuid, Int32 reason, Guid& context)
   at Microsoft.VisualStudio.Platform.WindowManagement.WindowFrame.GetPackage()
   at Microsoft.VisualStudio.Platform.WindowManagement.WindowFrame.ConstructContent()


My guess? The installer did not properly uninstall the previous version (the version shows 3.11, not 3.12). Or it's another issue with the side-by-side nature of VS. However, this was with the default instance, which I believe is the one the installer should target.

I'll continue to try to install manually, but wanted you to know about this nonetheless, as it may be a bug in your upcoming release.
abelb
#30 Posted : Monday, October 23, 2017 8:30:03 PM(UTC)
It turned out that the MSI installer removed the installation properly, but did not install anything in the unnamed default VS installation. It seems to me that there are quite a few confusing scenarios possible with the new side-by-side possibilities of Visual Studio 2017.

Manually installing worked, though.

Thanks very much for the updated NCrunch version. It makes a huge difference! Memory seems down by a factor of 20-40 or so for x64 and even more for x86, which doesn't get above 350MB when tests are run in a single runner process (using the same project I sent you).

They also run much faster. Here are some of my findings, based on quotes from my previous posts in this thread:

  • "that tests run on average between 30ms and 1500ms." No longer happens, avg is below 5ms
  • "often freeze NCrunch, with (very) high CPU load" No longer happens
  • "the batches are now consistently below 500 tests, and memory is stable and typically below 1GB per process." Not true anymore, see screenshot, but memory usage is much lower
  • "it was 4000+ tests in one batch that it takes about 15GB per process." No longer true, even a batch with 15K tests, with instrumentation, stays around 1GB in one process, 5-fold lower without instrumentation
  • "If I'm right that means that the overhead per test is about 3-4MB" Overhead is still significant, but doable on high-mem machines
  • "In x86 the batch runs don't finish" Solved!!!
  • "each subsequent time more tests are combined in each batch in the processing queue." Still the case, and by design, but way too eager
  • "It seems that one single process does all the work after some runs." As previous, still true
  • "testrunner exceeds 18GB at some point (!!!). On subsequent runs it can go up to 35GB" Solved
  • "It is extremely slow, roughly 10x as slow in wall-clock time" It's still comparatively slow, but many times faster than before, first run (dry run) still takes 1-2 minutes
  • "It is faster with Visual Studio 2015. Seconds vs minutes (x86 mode only)" I didn't compare, but I think it is faster on VS2017 now
  • "I tested with RTM builds of Visual Studio Enterprise" I tested 2017, version 15.5 Preview 1, ran fine
  • "The test-result seems to actually be parsed by NCrunch, but not displayed" NCrunch's GUI now reports all tests and icons seem stable


All in all, I think that is a very good improvement, thanks!

I noticed that after several runs, the memory usage increased and stood at around 3.5GB for the highest test runner. Not sure what happened there.

The one issue that still remains is that the algorithm for putting tests together does not take into account that there are multiple CPUs and multiple processes available. Even when I assign 12 cores to NCrunch, it will (after the initial runs) run all tests on a single runner process, even when FastLane processes is set to 2 or more.

It seems to make sense that NCrunch could somehow utilize all available processes. I.e., if it finds that 12,000 tests are run in P1 in the current scenario, and P2 and P3 have none assigned or have finished, it should split the assigned tests, i.e. 4,000 to P1, 4,000 to P2 and 4,000 to P3. Since their metric (they run in under 10ms per test) is roughly the same, this is a beneficial improvement. As the screenshot shows, this has a significant impact: in this scenario, the speed-up would be between 5x and 15x (depending on how many processes I assign).
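The even split suggested above could look something like the following minimal Python sketch. It only illustrates the idea of dividing one large batch over the available processes; the function name and the test names are hypothetical, and this is not how NCrunch builds its tasks.

```python
def split_evenly(tests: list, process_count: int) -> list:
    """Split tests into process_count batches whose sizes differ by at most one."""
    base, extra = divmod(len(tests), process_count)
    batches, start = [], 0
    for i in range(process_count):
        # The first `extra` batches take one additional test each.
        size = base + (1 if i < extra else 0)
        batches.append(tests[start:start + size])
        start += size
    return batches


tests = [f"test_{i}" for i in range(12_000)]  # hypothetical test names
print([len(batch) for batch in split_evenly(tests, 3)])  # [4000, 4000, 4000]
```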

See below, it clearly shows that the assessment of NCrunch is too eager after the 2nd run. Surprisingly, it sometimes flips between using 1 or 2 processes, but it should "know" that the 2nd run was much more optimized, as it ran much faster (wall clock time, not accumulated time):

abelb
#31 Posted : Monday, October 23, 2017 8:49:02 PM(UTC)
Quote:
I noticed that after several runs, the memory usage increased and stood at around 3.5GB for the highest test runner. Not sure what happened there.

Actually, I can see this happening now with the FSharp tests (a set of 4196 tests). 1st run: slow, but no issue. 2nd run: fast, split over 4-6 processes, max 1.2GB per process, 6 seconds in total. 3rd run: slow again, takes 3.5GB in a single process and runs for 15 seconds.

So, while the overall memory consumption has slimmed down from 18GB to 3.5GB, that is still a lot for a single process. As mentioned above, improving the spread-over-processes algorithm would benefit this scenario both memory-wise and in wall-clock run time.

Is it possible that NCrunch uses the accumulated CPU time of the test batches, and not the wall-clock time of the overall run? That would explain why NCrunch "thinks" it is faster to run the tests in a single process (15 seconds) than in 6 processes (wall clock 5 seconds, but accumulated time 20+ seconds).
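The two ways of measuring can be put side by side in a short Python sketch. The 15-second single-process run and the 5 seconds wall clock / 20 seconds accumulated for 6 processes are the figures from this post; the per-process breakdown is invented purely so that it adds up to those numbers, and nothing here is known about what NCrunch actually measures.

```python
def accumulated_seconds(per_process_seconds: list) -> float:
    """Total time summed over all runner processes."""
    return sum(per_process_seconds)


def wall_clock_seconds(per_process_seconds: list) -> float:
    """Processes run in parallel, so the run ends when the slowest one finishes."""
    return max(per_process_seconds)


single_process = [15.0]
six_processes = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0]  # hypothetical split, sums to 20s

# Judged by accumulated time, the single process looks better (15s vs 20s).
print(accumulated_seconds(single_process), accumulated_seconds(six_processes))
# Judged by wall-clock time, six processes are three times faster (15s vs 5s).
print(wall_clock_seconds(single_process), wall_clock_seconds(six_processes))
```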
Remco
#32 Posted : Tuesday, October 24, 2017 12:00:19 AM(UTC)
Wow, thanks for the detailed analysis of the new build! It's awesome to see those improvements. I admit I was quite optimistic after making these changes.

The eagerness of the engine to bunch too many tests into a single task is related to how the engine tracks the execution time of each test. Right now, it only records the physical test execution time, and assigns this to each test. This means that the component responsible for allocating tests into tasks has very limited information on execution times; it doesn't know anything about the overhead involved in integration with the test framework or the logic around managing the test run. This means that it 'sees' several thousand tests with a close to zero execution time, and it concludes it can safely drop them into the same batch with no extra cost.

The nature of this project is such that the overhead of managing the test run is unusually high, so almost all the wall clock time is being spent managing the run rather than actually executing the tests. Thus the allocation of tests into tasks becomes 'stupid' and ends up underutilising the engine.

Ideally, I'd like to have the engine consider the overhead of managing the test run and factor this into the building of execution tasks. This will require some work though, so it seems to be one of those backlog items that sits on the horizon. Maybe there will be time to implement it if the platform level changes slow down a bit.
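The gap between the two views can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not the engine's code: the function names, the recorded time per test, and the per-test overhead are all hypothetical values chosen so that the total lands near the 15-second single-process run reported earlier in the thread.

```python
def estimated_task_seconds(recorded_test_seconds: list) -> float:
    """What the allocator sees today: only the recorded test execution times."""
    return sum(recorded_test_seconds)


def overhead_aware_task_seconds(recorded_test_seconds: list,
                                overhead_per_test_seconds: float) -> float:
    """An estimate that also charges a fixed run-management overhead per test."""
    overhead = overhead_per_test_seconds * len(recorded_test_seconds)
    return sum(recorded_test_seconds) + overhead


recorded = [0.0001] * 4196  # hypothetical: near-zero recorded time for each test
print(estimated_task_seconds(recorded))               # well under a second
print(overhead_aware_task_seconds(recorded, 0.0035))  # roughly 15 seconds
```

With an estimate of well under a second, putting everything into one task looks free; with the overhead included, splitting the batch becomes the obvious choice.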
abelb
#33 Posted : Tuesday, October 24, 2017 12:06:55 PM(UTC)
Such an extension to the algorithm would be great, but, I'd gather, also rather complex.

A much simpler improvement that would already reap many benefits would be something along the following lines:

  • Take X to be the set of all fast tests (i.e., those eligible for fast-lane processing) to be run on P0, and Y1, Y2, etc. the sets of (slower) tests to be run on processors P1, P2, etc.
  • Take PX to be the set of processors that are idle
  • Divide the number of tests in X by 1000, let this be C
    • If C > 1 but C < the number of processors in PX, then divide X over the processors in PX to create X1, X2, etc.
    • If C < 1, do nothing
    • If C > 1 and C > the number of processors in PX, then divide X into sets of 1000 tests, to be run on P1, P2, etc. when it's their turn

  • Now you have a simple enough algorithm that will run at most 1000 tests in a single batch, unless there are not enough P's available
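The steps above can be sketched in Python as follows. This is only one reading of the proposal: the function names are made up, and the boundary cases the list leaves open (C exactly 1, or C exactly equal to the number of idle processors) are resolved here by assumption, as noted in the comments.

```python
BATCH_LIMIT = 1000  # the batch size used in the proposal above


def split_evenly(tests: list, parts: int) -> list:
    """Split tests into `parts` batches of near-equal size, dropping empty ones."""
    base, extra = divmod(len(tests), parts)
    batches, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        batches.append(tests[start:start + size])
        start += size
    return [batch for batch in batches if batch]


def split_fast_tests(fast_tests: list, idle_processors: int) -> list:
    """Divide the fast-lane set X over idle processors, following the steps above."""
    c = len(fast_tests) / BATCH_LIMIT
    if c <= 1:
        # Assumption: C == 1 is treated like C < 1, so the batch is left alone.
        return [fast_tests]
    if c < idle_processors:
        # Enough idle processors: one near-equal batch per idle processor.
        return split_evenly(fast_tests, idle_processors)
    # Assumption: C == idle_processors falls in this branch. Batches of at most
    # 1000 tests, to be picked up by processors as they become free.
    return [fast_tests[i:i + BATCH_LIMIT]
            for i in range(0, len(fast_tests), BATCH_LIMIT)]


print([len(b) for b in split_fast_tests(list(range(4196)), 6)])
# [700, 700, 699, 699, 699, 699]
print(len(split_fast_tests(list(range(15_000)), 6)))  # 15
```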


Simpler yet is to create a max threshold, let's say a variable MAX_FASTLANE_TESTS_IN_SINGLE_BATCH. This has some other benefits as well:

  • Users can tweak it per project
  • (probably) simpler to implement and non-disruptive for any existing project out there
  • Memory-hungry but fast tests can now be chopped up
  • Projects with many fast tests can now split them over multiple processes (one of your main selling points to begin with)
  • Little testing required on your end, as users are in charge of tweaking
  • This can still play well when you later make this algorithmic (i.e., leave it empty for algorithmic, just like other settings that can be either automatic or set by hand)
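A minimal Python sketch of such a threshold is shown below. MAX_FASTLANE_TESTS_IN_SINGLE_BATCH is the name proposed in this post, not an existing NCrunch setting, and the function name and the use of None for "leave it to the engine" are assumptions made for illustration.

```python
from typing import Optional

MAX_FASTLANE_TESTS_IN_SINGLE_BATCH: Optional[int] = 1000  # proposed setting, hypothetical


def apply_batch_cap(batch: list, cap: Optional[int]) -> list:
    """Chop one fast-lane batch into pieces of at most `cap` tests.

    A cap of None stands for the "empty" setting: the batch is left untouched
    so that the engine can decide algorithmically.
    """
    if cap is None:
        return [batch]
    if cap < 1:
        raise ValueError("cap must be a positive number of tests")
    return [batch[i:i + cap] for i in range(0, len(batch), cap)]


batch = list(range(4196))  # the size of the F# test set mentioned in this thread
pieces = apply_batch_cap(batch, MAX_FASTLANE_TESTS_IN_SINGLE_BATCH)
print([len(p) for p in pieces])  # [1000, 1000, 1000, 1000, 196]
```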


For performance, as you can see in the screenshot above, the benefits are huge: 5x to 10x better running time. I think this is excellent marketing for your product. And in my case, if I can use 16 of my processor cores, perhaps I can stretch this performance gain even further.

BTW: I doubt this would be a fringe setting; it may instead be a highly useful feature on any project with more than 1000 tests. I've seen many projects, and 1000 tests is easily reached; quite a few have 20k or more tests (my own Exselt project has 21k fast tests and 15k slower tests, F# you already know about, many public test suites for W3 standards are 20k+, and quite a few GitHub projects have 10k+ tests).

With such additions, large-scale testing becomes a reality from within VS, where I used to let the build server do it, or (for Exselt) split the tests over multiple solutions (though that didn't solve the performance problem; very likely your fixes have benefited my other projects as well).
abelb
#34 Posted : Friday, November 3, 2017 5:29:02 PM(UTC)
This seems to have solved at least some of the problems mentioned earlier in this thread: http://forum.ncrunch.net...VS-2017-Preview-4.aspx. Thanks!
1 user thanked abelb for this useful post.
Remco on 11/3/2017(UTC)