NCRUNCH 3.28 - High memory usage, frequent System.OutOfMemoryExceptions
LawrenceShort
#1 Posted : Monday, November 18, 2019 4:57:51 PM(UTC)
Rank: Newbie

Groups: Registered
Joined: 7/9/2019(UTC)
Posts: 9
Location: United Kingdom

Was thanked: 1 time(s) in 1 post(s)
Hi,

We're using NCrunch as part of our development and CI process. We're experiencing a high number of System.OutOfMemoryExceptions and quite considerable performance degradation.

The NCrunch version we're using is 3.28.

When running, NCrunch climbs to the upper limit of our memory and stays there. This applies both to our development machines, which have 16GB of RAM, and to our CI servers and agents, which have 32GB of RAM.

Currently we've tried...
* Limiting Build Process Memory.
* Limiting Test Process Memory.
* Upgrading a developer machine and grid node to v4.1 to see if that would solve the problem.

I've identified that we have a large amount of Trace output which we're about to try limiting as this was mentioned in other older posts as a potential memory drain.

Other older posts also note that there can be a problem with large solutions with high coverage. The cache file that is generated when the solution is analysed is currently 434MB. I'm not sure if this is considered large, but it feels like it may be.

We're currently at 28,821 test cases, which I feel should be manageable, but they are almost all end-to-end style tests. They test from the controller all the way to an in-memory data store.
We use DI for injecting our objects and only mock the areas of the application that would end up leaving our code.

Another thing that may be affecting performance is test naming. Each test name is written with multiple lines of text as "Given, When, Then".

We've been using NCrunch for at least the last 5 years and this has only recently become a problem. Anecdotally, the problems began when we reached 17,000+ tests, but I cannot confirm this, as we also upgraded from v2.19 of NCrunch earlier this year and I am unsure whether that upgrade started this particular issue.

Are there any suggestions you could provide with regards to this? The problems we are experiencing in the CI are currently hampering our ability to deploy code to our test environments.

Best regards

Lawrence


LawrenceShort
#2 Posted : Monday, November 18, 2019 5:00:46 PM(UTC)
Another side note: our tests tend to run at about 1.5 sec per test. I believe this is because we're running the whole stack and our DI container (Castle Windsor) scans the assemblies for each test instance to load all the objects it can find that have an interface. Could this also be part of the issue?
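For a sense of scale, the figures quoted so far can be turned into a rough run-time estimate. This is only a sketch: the test count and per-test time come from the posts above, while the thread counts and the assumption of perfect parallel scaling are illustrative and not something NCrunch guarantees.

```python
# Back-of-the-envelope run time for the whole suite.
# Figures come from the posts above; the thread counts are assumptions.

TEST_COUNT = 28_821        # test cases reported in the first post
SECONDS_PER_TEST = 1.5     # approximate per-test time reported above


def suite_hours(tests: int, seconds_per_test: float, threads: int = 1) -> float:
    """Wall-clock hours, assuming perfect scaling across threads."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return tests * seconds_per_test / threads / 3600


if __name__ == "__main__":
    for threads in (1, 4, 8):
        hours = suite_hours(TEST_COUNT, SECONDS_PER_TEST, threads)
        print(f"{threads} thread(s): {hours:.1f} h")
```

On a single thread this comes to roughly 12 hours for a full run, so any fixed per-test cost such as container start-up is multiplied many thousands of times.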
LawrenceShort
#3 Posted : Monday, November 18, 2019 5:07:47 PM(UTC)
Something else I've noticed is that restarting a developer machine tends to start NCrunch off from the beginning again, even if it is set to run impacted tests only.
Remco
#4 Posted : Tuesday, November 19, 2019 1:20:28 AM(UTC)
Rank: NCrunch Developer

Groups: Administrators
Joined: 4/16/2011(UTC)
Posts: 5,869

Thanks: 763 times
Was thanked: 980 time(s) in 933 post(s)
Thanks for sharing this problem.

Can you confirm where NCrunch is consuming the bulk of the memory? Is this inside the nCrunch.EngineHost process, or is it the combined result of many nCrunch.TestHost/nCrunch.BuildHost processes that are each consuming memory?

Can you share any further details on your current setup? How large is your solution in terms of lines of code and how many processing threads are you using?
LawrenceShort
#5 Posted : Tuesday, November 19, 2019 9:19:24 AM(UTC)
Hi Remco,

The bulk of the memory consumption is in the nCrunch.EngineHost462.x64.exe process. The BuildHost and TestHost processes do not appear to be causing an issue.
Another, less common, exception is a Map.FileOfView error

Solution details

* .NET Framework 4.5
* NUnit 2.6.4
* Number of projects = 22

Visual Studio analysis metrics

* 879044 lines of code, some auto-gen dbml
* Class coupling average = 6.3
* Depth of inheritance average = 1.7
* Cyclomatic Complexity average = 6.7
* Maintainability index average = 83.2

NCrunch metrics

* Code coverage = 73.35%
* Compiled Lines = 194,654
* Covered Lines = 142,770
* Uncovered Lines = 51,870
* Code Lines = 655,822


====================

Our developer (non-CI) setup is...
Development software

* NCrunch client is 3.28.0.6
* Visual Studio 2017. We were trialling Visual Studio 2019, but memory issues were making it unresponsive.
* ReSharper Ultimate 2019.2.2 is the other performance-heavy extension in use.

NCrunch setup

* Half of the cores dedicated to NCrunch (4 to 8 depending on machine age; all are i7s, some 4 series and others 7 series)
* Allow parallel test execution, true
* Workspace base path outside of git repo
* Cache storage outside of git repo
* Max number of processing threads = number of cores dedicated to NCrunch
* Pipeline optimisation priority = responsiveness
* Test process memory limit - 500000000

Dev machine

* i7 4 or 7 series, 8 or 16 cores
* 16 GB ram
* SSD

Grid setup

* 4 Grid node servers
* NCrunch Grid Node software 3.28
* Each node 8 core i7
* Each node 16 GB ram
* Each node on SSD

==============================

Our CI setup


Build setup

* TeamCity
* MSBuild
* NCrunch console tool 3.28
* Allow parallel test execution, true
* Workspace base path outside of git repo
* Cache storage outside of git repo
* Max number of processing threads = number of cores

Agent machine

* Azure vms
* 8 core
* 32 GB ram
* SSD (Azure equivalent)

Grid setup

* 3 Grid node servers (1 on each agent)
* NCrunch Grid Node software 3.28
LawrenceShort
#6 Posted : Tuesday, November 19, 2019 9:26:07 AM(UTC)
Unfortunately removing the trace output hasn't resolved the issue.
Remco
#7 Posted : Wednesday, November 20, 2019 3:52:23 AM(UTC)
Thanks for sharing all this detailed information. Given everything you've provided, I think I can firmly conclude that the problem here is caused by the size of NCrunch's in-memory code coverage database, which is held inside the engine host process.

The coverage database is the most complex area of the product, and has been revised several times since V1. NCrunch needs to store a vast amount of code coverage data and be able to rapidly merge this in real-time from many background test runs. For this system to work, we're right now allocating about 60 bytes of memory for every covered line of code for each test. So if you have 28821 tests and each one covers 1% of a 655822 line codebase, we'd be looking at 28821*655822*0.01 = 189,014,459 data points or about 11.3GB of memory.
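The arithmetic above can be reproduced with a short sketch. The 60 bytes per data point, the test count and the line count are the figures quoted in this thread; the 1% of lines covered per test is the hypothetical used in the post, and the other fractions are added here purely for comparison. The function and constant names are illustrative and are not part of NCrunch.

```python
# Rough estimate of the in-memory coverage database size, following the
# arithmetic in the post above. All names are illustrative, not NCrunch APIs.

BYTES_PER_DATA_POINT = 60      # figure quoted in the post
TEST_COUNT = 28_821            # from the original question
CODE_LINES = 655_822           # "Code Lines" metric reported earlier


def coverage_data_points(tests: int, lines: int, covered_fraction: float) -> int:
    """Number of (test, covered line) pairs the database has to hold."""
    return round(tests * lines * covered_fraction)


def coverage_bytes(tests: int, lines: int, covered_fraction: float) -> int:
    """Estimated bytes needed to hold those pairs."""
    points = coverage_data_points(tests, lines, covered_fraction)
    return points * BYTES_PER_DATA_POINT


if __name__ == "__main__":
    # 1% is the hypothetical from the post; 0.5% and 2% are assumptions
    # added to show how sensitive the total is to coverage density.
    for fraction in (0.005, 0.01, 0.02):
        size = coverage_bytes(TEST_COUNT, CODE_LINES, fraction)
        print(f"{fraction:.1%} of lines per test: {size / 1e9:.1f} GB")
```

Because the estimate scales linearly with both the number of tests and the fraction of lines each test covers, halving either one halves the estimated footprint.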

So I guess it adds up. With the current internal design of NCrunch and 28k tests with relatively high coverage density over a 656k codebase, 16GB of memory simply won't be enough to hold all coverage data in a fully indexed form and still have space for other essentials (such as VS itself, other metadata derived from the tests, background execution processes, etc). It may be possible to stretch this limit through additional paging/swapping to disk, though if this is possible you'll end up paying a price for it in performance.

We're always looking to improve the performance of NCrunch's coverage system, but there's no low hanging fruit left in this area. To further optimise we'll need to do some significant redesign, which would involve considerable time and risk. Assuming we find a way to push this further, it's highly unlikely we'll be able to do so within the next year.

Given the above, I can only suggest the following options:
- Increase the available memory on your dev workstations and CI server. Considering the cost of developer time and the availability of hardware, this is the option I would most recommend.
- Consider consolidating your tests. 28k tests is a very high number for the size of your codebase, so my assumption is that many of these tests are multi-dimensional/generated test cases (TestCase, TestCaseSource) or otherwise exist through inheritance structures. There may be opportunities to reduce the total number of tests by sacrificing some level of granularity in reporting their results.
- A less appealing but totally possible option may be to take out coverage tracking over some parts of your solution using NCrunch's code coverage suppression comments or by turning off the 'Instrument Output Assembly' setting. The less coverage is tracked, the smaller the in-memory database will be.
- Breaking down your overall solution into sub-solutions could reduce the amount of code required to be indexed and will free up available memory. I don't expect it will be easy to do this and you'll probably take a productivity hit, but it might still be an option.

My experience suggests that the dimensions of your solution are at a critical point where your existing hardware will likely be suffering considerably with many tools in the .NET space. VS2019 alone requires about 4 logical cores to run on even a moderately sized solution without any serious performance hit, and adding ReSharper to this would leave very little room to move. I don't think you'd regret a hardware upgrade.

Updating NCrunch to V4 will improve performance across the board, but will not reduce the memory consumption of the code coverage database.
LawrenceShort
#8 Posted : Wednesday, November 20, 2019 9:44:15 AM(UTC)
Hi Remco,

That's where I was heading with my assumptions. I've been looking at the CI build, as this is the area that is really hampering our process, and have found that if I turn off the InstrumentOutputAssembly setting then the console tool's memory usage drops considerably. This in turn leaves room for the TestRunner processes to grow in memory utilisation.

We're still receiving OutOfMemoryExceptions when reaching around 14,000 test runs, but other areas of the test process have definitely become faster: it was taking around an hour to finish reporting the number of ignored tests and now it takes a few minutes.

I've noted that the test processes are spawning as x86 processes and believe that this may be limiting the size that the Test Runners can grow to. The OutOfMemoryExceptions now seem to surface when the Test Runner processes reach around 1.4GB in size. I believe that we have a memory leak in the tests which is causing this growth.

My next step is to change the setup of the CI builds to use x64 processes. If the OutOfMemoryExceptions continue to surface, I shall change the memory limit for the Test Processes.
Remco
#9 Posted : Wednesday, November 20, 2019 9:48:54 AM(UTC)
This is good thinking. Disabling this setting for your CI build will improve the overall speed of the test run and greatly reduce memory consumption. You'll lose some smarter prioritisation through impact detection and you won't be able to track the code coverage in the CI reports, but usually this is a small price to pay for having a build that actually finishes executing.

Setting your test process memory limit to a much smaller value should paper over the test memory leak inside your CI system.
LawrenceShort
#10 Posted : Wednesday, November 20, 2019 9:55:28 AM(UTC)
Regarding the problem of NCrunch restarting all tests from the beginning every time we restart a machine: could this be caused by not having enough memory available to load the code coverage data?
Remco
#11 Posted : Wednesday, November 20, 2019 10:00:35 AM(UTC)
I strongly suspect it might. Once OutOfMemoryExceptions start appearing, stability just goes out the window. There is a vast amount of allocation that happens when the .cache file is loaded from disk, and any failure during this process would result in its data being discarded. This would mean NCrunch rediscovering the tests and starting from scratch.
LawrenceShort
#12 Posted : Wednesday, November 20, 2019 12:48:04 PM(UTC)
Great, thanks for the help so far.
LawrenceShort
#13 Posted : Wednesday, November 20, 2019 6:27:10 PM(UTC)
After changing the Test Process architecture to x64, we no longer get the OutOfMemoryExceptions, which is great.
We're going to have to look at what we can do with the developer machines to make them work properly with NCrunch. However, we seem to have surmounted the issue in the CI.

Thank you for your help.