Need to manually open/close Grid Node configuration on remote server almost daily
devdept
#1 Posted : Friday, May 10, 2019 8:20:06 AM(UTC)
Rank: Advanced Member

Groups: Registered
Joined: 10/13/2012(UTC)
Posts: 53
Location: Italy

Was thanked: 2 time(s) in 2 post(s)
Hi Remco,

We have purchased a dual Xeon 12-core server, a machine with 48 logical CPUs in total, dedicated to NCrunch. The sad news is that we need to manually open and close the Grid Node configuration tool on the remote server almost daily, because after the node has been in use for some time it becomes impossible to connect to it. We use it successfully for a day or two, then when we go back to unit testing we find it missing. The same is true for a different PC that we also use as a server.

We own 8 NCrunch licenses.

Do you know why this happens, or what we need to check to resolve it?

Thanks,

Alberto
Remco
#2 Posted : Friday, May 10, 2019 8:29:00 AM(UTC)
Rank: NCrunch Developer

Groups: Administrators
Joined: 4/16/2011(UTC)
Posts: 6,976

Thanks: 930 times
Was thanked: 1257 time(s) in 1170 post(s)
Hi Alberto,

The only thing I can think of that the grid node configuration tool might be doing here is restarting the grid node service, so restarting the service yourself would probably have the same effect.

Do you experience the same thing if you run the grid node using NCrunch.GridNode.Console.exe instead of the service process?
devdept
#3 Posted : Friday, May 10, 2019 8:39:01 AM(UTC)
Hi Remco,

Do you mean that the next time I run into this problem, instead of opening Start/Programs/NCrunch Grid Node Configuration and closing it, I need to run the NCrunch.GridNode.Console.exe console program from the "C:\Program Files (x86)\Remco Software\NCrunch Grid Node Server" folder?

Thanks,

Alberto
talbrecht
#6 Posted : Friday, May 10, 2019 2:23:28 PM(UTC)
Rank: Member

Groups: Registered
Joined: 5/10/2019(UTC)
Posts: 11
Location: Germany

Thanks: 6 times
Was thanked: 3 time(s) in 3 post(s)
Hi Remco, hi Alberto.

We have had the same or at least a similar problem with our grid nodes. Since we disabled NIC power management on the nodes, NCrunch has been running fine so far.

Have a nice weekend!

-Thomas
Remco
#4 Posted : Friday, May 10, 2019 10:32:03 PM(UTC)
devdept;13500 wrote:

Do you mean that the next time I run into this problem, instead of opening Start/Programs/NCrunch Grid Node Configuration and closing it, I need to run the NCrunch.GridNode.Console.exe console program from the "C:\Program Files (x86)\Remco Software\NCrunch Grid Node Server" folder?


By default, the grid node server is hosted as a service under Windows.

It doesn't have to be this way. You can also run it as a console application. This will give more visibility into its workings and also give you greater control over it.

Nonetheless, Thomas' suggestion is a really good one. NIC power management could interfere with the listening ports opened by the server. Make sure you have it disabled.
devdept
#7 Posted : Monday, May 13, 2019 1:16:41 PM(UTC)
Thanks Thomas,

I will try this setting the next time the NCrunch server is stopped.

Alberto
devdept
#5 Posted : Wednesday, September 4, 2019 10:13:01 AM(UTC)
Remco;13506 wrote:
devdept;13500 wrote:

Do you mean that the next time I run into this problem, instead of opening Start/Programs/NCrunch Grid Node Configuration and closing it, I need to run the NCrunch.GridNode.Console.exe console program from the "C:\Program Files (x86)\Remco Software\NCrunch Grid Node Server" folder?


By default, the grid node server is hosted as a service under Windows.

It doesn't have to be this way. You can also run it as a console application. This will give more visibility into its workings and also give you greater control over it.

Nonetheless, Thomas' suggestion is a really good one. NIC power management could interfere with the listening ports opened by the server. Make sure you have it disabled.


Hi Remco,

We are tired of stressing our local machines because the NCrunch remote service (on a 48-CPU Windows Server 2019 machine) needs to be restarted 8 times a day. I tried running the console application, NCrunch.GridNode.Console.exe, and got:

Now initialising NCrunch Grid Node using console host. Version is 3.30.0.1
[12:07:16.2914-?-1] Node server started - listening on port 41141
Grid node successfully started and is listening for connections on port 41141

However, the local NCrunch instance is still unable to connect. What else should we do to make it work?

Thanks,

Alberto
Remco
#8 Posted : Wednesday, September 4, 2019 11:40:50 AM(UTC)
Hi Alberto,

My bet would be that there is a firewall blocking the connection to the NCrunch.GridNode.Console.exe process. Since this process isn't running as a background service application, it's likely to be running under a different user profile and using a different .EXE. Check Windows Firewall and any other security software you have installed.

If you're not certain whether this is a network issue, it can be worth trying to connect to the remote port using a telnet session to see if the server responds in any way. If the network has no problems, then the grid node server should be able to accept a TCP connection (though from there it will likely just dump unrecognisable data or close/hang the connection if the client is not the NCrunch one).
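Where a telnet client isn't available, the same reachability check can be sketched in a few lines of Python. This is only an illustration of the telnet test described above, not an NCrunch tool: the function name `can_connect` is made up for this example, and the script only checks that the TCP handshake completes, without speaking the NCrunch protocol.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it performs the same basic
    check as opening a telnet session to the port.
    """
    try:
        # create_connection completes the TCP handshake; the connection
        # is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `can_connect("<grid node address>", 41141)` from the client machine and again on the grid node itself (with `"127.0.0.1"`) gives the two data points needed to tell a network or firewall problem apart from a server that isn't listening. The port 41141 is the one reported in the console output above; the address is whatever your node uses.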
devdept
#9 Posted : Wednesday, September 4, 2019 12:56:05 PM(UTC)
What exact rule should we add to Windows Firewall?

Thanks, Alberto
michaelkroes
#10 Posted : Wednesday, September 4, 2019 5:50:18 PM(UTC)
Rank: NCrunch Developer

Groups: Registered
Joined: 9/22/2017(UTC)
Posts: 277
Location: Netherlands

Thanks: 122 times
Was thanked: 62 time(s) in 59 post(s)
The default TCP port for a grid node is 41141. You can verify that this hasn't changed by opening the Distributed Processing window and editing the server connection. That should list the TCP port.

This port needs to be accessible and enabled in the firewall.
devdept
#11 Posted : Monday, September 9, 2019 2:37:05 PM(UTC)
Should I add an inbound rule for this port? Do you have any step-by-step instructions for creating this rule? We aren't very familiar with this.

Thanks,

Alberto
Remco
#12 Posted : Tuesday, September 10, 2019 12:34:51 AM(UTC)
See here for details on how to open a port in Windows Firewall.

Note that we have no firm way of knowing that it's Windows Firewall blocking the connection. It could also be other firewall software you have installed on the machine. If so, I recommend referring to its documentation. Unfortunately, we can only provide guidance on NCrunch itself; firewalls are more in the area of network and infrastructure.
devdept
#13 Posted : Monday, September 16, 2019 6:26:24 AM(UTC)
Hi Remco,

The port 41141 is already open. I run NCrunch.GridNode.Console and always get this from clients:

Quote:
Connection Failure: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 192.168.1.125:41141


We are tired of restarting the grid node service 4 times per day.

Please help us!

Thanks,

Alberto
Marcello
#14 Posted : Monday, September 16, 2019 7:20:27 AM(UTC)
Rank: Advanced Member

Groups: Registered
Joined: 9/16/2019(UTC)
Posts: 70
Location: Italy

Thanks: 1 times
Was thanked: 4 time(s) in 4 post(s)
Hi Remco,

As additional info, here is a short video showing the current situation: https://www.screencast.com/t/maWtC1az0L

Let us know if you see something wrong or something else we can do to solve the issue. As Alberto already said, we need to solve the issue ASAP.

Thank you,
Marcello
michaelkroes
#15 Posted : Monday, September 16, 2019 8:03:57 AM(UTC)
Here are a couple of things you can try:

- Verify the IP address used
- Use telnet to connect to the IP address/port from both the grid node and the client to see if a connection can be established. This will tell you whether the problem is network related: if telnet can't establish a connection, then there is a network issue, which isn't something we can help you with.
- Do you have Visual Studio installed locally on the grid node? If so, can you try connecting to the node from that machine? If that does connect, this is a network issue.

Let me know how these things go :)

Michael
Marcello
#16 Posted : Tuesday, September 17, 2019 11:59:34 AM(UTC)
Hi Michael,

I tried telnet while NCrunch was online, and there seems to be a problem connecting to it from the grid node itself.

Here is a short video showing attempts to open a telnet connection both from my local machine and from the NCrunch server: https://www.screencast.com/t/TevXGH8PO

What am I doing wrong? Why does telnet not run on the server itself?

BTW we don't have Visual Studio installed on the server, sorry.
Remco
#17 Posted : Tuesday, September 17, 2019 11:48:32 PM(UTC)
Hi Marcello,

The NCrunch grid node server opens a standard TCP socket on the specified port (defaults to 41141) and listens for connections using Windows Sockets. This is very typical of any kind of network enabled application.

TCP servers can accept connections from any kind of tool that can open a TCP connection (such as telnet). This makes telnet a very useful way to know whether the problem is in the network/infrastructure or the software on the receiving end (i.e. the NCrunch server).
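The point that a listening TCP socket accepts a connection from any client, regardless of the protocol spoken afterwards, can be demonstrated with a small self-contained sketch. This is generic socket code in Python, not NCrunch's implementation; the function names are invented for the example and it uses a loopback port chosen by the OS rather than 41141.

```python
import socket
import threading
from typing import List, Tuple


def start_listener(host: str = "127.0.0.1", port: int = 0) -> Tuple[socket.socket, int]:
    """Open a listening TCP socket; port 0 lets the OS pick a free port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))
    server.listen()
    return server, server.getsockname()[1]


def accept_once(server: socket.socket, accepted: List[tuple]) -> None:
    """Accept a single connection and record the peer address."""
    conn, addr = server.accept()
    with conn:
        accepted.append(addr)


def demo() -> bool:
    """Connect a plain TCP client to the listener and report success."""
    server, port = start_listener()
    accepted: List[tuple] = []
    worker = threading.Thread(target=accept_once, args=(server, accepted))
    worker.start()
    # Any TCP client can complete the handshake, just as telnet would,
    # without sending a single byte of application data.
    with socket.create_connection(("127.0.0.1", port), timeout=3.0):
        pass
    worker.join(timeout=3.0)
    server.close()
    return len(accepted) == 1
```

If a plain client like this cannot reach a port that the server reports as listening, the cause lies somewhere between the two sockets rather than in the application protocol.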

Because you are unable to open a telnet connection to the NCrunch server while it is listening and online, this means that the problem you're experiencing is in one of the following areas:

1. Interference with other software, such as a firewall which is blocking the connection
2. Poor network connectivity (e.g. bad cable, NIC, etc.)
3. Server configuration problem (e.g. a Windows-based setting or device driver that is forcefully closing or blocking connections)
4. Other non-NCrunch related infrastructure problem

Because the problem being experienced is not being caused by any defect or UX issue that is specific to our product, unfortunately we do not have the capacity to help you resolve it. I recommend reading up more on troubleshooting network connections and investigating the configuration of your Windows Server instead.
Marcello
#18 Posted : Wednesday, September 18, 2019 10:42:50 AM(UTC)
Hi Remco,

The NCrunch server is a new Windows Server 2019 machine (Version 1809 - OS Build 17763.107) without any other software installed. It's a dedicated server for NCrunch, and both the network and the firewall seem OK.

I ran some further stress tests on the machine and recorded a short video showing the situation when the service freezes: https://www.screencast.com/t/KeSiZVoF

In the beginning, I can connect via telnet both from my machine and from the server itself. After starting a huge number of tests from different machines and waiting for about 1 hour, the NCrunch service closes the connections and I'm not able to reconnect to the machine for a while (you can see it at about minute 3:10 of the video).

After that, I can no longer connect via telnet from my machine, but the server still responds on localhost. Restarting the NCrunch service from my Visual Studio does not work (why? I suppose because the NCrunch service is not reachable from another machine).

I also tried running the NCrunch console, but nothing changed :/

Checking the Windows Event Viewer, I cannot find anything useful for understanding what happens to the NCrunch service, so please help us understand why it freezes and what we can do or check to avoid this.
Maybe there is some special configuration that makes the NCrunch service restart automatically, or some special log we can enable in NCrunch to give you more clues about what happens, or something else we can do to resolve this situation.

I had to restart the NCrunch service manually to make it work again, but this is not acceptable for us; we cannot do it 2-3 times every day. I hope you understand.

BTW, I also tried stopping the NCrunch service and running only NCrunch.GridNode.Console: it reports "Grid node successfully started and is listening for connections on port 41141", but when I check with 'netstat -an | find /i "listening"' it does not seem to be working, so what is the purpose of the NCrunch console?
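To pin down exactly when the node stops answering remote connections (roughly one hour into a heavy run, per the description above), a small polling script can log each change in reachability with a timestamp. The following Python sketch is an assumption-laden illustration rather than anything shipped with NCrunch: `tcp_probe` and `watch` are hypothetical helper names, and the check only tests whether the TCP handshake succeeds.

```python
import socket
import time
from datetime import datetime
from typing import Callable, List, Optional, Tuple


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> Callable[[], bool]:
    """Build a zero-argument probe reporting whether host:port accepts TCP."""
    def probe() -> bool:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    return probe


def watch(probe: Callable[[], bool], checks: int, interval: float,
          sleep: Callable[[float], None] = time.sleep) -> List[Tuple[str, bool]]:
    """Run probe() `checks` times and record each change in reachability.

    Returns (ISO timestamp, reachable) pairs, one per state change,
    including the initial state.
    """
    transitions: List[Tuple[str, bool]] = []
    last: Optional[bool] = None
    for i in range(checks):
        state = probe()
        if state != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            transitions.append((stamp, state))
            last = state
        if i < checks - 1:
            sleep(interval)
    return transitions
```

Running `watch(tcp_probe("192.168.1.125", 41141), checks=240, interval=30.0)` on a client machine would cover two hours; the address is the one from the error message quoted earlier in the thread. Running a second copy on the server against `127.0.0.1` at the same time shows whether localhost stays reachable after remote access drops, and the timestamps can then be matched against the NCrunch log and the Windows event log.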
Remco
#19 Posted : Wednesday, September 18, 2019 11:08:32 AM(UTC)
Hi Marcello,

The symptoms you've described match with a system related issue on the server that is closing or interfering with open network connections after an elapsed time. This could be anything ranging from the hardware on the machine up through the device drivers or potentially security software or something in the operating system, but it's basically outside of our control. It's quite likely that the problem would affect any sockets left open on the machine, so probably using NCrunch.Console won't give a different result to the service.

We exhaustively test our software over network connections and have many users that rely on it on a daily basis. At present we have no known issue in the software that can cause this problem. There are no configuration options to tinker with here as the opening of a network socket is a very simple and well tested process.

As mentioned earlier, I am sorry but we cannot help you with this problem.
Marcello
#20 Posted : Wednesday, September 18, 2019 11:58:43 AM(UTC)
Hi Remco,

I enabled NCrunch logging and was able to reproduce the same situation as before, where the NCrunch service is not reachable.
Looking at the log file, there seem to be some exceptions from NCrunch too, such as this one:


Code:
[13:30:52.3055-GridMessageSender-384] Failed to send messages because of an error (was the connection closed?): System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Net.Sockets.NetworkStream'.
   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
   at nCrunch.Core.Grid.Connectivity.EncryptorStream.(Byte[] , Int32 , Int32 )
   at nCrunch.Core.Grid.Connectivity.EncryptorStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   at nCrunch.Core.Grid.Connectivity.BidirectionalStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   at nCrunch.Core.Grid.Connectivity.GridMessageSender.()



Could this be compromising the correct operation of the service?

Here you can download the log file: https://we.tl/t-Wj61RDyWqs

Please have a look at it and let me know what you think.


As a side note, how does NCrunch.GridNode.Console work? As I wrote before, running it without the service active does not seem to work :/
