Hi, thanks for sharing this issue.
The connection to grid nodes is essentially a barebones TCP connection, with some encryption over the top if a password is set on the node.
The client and the server each use dedicated threads to read from and write to the socket. The system is designed so that if no data is sent for 30 seconds, the dedicated thread automatically sends a keepalive packet. Other activity on the system cannot interfere with this unless the machine is so heavily over capacity that effectively nothing on it is running. NCrunch can cause this if you run too much at once, and extreme thread starvation is bad for stability.
On the receiver side, if no data is received from the socket for 60 seconds, an automatic disconnection is triggered as it's assumed that the socket is dead.
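To make the timing rules above concrete, here is a minimal Python sketch of the two timers. It is not NCrunch's actual code: the class and method names are made up for illustration, and only the 30-second and 60-second values come from the description above.

```python
import time

KEEPALIVE_INTERVAL = 30.0  # seconds without sending before a keepalive is written
RECEIVE_TIMEOUT = 60.0     # seconds without receiving before the socket is presumed dead


class ConnectionTimers:
    """Hypothetical helper that tracks send/receive activity for one connection."""

    def __init__(self, now=None):
        # Allow the clock to be injected so the logic can be checked without waiting.
        start = time.monotonic() if now is None else now
        self.last_sent = start
        self.last_received = start

    def on_sent(self, now):
        # Any outgoing data (including a keepalive) resets the send timer.
        self.last_sent = now

    def on_received(self, now):
        # Any incoming data (including a keepalive) resets the receive timer.
        self.last_received = now

    def keepalive_due(self, now):
        # Writer side: nothing has been sent for the keepalive interval.
        return now - self.last_sent >= KEEPALIVE_INTERVAL

    def is_dead(self, now):
        # Reader side: nothing has arrived for the receive timeout.
        return now - self.last_received >= RECEIVE_TIMEOUT
```

Because the receive timeout is twice the keepalive interval, a disconnection under these rules implies that at least one keepalive was either never sent or never arrived within the window.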
Network issues like this are often challenging to troubleshoot, since there are so many ways connectivity can go wrong and you usually see only one symptom (no data received). The grid protocol used by NCrunch has been in place for over a decade now and it should be reliable. It may be worth doing some deductive testing over your network to see if anything is interfering with the connection.
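As one possible starting point for that kind of testing, the sketch below simply checks whether a plain TCP connection can be opened from the client machine to the node. It is a generic probe and not part of NCrunch: the function name is made up, and the host and port are whatever your grid node is configured to use.

```python
import socket
import time


def probe_tcp(host, port, timeout=5.0):
    """Try to open a TCP connection; return (ok, seconds_elapsed, error_text)."""
    started = time.monotonic()
    try:
        # create_connection resolves the host and connects, honouring the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - started, ""
    except OSError as exc:
        # Covers refused connections, timeouts, and name resolution failures.
        return False, time.monotonic() - started, str(exc)


# Example use (placeholder address; substitute your own node's host and port):
# ok, elapsed, err = probe_tcp("grid-node.example", 12345)
# print(ok, round(elapsed, 3), err)
```

Running this repeatedly from the client machine, and again from another machine on a different network segment, can help narrow down whether a firewall, VPN, or router sits between the failing pair. Note that it only tests connection setup. It does not exercise a long-lived idle connection.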