Ximeng Guan
2018-09-29 22:43:52 UTC
Hello,
We recently found that volume releases and moves between file servers running at two different sites fail frequently because the path MTU between the two sites is smaller than the default MTU of 1500 bytes.
The failure occurs mainly between a file server running 1.6.21 at one site and a file server running 1.6.22.3 at the other. It does not seem to occur when the file servers at both ends run 1.6.21.
Setting the file server's NIC MTU (e.g., in the file ifcfg-eth0) to the true path MTU seems to work as an ad-hoc solution. However, we noticed that the volserver has a new -rxmaxmtu option in 1.8. The AuriStor version also lists the same option in its documentation, but the 1.6.x version does not have it.
Is there a plan to introduce this option into 1.6.x? Can someone provide any background or history on this option for the volume server?
Thank you!
P.S. When the failure happens, we typically see the following symptom: probing the sender with rxdebug at port 7005 shows an active connection to the receiver with the flag "has_output_packets", while probing the receiver at the same port does not show the corresponding connection.
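For anyone trying the NIC MTU workaround mentioned above, here is a rough sketch of the header arithmetic involved. It assumes IPv4 without IP options, and the 1400-byte path MTU is only a hypothetical example value, not what we measured. The function names are made up for illustration. Since Rx runs over UDP, the first number is the largest UDP payload that fits in one unfragmented packet; the second is the payload size for a don't-fragment ICMP echo probe that exactly fills the path MTU.

```python
# Minimal sketch, assuming IPv4 with no IP options.
IPV4_HEADER = 20  # bytes
UDP_HEADER = 8    # bytes
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_udp_payload(path_mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return path_mtu - IPV4_HEADER - UDP_HEADER


def ping_probe_size(path_mtu: int) -> int:
    """ICMP echo payload size that exactly fills path_mtu."""
    return path_mtu - IPV4_HEADER - ICMP_HEADER


# 1500 is the Ethernet default; 1400 is a hypothetical reduced path MTU.
for mtu in (1500, 1400):
    print(mtu, max_udp_payload(mtu), ping_probe_size(mtu))
```

A probe of the second size that gets through with fragmentation disabled, while a probe one byte larger does not, would suggest the assumed value matches the real path MTU.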
Best regards,
========================================
Ximeng (Simon) Guan, Ph.D.
Associate Principal Engineer
Royole Corporation
48025 Fremont Blvd, Fremont, CA 94538
========================================