[AusNOG] GRE Tunnel MTU suggestions

Mark ZZZ Smith markzzzsmith at yahoo.com.au
Tue Jul 1 18:19:59 EST 2014





----- Original Message -----
> From: Joseph Goldman <joe at apcs.com.au>
> To: "AusNOG at lists.ausnog.net" <AusNOG at lists.ausnog.net>
> Cc: 
> Sent: Tuesday, 1 July 2014 4:56 PM
> Subject: [AusNOG] GRE Tunnel MTU suggestions
> 
> Hi List,
> 
>   Setting up a GRE tunnel for a customer and would appreciate a bit of 
> input.
> 
>   I can successfully push 1500byte packets with df-bit set between the 2 
> endpoints (1501 fails), so it is a full mtu of 1500.
> 
>   I'd like to set an ip mtu on the tunnel and an ip tcp adjust-mss. 
> Obviously I can't use 1500 as we have to account for GRE, so I'd like to 
> know the best suggestions for an MTU,

Assuming IPv4, the outer IPv4 header would be 20 octets (as you'll probably have no IPv4 options) and the GRE header is normally 4 octets, so your tunnel MTU would be 1500 - 20 - 4 = 1476.

You may be able to eliminate the GRE header if you're only transporting IP (i.e., an IP in IP tunnel, rather than an IP in GRE in IP tunnel) if your tunnel endpoints support that, which would then result in a tunnel MTU of 1480.
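
The arithmetic for both cases can be sketched as below. This is only an illustration: the function name and constants are made up for the example, and it assumes an outer IPv4 header with no options and a base GRE header with no key, sequence number or checksum fields.

```python
IPV4_HEADER = 20  # octets, assuming no IPv4 options on the outer header
GRE_HEADER = 4    # octets, assuming the base GRE header with no optional fields


def tunnel_mtu(link_mtu, gre=True):
    """Return the largest inner packet that fits in one outer packet."""
    overhead = IPV4_HEADER + (GRE_HEADER if gre else 0)
    return link_mtu - overhead


print(tunnel_mtu(1500))             # IP in GRE in IP: 1476
print(tunnel_mtu(1500, gre=False))  # IP in IP: 1480
```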


> and if its worth setting the MSS 
> at the same size as the MTU or if I should lower the MSS adjust and if 
> so by how much?
> 

PMTUD is the better option; MSS adjusting is a TCP-specific hack. Don't switch it on unless you have to because PMTUD is broken.
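
If PMTUD does turn out to be broken and you need the MSS adjust, the usual arithmetic is the tunnel MTU less the inner IPv4 and TCP headers. A rough sketch, assuming 20-octet IPv4 and TCP headers with no options (the function name is just for illustration):

```python
INNER_IPV4_HEADER = 20  # octets, assuming no IPv4 options
TCP_HEADER = 20         # octets, assuming no TCP options


def mss_for_tunnel(tunnel_mtu):
    """Largest TCP payload that fits in one inner packet of tunnel_mtu octets."""
    return tunnel_mtu - INNER_IPV4_HEADER - TCP_HEADER


print(mss_for_tunnel(1476))  # GRE tunnel over a 1500 octet path: 1436
```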

Make sure your tunnel endpoints can send ICMP Destination Unreachable, Fragmentation Needed messages (the IPv4 equivalent of Packet Too Big) back to the origin hosts, so that the hosts lower their packet size when it would exceed the tunnel MTU. That allows them to discover the path MTU for paths that include the tunnel.
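
For reference, the message in question is ICMP type 3, code 4, carrying the next-hop MTU as laid out in RFC 1191. The sketch below builds one by hand purely to show the layout; the function names are invented for the example, and it assumes the offending packet had a 20-octet IPv4 header, so the first 28 octets are echoed back.

```python
import struct


def inet_checksum(data):
    """Standard Internet checksum: one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def frag_needed(next_hop_mtu, original_packet):
    """Build an ICMP Destination Unreachable, Fragmentation Needed message."""
    # Echo the original IPv4 header plus the first 8 octets of its payload
    # (assumes a 20-octet header with no options).
    body = original_packet[:28]
    # type=3, code=4, checksum placeholder, unused 16 bits, next-hop MTU
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    csum = inet_checksum(header + body)
    return struct.pack("!BBHHH", 3, 4, csum, 0, next_hop_mtu) + body


msg = frag_needed(1476, bytes(28))
print(msg[:8].hex())
```

The next-hop MTU field is what the origin host uses to pick its new packet size, so for a tunnel it should carry the tunnel MTU rather than the MTU of the underlying link.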

I'm getting a bit rusty, however IIRC, some tunnel implementations can dynamically discover the path MTU supported by the current path the tunnel is traversing. They do that by setting the DF bit in the outer IP header of all tunnel packets, and when those are dropped, use the returned MTU to work out what the tunnel's inner MTU currently is. This may be useful to enable/investigate even if you are manually setting the tunnel MTUs, to cover the case where the tunnel might switch to a path with a smaller MTU due to a fault.
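
As a rough model of that behaviour (not any particular vendor's implementation; the class and method names are made up, and 24 octets of GRE over IPv4 overhead is assumed):

```python
OUTER_OVERHEAD = 24  # assumed: 20-octet outer IPv4 header + 4-octet GRE header


class TunnelPmtu:
    """Tracks the tunnel's inner MTU as ICMP errors arrive for outer packets."""

    def __init__(self, configured_mtu):
        self.configured_mtu = configured_mtu
        self.current_mtu = configured_mtu

    def on_frag_needed(self, next_hop_mtu):
        # The reported MTU applies to the outer packet, so take the
        # encapsulation overhead off to get the usable inner MTU.
        candidate = next_hop_mtu - OUTER_OVERHEAD
        # Only ever lower the value here; never exceed what was configured.
        self.current_mtu = min(self.current_mtu, candidate)
        return self.current_mtu


tunnel = TunnelPmtu(1476)
print(tunnel.on_frag_needed(1400))  # path fell back to a 1400 octet link: 1376
```

A real implementation would also need some way to raise the value again once the original path returns, which this sketch leaves out.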

From memory there can also be a halfway measure, where the DF bit in the outer packet is set to match that of the packet received to be sent across the tunnel. PMTUD for the tunnel then happens when the hosts initiate PMTUD themselves, and host packets that don't have DF set and are too big for the tunnel get fragmented before they're sent across the tunnel rather than being dropped. Fragmentation instead of dropping might sound better, however "Fragmentation Considered Harmful" (google it) for a variety of reasons. Basically, the unit of recovery should be the unit of transmission: loss of a single fragment means all of the fragments have to be retransmitted rather than just the one lost, which is quite inefficient if you have a reasonable amount of loss. Fragmentation is also hard, if not impossible, to do in hardware, so it is commonly punted to software (if your platform supports punting to software), significantly limiting its throughput.
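
The halfway measure amounts to a three-way decision per packet, roughly as sketched below. The function name, action strings and return shape are invented for illustration and do not correspond to any real platform's API.

```python
def forward_over_tunnel(inner_len, inner_df, tunnel_mtu):
    """Decide what to do with one inner packet when the outer DF copies the inner DF."""
    if inner_len <= tunnel_mtu:
        # Fits: encapsulate, copying the inner DF bit to the outer header.
        return ("encapsulate", {"outer_df": inner_df})
    if inner_df:
        # Too big and the host is doing PMTUD: drop and tell the host.
        return ("drop_and_send_icmp", {"next_hop_mtu": tunnel_mtu})
    # Too big and DF clear: fragment the inner packet, then encapsulate.
    return ("fragment_then_encapsulate", {"outer_df": False})


print(forward_over_tunnel(1400, True, 1476))
print(forward_over_tunnel(1500, True, 1476))
print(forward_over_tunnel(1500, False, 1476))
```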


>   Note: IPSec is not used on top, just GRE.
> 
> Thanks,
> Joe
> _______________________________________________
> AusNOG mailing list
> AusNOG at lists.ausnog.net
> http://lists.ausnog.net/mailman/listinfo/ausnog
> 

