[AusNOG] GLBP Forwarder Weighting Question

Mark Smith markzzzsmith at yahoo.com.au
Sun Apr 14 13:32:18 EST 2013




>________________________________
> From: Nathan Nogic <nathan at manageddatasolutions.com.au>
>To: ausnog at lists.ausnog.net 
>Sent: Saturday, 13 April 2013 7:05 PM
>Subject: [AusNOG] GLBP Forwarder Weighting Question
> 
>
>
>Hi guys,
> 
>Breaking with recent tradition, I thought I’d throw a network question out to the group to see if anyone has had a similar experience or can point me at some documentation, other than the Cisco articles, on how GLBP AVFs should work.
> 
>The long and short of it is that the AVFs seem to be ignoring the weighting when deciding which connection to send outbound traffic through. I know we could drop one of the AVFs by decrementing its value below the lower threshold but that would mean removing some of the load balancing and availability options.
> 
>The setup is a GLBP group of two routers each with their own IP transit connection. One router & link has a weighting of 10 and the other router/link has a weighting of 200 with the load balancing methodology set to weighted. In theory, the router with the weighting of 200 should be taking 95 odd % of the traffic for that subnet.
> 
>The config is as follows:
> 
>·         Downstream devices are configured to hit the VIP not the individual router IPs
>·         The ARP table on the router with the weighting of 200 shows that it is picking up most of the IP addresses in the subnet, as expected, as it is also the AVG. 
>·         The router with the weighting of 200 is shown as the Active AVF and the router with the weighting of 10 is in listen state (aware that a listen AVF will still forward traffic)
>·         The router with the weighting of 10 has an ARP table that does not show any IPs from the subnet
>·         Forwarder pre-emption is enabled
>·         Bouncing the router with the weighting of 10 has the traffic redirect through the other link, but once the AVF comes up (even in listen state) it then starts transmitting traffic again.
>·         Netflow reporting shows that the router with the weighting of 10 is transmitting a lot of data that should be going through the other router.
> 
>My question is whether it’s just luck that the small number of IP addresses allocated to the AVF with the weighting of 10 happens to have noticeable traffic, or if there is some other behaviour that would explain why an AVF that should be getting virtually no traffic seems to be sending out a lot of traffic for that subnet?
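
One rough way to reason about the "just luck" question is to separate the share of hosts from the share of bytes. The sketch below assumes the AVG hands out forwarder MACs strictly in proportion to the configured weights, which is a simplification. The router names, host names and byte counts are all made up for illustration.

```python
from fractions import Fraction


def expected_host_share(weights):
    """Share of hosts per forwarder, ASSUMING the AVG answers ARP
    requests strictly in proportion to each forwarder's weight."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


def traffic_share(host_assignment, host_traffic):
    """Share of outbound bytes per forwarder, given which forwarder
    each host was pointed at and how much each host sends."""
    totals = {}
    for host, forwarder in host_assignment.items():
        totals[forwarder] = totals.get(forwarder, 0) + host_traffic[host]
    grand_total = sum(totals.values())
    return {fwd: Fraction(b, grand_total) for fwd, b in totals.items()}


# Weights taken from the question; router names are hypothetical labels.
weights = {"router_w200": 200, "router_w10": 10}
shares = expected_host_share(weights)
print(float(shares["router_w200"]))  # roughly 0.95

# Hypothetical subnet: 20 quiet hosts on the weight 200 router and
# one heavy talker that happened to be given the weight 10 router.
assignment = {f"host{i}": "router_w200" for i in range(20)}
assignment["host20"] = "router_w10"
traffic = {f"host{i}": 10 for i in range(20)}
traffic["host20"] = 300

result = traffic_share(assignment, traffic)
print(result)
```

With these made-up numbers the weight 10 router carries the majority of the bytes while serving only one host in 21, so a small host share does not by itself rule out a large Netflow share.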
> 
>The other working theory is that it’s ARP / client device affinity rather than a GLBP issue.
> 
>Happy to get thoughts by direct email rather than to the entire list.
> 
>I don't have any experience with GLBP, but I have a rough understanding of how it works, so the following are some random ideas and questions:
>
>
>Firstly, what is your topology? Specifically, are the two routers also directly attached to each other, or are they running a routing protocol between each other on the GLBP protected link, as well as on a common upstream network? Assuming that in your Netflow point you're talking about traffic outbound from the GLBP protected subnet, a possible explanation is that your Weight 10 router, even though it isn't preferred by GLBP, is preferred by the Weight 200 router to reach the rest of the network, either via a direct link or back out via the GLBP protected link. That is, the traffic comes from the hosts to the Weight 200 router because of GLBP, the Weight 200 router then looks up its route table and finds that the Weight 10 router is a better path towards the packets' destination, and it forwards the packets towards the Weight 10 router, either via the direct link between them or back out the interface that it just received the traffic on.
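
To illustrate the first idea, here is a minimal sketch of a longest-prefix-match lookup on the Weight 200 router. The route table, prefixes (taken from the documentation address ranges) and next-hop labels are entirely hypothetical, and a real router could also prefer the Weight 10 router because of metrics or policy rather than a more specific prefix.

```python
import ipaddress


def lookup(route_table, destination):
    """Return the next hop of the most specific route covering
    `destination`, or None if no route matches."""
    dst = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in route_table:
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical route table on the Weight 200 router: its own transit
# default, plus a more specific route learned from the Weight 10 router.
w200_routes = [
    ("0.0.0.0/0", "own-transit-link"),
    ("203.0.113.0/24", "weight-10-router"),
]

# The host sent both packets to the Weight 200 router (GLBP did its
# job), but the first one still leaves via the Weight 10 router.
print(lookup(w200_routes, "203.0.113.50"))
print(lookup(w200_routes, "198.51.100.7"))
```

If something like this is going on, Netflow on the Weight 10 router would show traffic from the subnet even though the hosts never ARPed for it.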
>
>
>Secondly (this is a bit of a stab in the dark), have you disabled Proxy ARP on the GLBP interfaces? ARP always uses the most recently received ARP reply, so a possible explanation for why you're seeing no traffic on the Weight 10 router is that the Weight 200 router is not only sending an ARP reply on behalf of the Weight 10 router, to get some traffic to go via the Weight 10 router for the purposes of GLBP, but is also sending an unrelated ARP reply with its own MAC address for the default gateway address. If this non-GLBP ARP reply always arrives later than the GLBP reply, then all traffic will go through the Weight 200 router.
>
>You could verify that something like this is happening by running a packet sniffer on one of the hosts, clearing the host's ARP cache, pinging the default gateway and then seeing whether two ARP replies are received by the host. The Ethernet source address of each ARP reply will identify which device(s) are emitting the multiple ARP replies. Multiple ARP replies, with the later one always coming from the Weight 200 router, would fit a number of the symptoms you've described above. I'm speculating about Proxy ARP only because it is a default on Cisco routers and it may be acting independently of GLBP. (I always switch it off by default: it causes more trouble than it is worth, and if you need it you'll know about it and what you're going to use it for, namely implementing "Transparent Subnet Gateways" (RFC1027).)
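
The "last reply wins" behaviour described above can be sketched as follows. This assumes a host that simply overwrites its cache entry with whichever ARP reply arrived most recently. The timestamps and MAC labels are placeholders, not captured data.

```python
def final_arp_entry(replies):
    """Given (arrival_time, sender_mac) tuples for one gateway IP,
    return the MAC the host ends up with, ASSUMING each new reply
    overwrites the previous cache entry."""
    cached_mac = None
    for _, mac in sorted(replies, key=lambda reply: reply[0]):
        cached_mac = mac
    return cached_mac


# Hypothetical capture on one host after clearing its ARP cache:
# the GLBP reply arrives first, the router's own reply slightly later.
replies = [
    (0.0010, "glbp-virtual-mac-for-weight-10"),
    (0.0015, "weight-200-interface-mac"),
]
print(final_arp_entry(replies))
```

If a real capture shows two replies in this order on every host, the GLBP assignment would never stick, regardless of the configured weights.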
>
>
>
>
>HTH,
>mark.
>
>


