[AusNOG] GLBP Forwarder Weighting Question

Nathan Nogic nathan at manageddatasolutions.com.au
Sun Apr 14 17:13:21 EST 2013


Hey Mark,

Thanks for the reply, yep proxy arp is disabled globally.

Routers are connected, however, each has an external connection to a different carrier and the routers are configured to send traffic out of their own external interfaces. We looked into traffic flow last night also and what we could see was the following:

* Cleared ARP caches
* 10 weighted router had 2 hops to the destination (TPG) and 200 weighted router had 3 hops to the destination, however, the routers are configured to forward data only via their external interfaces. 
* To confirm that it wasn't a case of routing through a peer router, we ran tracerts and sent data to the TPG subnets from the 200 weighted router, and everything went out perfectly over its external interface.
* After that we did a tracert/wireshark capture to the TPG subnets from other devices on the same subnet and behind that GLBP VIP and they all exited the network from the 200 weighted router as expected.
* Checked the IP tables and BGP tables on the 200 weighted router and they show the local interface as the next hop for the subnets in question.

We then decremented the 10 weighted router's GLBP AVF for the last 12 hours to take it to standby, and traffic flowed as expected through the 200 weighted router (which suggests the gateway on the client is correct and not manually forced to the 10 weighted router). As soon as the IP SLA was brought online for the 10 weighted router and the AVF went back to listen mode, traffic from the same devices destined to TPG went straight back through the 10 weighted router.

Anyway, just wanted to thank everyone for their suggestions. I think the next step is to get access to the server to see what is happening on the actual device.

Cheers

Nathan 

-----Original Message-----
From: Mark Smith [mailto:markzzzsmith at yahoo.com.au] 
Sent: Sunday, 14 April 2013 1:32 PM
To: Nathan Nogic; ausnog at lists.ausnog.net
Subject: Re: [AusNOG] GLBP Forwarder Weighting Question




>________________________________
> From: Nathan Nogic <nathan at manageddatasolutions.com.au>
>To: ausnog at lists.ausnog.net
>Sent: Saturday, 13 April 2013 7:05 PM
>Subject: [AusNOG] GLBP Forwarder Weighting Question
> 
>
>
>Hi guys,
> 
>Breaking with recent tradition, I thought I’d throw a network question out to the group to see if anyone has had a similar experience or can point me at some documentation other than the Cisco articles on how GLBP AVFs should work.
> 
>The long and short of it is that the AVFs seem to be ignoring the weighting when deciding which connection to send outbound traffic through. I know we could drop one of the AVFs by decrementing its value below the lower threshold but that would mean removing some of the load balancing and availability options.
> 
>The setup is a GLBP group of two routers each with their own IP transit connection. One router & link has a weighting of 10 and the other router/link has a weighting of 200 with the load balancing methodology set to weighted. In theory, the router with the weighting of 200 should be taking 95 odd % of the traffic for that subnet.
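As a rough check on the "95 odd %" figure: assuming weighted load balancing hands out forwarders in proportion to their weights, each forwarder's expected share is its weight divided by the sum of all weights. A minimal Python sketch of that arithmetic (the function and router names are illustrative only, not from any Cisco tooling):

```python
def expected_shares(weights):
    """Return each forwarder's expected share of ARP replies as a fraction.

    Assumes the AVG answers ARP requests in proportion to forwarder weight.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {name: weight / total for name, weight in weights.items()}


# Weights taken from the setup described above.
shares = expected_shares({"router_200": 200, "router_10": 10})
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Note that this is a share of hosts (ARP replies handed out), not a share of bytes, so the traffic split can differ from it.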
> 
>The config is as follows:
> 
>·         Downstream devices are configured to hit the VIP not the individual router IPs
>·         The ARP tables on the router with the weighting of 200 shows that it is picking up most of the IP addresses in the subnet as expected as it is also the AVG.
>·         The router with the weighting of 200 is shown as the Active AVF and the router with the weighting of 10 is in listen state (aware that a listen AVF will still forward traffic)
>·         The router with the weighting of 10 has an ARP table that does not show any IPs from the subnet
>·         Forwarder pre-emption is enabled
>·         Bouncing the router with the weighting of 10 has the traffic redirect through the other link, but once the AVF comes up (even in listen state) it then starts transmitting traffic again.
>·         Netflow reporting shows that the router with the weighting of 10 is transmitting a lot of data that should be going through the other router.
> 
>My question is whether it’s just luck that the small number of IP addresses allocated to the AVF with the weighting of 10 happen to have noticeable traffic, or if there is some other behaviour that would explain why an AVF that should be getting virtually no traffic seems to be sending out a lot of traffic for that subnet?
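To illustrate the "just luck" possibility: GLBP balances hosts (via ARP replies), not bytes, so a single busy host landing on the low-weight forwarder can dominate the byte counts. The sketch below uses entirely made-up hosts and traffic volumes to show how the two shares can diverge; it is not a model of the actual network.

```python
def byte_share_by_forwarder(host_bytes, assignment):
    """Sum traffic per forwarder and return each forwarder's fraction of bytes."""
    totals = {}
    for host, nbytes in host_bytes.items():
        forwarder = assignment[host]
        totals[forwarder] = totals.get(forwarder, 0) + nbytes
    grand_total = sum(totals.values())
    return {fwd: total / grand_total for fwd, total in totals.items()}


# Hypothetical subnet: 20 light clients and one busy server (invented volumes).
host_bytes = {f"client{i}": 1_000_000 for i in range(20)}
host_bytes["server"] = 80_000_000

# 20 of 21 hosts (about 95%) resolve the VIP to the weight-200 forwarder,
# but the one host on the weight-10 forwarder happens to be the server.
assignment = {host: "router_200" for host in host_bytes}
assignment["server"] = "router_10"

result = byte_share_by_forwarder(host_bytes, assignment)
for forwarder, share in result.items():
    print(f"{forwarder}: {share:.1%} of bytes")
```

In this invented case the weight-10 router carries 80% of the bytes while serving under 5% of the hosts, which is one way Netflow could look wrong even if GLBP is behaving as configured.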
> 
>The other working theory is that it’s ARP / client device affinity rather than a GLBP issue.
> 
>Happy to get thoughts by direct email rather than to the entire list.
> 
>Don't have any experience with GLBP, but have a rough understanding of how it works. So the following are some random ideas/questions:
>
>
>Firstly, what is your topology? Specifically, are the two routers also directly attached to each other, or are they running a routing protocol between each other on the GLBP protected link, as well as a common upstream network? Assuming in your Netflow point you're talking about traffic outbound from the GLBP protected subnet, a possible explanation is that your Weight 10 router, even though it isn't preferred by GLBP, is preferred by the Weight 200 router to reach the rest of the network via a direct link, or back out via the GLBP protected link. That is, the traffic is coming from the hosts to the Weight 200 router because of GLBP, then the Weight 200 router looks up its route table and finds that the Weight 10 router is a better path towards the packets' destination, and then forwards packets towards the Weight 10 router, either via the direct link between them, or back out the interface that the Weight 200 router just received the traffic over.
>
>
>Secondly (this is a bit of a stab in the dark), have you disabled
>Proxy ARP on the GLBP interfaces? ARP always listens to the most
>recently received ARP Reply, so a possible explanation for why you're
>seeing no traffic on the Weight 10 router is that the Weight 200 router
>is not only sending an ARP reply on behalf of the Weight 10 router to
>get some traffic to go via the Weight 10 router for the purposes of
>GLBP, but the Weight 200 router is also sending an unrelated ARP reply
>with its own MAC address for the default gateway address. If this
>non-GLBP ARP reply always arrives later than the GLBP reply, then all
>traffic will go through the Weight 200 router. You could verify that
>something like this is happening by running a packet sniffer on one of
>the hosts, clearing the host's ARP cache, pinging the default gateway and
>then seeing if two ARP replies are received by the host. The ARP reply
>ethernet source address will identify which device(s) are emitting the
>multiple ARP replies. Multiple ARP replies, with the later one always
>being the Weight 200 router, would fit a number of the symptoms you've
>described above. I'm speculating Proxy ARP only because it is a default
>for Cisco routers and it may be acting independently of GLBP. (Always
>switch it off by default: it causes more trouble than it is worth, and
>if you need it you'll know about it and what you're going to use it
>for, namely implementing "Transparent Subnet Gateways" (RFC 1027).)
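The check described above could be sketched roughly as follows, assuming the ARP replies have already been pulled out of a capture as (sender IP, sender MAC) pairs in arrival order. The `0007.b4` virtual MAC prefix is an assumption based on my understanding of the Cisco GLBP format and should be confirmed against `show glbp` on the routers; the addresses and helper names are made up for illustration.

```python
from collections import defaultdict

GLBP_VMAC_PREFIX = "0007b4"  # assumption: GLBP virtual MACs begin 0007.b4


def normalise_mac(mac):
    """Strip separators and lowercase, so 00:07:B4:00:01:01 == 0007.b400.0101."""
    return "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")


def is_glbp_vmac(mac):
    """True if the MAC carries the assumed GLBP virtual MAC prefix."""
    return normalise_mac(mac).startswith(GLBP_VMAC_PREFIX)


def conflicting_replies(replies):
    """Map each IP answered by more than one MAC to those MACs, in arrival order.

    replies: iterable of (sender_ip, sender_mac) pairs in capture order.
    """
    seen = defaultdict(list)
    for ip, mac in replies:
        mac = normalise_mac(mac)
        if mac not in seen[ip]:
            seen[ip].append(mac)
    return {ip: macs for ip, macs in seen.items() if len(macs) > 1}


# Made-up capture: the gateway IP is answered twice, by different MACs.
capture = [
    ("192.0.2.1", "0007.b400.0102"),
    ("192.0.2.1", "00:1a:2b:3c:4d:5e"),
]
conflicts = conflicting_replies(capture)
for ip, macs in conflicts.items():
    for mac in macs:
        kind = "GLBP virtual MAC" if is_glbp_vmac(mac) else "other MAC (e.g. burned-in)"
        print(ip, mac, kind)
```

If the gateway address shows up here with both a GLBP virtual MAC and a router's own MAC, and the own-MAC reply arrives last, that would be consistent with the Proxy ARP theory.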
>
>
>
>
>HTH,
>mark.
>
>



