[AusNOG] GLBP Forwarder Weighting Question

Mark Smith markzzzsmith at yahoo.com.au
Mon Apr 15 19:40:54 EST 2013


Hi,


----- Original Message -----
> From: Nathan Nogic <nathan at manageddatasolutions.com.au>
> To: 'Mark Smith' <markzzzsmith at yahoo.com.au>; ausnog at lists.ausnog.net
> Cc: 
> Sent: Sunday, 14 April 2013 5:13 PM
> Subject: RE: [AusNOG] GLBP Forwarder Weighting Question
> 
> Hey Mark,
> 
> Thanks for the reply, yep proxy arp is disabled globally.
> 
> Routers are connected, however, each has an external connection to a different 
> carrier and the routers are configured to send traffic out of their own external 
> interfaces. We looked into traffic flow last night also and what we could see 
> was the following:
> 
> * Cleared ARP caches
> * 10 weighted router had 2 hops to the destination (TPG) and 200 weighted router 
> had 3 hops to the destination, however, the routers are configured to forward 
> data only via their external interfaces. 
> * To confirm that it wasn't a case of routing through a peer router, we 
> conducted tracerts and data to the TPG subnets from the 200 weighted router and 
> went out perfectly over its external interface. 
> * After that we did a tracert/wireshark capture to the TPG subnets from other 
> devices on the same subnet and behind that GLBP VIP and they all exited the 
> network from the 200 weighted router as expected.
> * Checked the IP tables and BGP tables on the 200 weighted router and they show 
> the local interface as the next hop for the subnets in question.
> 


If you're running BGP on both routers and receiving full feeds, it is probably a better idea to connect them together and share the route tables via iBGP. Then if one of the two providers has a better route to a destination on the Internet, that provider will be used. It will also protect against one of the providers failing, or losing full visibility of the Internet. Just make sure you don't leak one provider's routes to the other, otherwise you might become the best path between the two. It shouldn't happen, as providers shouldn't accept those sorts of routes from customers, but sometimes they do.
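
To illustrate the "don't leak routes" rule, here is a minimal sketch in Python rather than router config. The Route type, the "learned_from" labels and the documentation prefixes are all made up for the example. On real routers this would typically be an outbound prefix list or AS-path filter on each eBGP session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    learned_from: str  # hypothetical labels: "local", "provider_a", "provider_b"


def exportable_to_provider(route: Route) -> bool:
    """Only locally originated prefixes may be announced to a transit provider."""
    return route.learned_from == "local"


def build_export(routes):
    """Return the prefixes that are safe to announce on an eBGP transit session."""
    return [r.prefix for r in routes if exportable_to_provider(r)]


# Illustrative table: one prefix of our own, one learned from each provider.
table = [
    Route("192.0.2.0/24", "local"),
    Route("198.51.100.0/24", "provider_a"),
    Route("203.0.113.0/24", "provider_b"),
]

announce = build_export(table)
print(announce)
```

The same list is announced to both providers, so neither ever hears the other's routes from you, regardless of what is being shared over iBGP.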


> We then decremented the 10 weighted router's GLBP AVF for the last 12 hours to 
> take it to standby and traffic flowed as expected through the 200 weighted 
> router (surmising the gateway on the client is correct and not manually forced 
> to the 10 weighted router). As soon as the IP SLA was brought online for the 10 
> weighted router and the AVF went back to listen mode, traffic from the same 
> devices destined to TPG went straight back through the 10 weighted router.
> 


What are you using IP SLA for? Or is that how GLBP monitors the availability of the other routers in the group? 


> Anyway, just wanted to thank everyone for their suggestions. I think the next
> step is to get access to the server to see what is happening on the actual device.
> 


Server? This is getting a bit more confusing ...


Regards,
Mark.

> Cheers
> 
> Nathan 
> 
> -----Original Message-----
> From: Mark Smith [mailto:markzzzsmith at yahoo.com.au] 
> Sent: Sunday, 14 April 2013 1:32 PM
> To: Nathan Nogic; ausnog at lists.ausnog.net
> Subject: Re: [AusNOG] GLBP Forwarder Weighting Question
> 
> 
> 
> 
>> ________________________________
>>  From: Nathan Nogic <nathan at manageddatasolutions.com.au>
>> To: ausnog at lists.ausnog.net
>> Sent: Saturday, 13 April 2013 7:05 PM
>> Subject: [AusNOG] GLBP Forwarder Weighting Question
>> 
>> 
>> 
>> Hi guys,
>> 
>> Breaking with recent tradition, I thought I’d throw a network question out
>> to the group to see if anyone has had a similar experience or can point me at
>> some documentation other than the Cisco articles on how GLBP AVFs should work.
>> 
>> The long and short of it is that the AVFs seem to be ignoring the weighting
>> when deciding which connection to send outbound traffic through. I know we could
>> drop one of the AVFs by decrementing its value below the lower threshold but
>> that would mean removing some of the load balancing and availability options.
>> 
>> The setup is a GLBP group of two routers each with their own IP transit
>> connection. One router & link has a weighting of 10 and the other
>> router/link has a weighting of 200 with the load balancing methodology set to
>> weighted. In theory, the router with the weighting of 200 should be taking 95
>> odd % of the traffic for that subnet.
>> 
>> The config is as follows:
>> 
>> * Downstream devices are configured to hit the VIP, not the individual
>> router IPs.
>> * The ARP table on the router with the weighting of 200 shows that it is
>> picking up most of the IP addresses in the subnet, as expected, as it is
>> also the AVG.
>> * The router with the weighting of 200 is shown as the Active AVF and the
>> router with the weighting of 10 is in listen state (aware that a listen AVF
>> will still forward traffic).
>> * The router with the weighting of 10 has an ARP table that does not show
>> any IPs from the subnet.
>> * Forwarder pre-emption is enabled.
>> * Bouncing the router with the weighting of 10 has the traffic redirect
>> through the other link, but once the AVF comes up (even in listen state) it
>> then starts transmitting traffic again.
>> * Netflow reporting shows that the router with the weighting of 10 is
>> transmitting a lot of data that should be going through the other router.
>> 
>> My question is whether it’s just luck that the small number of IP addresses
>> allocated to the AVF with the weighting of 10 happen to have noticeable traffic
>> or if there is some other behaviour that would explain why an AVF that should be
>> getting virtually no traffic seems to be sending out a lot of traffic for that
>> subnet?
>> 
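
On the "just luck" possibility: as far as I understand GLBP, the weighting controls how the AVG hands out forwarder MAC addresses in ARP replies, so the 10/200 split applies to hosts, not to bytes. A rough Python sketch of what that can mean in practice follows. The host names, the assignment and the traffic volumes are all invented for illustration, and nothing here is measured from the network being discussed.

```python
def host_share(weights):
    """Fraction of ARP replies (so, roughly, of hosts) each forwarder should get."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def traffic_share(host_to_forwarder, host_bytes):
    """Fraction of bytes each forwarder carries, given per-host traffic volumes."""
    totals = {}
    for host, forwarder in host_to_forwarder.items():
        totals[forwarder] = totals.get(forwarder, 0) + host_bytes[host]
    grand_total = sum(totals.values())
    return {forwarder: b / grand_total for forwarder, b in totals.items()}


weights = {"r10": 10, "r200": 200}
shares = host_share(weights)  # r200 is handed out for 200/210 of the ARP replies

# 21 hypothetical hosts: 20 are pointed at r200, one is pointed at r10.
hosts = [f"host{i}" for i in range(21)]
assignment = {h: "r200" for h in hosts}
assignment["host0"] = "r10"

# Arbitrary units: twenty light users and one heavy talker (say, a busy server).
volumes = {h: 10 for h in hosts}
volumes["host0"] = 500

result = traffic_share(assignment, volumes)
print(shares)
print(result)
```

With these made-up numbers the Weight 10 router serves one host in 21 but carries most of the bytes, so a single busy host behind it would be enough to produce the Netflow picture described.
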
>> The other working theory is that it’s ARP / client device affinity rather
>> than a GLBP issue.
>> 
>> Happy to get thoughts by direct email rather than to the entire list.
>> 
>> Don't have any experience with GLBP, but have a rough understanding of
>> how it works, so the following are some random ideas/questions:
>> 
>> 
>> Firstly, what is your topology? Specifically, are the two routers also
>> directly attached to each other, or are they running a routing protocol
>> between each other on the GLBP protected link, as well as a common upstream
>> network? Assuming in your Netflow point you're talking about traffic
>> outbound from the GLBP protected subnet, a possible explanation is that your
>> Weight 10 router, even though it isn't preferred by GLBP, is preferred by
>> the Weight 200 router to reach the rest of the network via a direct link, or
>> back out via the GLBP protected link. That is, the traffic is coming from the
>> hosts to the Weight 200 router because of GLBP, then the Weight 200 router looks
>> up its route table and finds that the Weight 10 router is a better path towards
>> the packets' destination, and then forwards packets towards the Weight 10 router,
>> either via the direct link between them, or back out the interface that the
>> Weight 200 router just received the traffic over.
>> 
>> 
>> Secondly (this is a bit of a stab in the dark), have you disabled
>> Proxy ARP on the GLBP interfaces? ARP always listens to the most
>> recently received ARP Reply, so a possible explanation for why you're
>> seeing no traffic on the Weight 10 router is that the Weight 200 router
>> is not only sending an ARP reply on behalf of the Weight 10 router to
>> get some traffic to go via the Weight 10 router for the purposes of
>> GLBP, but the Weight 200 router is also sending an unrelated ARP reply
>> with its own MAC address for the default gateway address. If this
>> non-GLBP ARP reply always arrives later than the GLBP reply, then all
>> traffic will go through the Weight 200 router. You could verify that
>> something like this is happening by running a packet sniffer on one of
>> the hosts, clearing the host's ARP cache, pinging the default gateway and
>> then seeing if two ARP replies are received by the host. The ARP reply
>> Ethernet source address will identify which device(s) are emitting the
>> multiple ARP replies. Multiple ARP replies, with the later one always being
>> the Weight 200 router, would fit a number of the symptoms you've described
>> above. I'm speculating Proxy ARP only because it is a default for Cisco
>> routers (always switch it off by default; it causes more trouble than it is
>> worth, and if you need it you'll know about it and what you're going to use
>> it to do: you'll be using it to implement "Transparent Subnet Gateways"
>> (RFC1027)), and it may be acting independently of GLBP.
>> 
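
The sniffer check described above comes down to "did the host see more than one MAC answer for the gateway IP, and which one arrived last?". A small Python sketch of that analysis follows, assuming the ARP replies have already been exported from a capture as (timestamp, sender IP, sender MAC) tuples. The tuple layout, the addresses and the MACs are illustrative only, not taken from any real capture.

```python
from collections import defaultdict


def replies_by_ip(arp_replies):
    """Group ARP replies by sender IP, listing sender MACs in arrival order."""
    by_ip = defaultdict(list)
    for _timestamp, sender_ip, sender_mac in sorted(arp_replies):
        by_ip[sender_ip].append(sender_mac)
    return dict(by_ip)


def conflicting_ips(arp_replies):
    """Keep only the IPs that were answered with more than one distinct MAC."""
    return {
        ip: macs
        for ip, macs in replies_by_ip(arp_replies).items()
        if len(set(macs)) > 1
    }


# Made-up capture: the gateway 192.0.2.1 is answered twice with different MACs.
replies = [
    (0.001, "192.0.2.1", "00:07:b4:00:01:02"),   # GLBP-style virtual MAC (illustrative)
    (0.004, "192.0.2.1", "00:11:22:33:44:55"),   # a router's own MAC (illustrative)
    (0.010, "192.0.2.50", "00:aa:bb:cc:dd:01"),  # ordinary host, single reply
]

conflicts = conflicting_ips(replies)
for ip, macs in conflicts.items():
    # The host's ARP cache ends up holding the last MAC it heard.
    print(ip, "answered by", macs, "cache ends up with", macs[-1])
```

If the gateway address shows up here with two MACs and the last one is always the same router's own address, that would match the double-reply theory.
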
>> 
>> 
>> 
>> HTH,
>> mark.
>> 
>> 
> 


