[AusNOG] Best practices on speeding up BGP convergence times

Rhys Hanrahan rhys at nexusone.com.au
Mon Feb 26 12:23:48 EST 2018


Hi Everyone,

I’ve been looking at improving our BGP configuration lately, and I would like to check whether I’m missing anything obvious for speeding up BGP convergence (particularly inbound convergence) with our transit providers during failover. I understand that BGP convergence on the internet is never going to be perfect, but I am trying to tune things as best I can. For reference, we are using Cisco ASR1001-Xs, though I’m mostly interested in general best practices that other ISPs use, which I can then adapt to our network.

I’m also curious whether my expectations of minimising convergence times with transit peers are realistic.

Right now, I am seeing 20-30 second outage windows when failing over my announced prefixes from one transit provider to another. I can understand this when transitioning between transits, but I see the same thing when failing over between primary and secondary peering sessions with a single AS / transit provider, which is disappointing. My hope was to have almost no interruption where we have multiple links with a given transit provider, and a small convergence window (maybe 2-5s?) when transitioning prefixes from one transit to another.

With iBGP it seems like there are lots of options, and sub-second convergence should be achievable fairly easily. eBGP is where the options become more limited and it is harder to improve the situation.

For iBGP I can do:


  *   BFD
  *   BGP Multipath – I haven’t tested, but I assume having multiple paths in the FIB will speed up failover convergence.
  *   BGP Best External
  *   Add Path
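
As a rough illustration of why BFD is at the top of both lists: failure detection time is a simple product of the negotiated interval and the detect multiplier, whereas without BFD a silent failure is only noticed when the BGP hold timer expires. The sketch below assumes asynchronous-mode BFD as described in RFC 5880; the 300 ms x 3 timers and the 180 second hold time are illustrative assumptions, not values taken from our configuration.

```python
def bfd_detection_time_ms(local_min_rx_ms: int,
                          remote_min_tx_ms: int,
                          remote_multiplier: int) -> int:
    """Approximate BFD detection time in asynchronous mode.

    The negotiated receive interval is the slower of what we are willing
    to receive and what the peer wants to send; the session is declared
    down after `remote_multiplier` consecutive missed packets.
    """
    negotiated_interval_ms = max(local_min_rx_ms, remote_min_tx_ms)
    return negotiated_interval_ms * remote_multiplier


# Assumption: a 180 s hold time, used here only for comparison.
BGP_HOLD_TIME_S = 180

if __name__ == "__main__":
    detect_ms = bfd_detection_time_ms(300, 300, 3)  # hypothetical timers
    print(f"BFD detects a dead peer in about {detect_ms} ms")
    print(f"Hold timer alone could take up to {BGP_HOLD_TIME_S} s")
```

Note that this only covers detection. The time for the withdrawal to propagate through the upstream and beyond is separate, and is not something BFD can shorten.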

For eBGP I can do:


  *   BFD (If supported by the upstream – I have this on all peers)
  *   Advertisement Interval – I have set this to 0 (doesn’t make a major difference, but helps a little)
  *   BGP Multipath (if supported by the upstream – unfortunately my upstream requires that the primary/secondary paths be enforced on their side via localpref, so I can’t leverage this).
  *   AS Path prepending of the same prefixes instead of announcing less/more specific prefixes at different sites seems to help.

I haven’t found any other commonly accepted methods of announcing a backup path to eBGP peers.

We are using Equinix Connect as our main transit in Sydney, where we have primary and secondary links between us and Equinix, and Vocus as our main transit in Melbourne. The intention is to fail over all our announced prefixes between sites as required, by leveraging AS Path prepending.
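
To make the prepending idea concrete, here is a minimal sketch of how a remote network might choose between the two announcements. It models only the first two steps of BGP best-path selection (highest local preference, then shortest AS path) and ignores everything after that. The AS numbers are from the documentation range and the `Path` type and `best_path` function are hypothetical names for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Path:
    via: str                  # label for the site the path enters through
    local_pref: int           # set by the receiving network, not by us
    as_path: Tuple[int, ...]  # AS path as seen by the receiving network


def best_path(paths: Iterable[Path]) -> Path:
    """Pick the highest local-pref, then the shortest AS path."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))


ORIGIN_AS = 64496  # hypothetical origin AS

# Same prefix announced at both sites; the backup site prepends twice.
primary = Path("sydney", 100, (64500, ORIGIN_AS))
backup = Path("melbourne", 100, (64501, ORIGIN_AS, ORIGIN_AS, ORIGIN_AS))

if __name__ == "__main__":
    print(best_path([primary, backup]).via)  # primary preferred
    print(best_path([backup]).via)           # backup used once primary is withdrawn
```

Because the backup path is already known to remote networks, failover only needs the primary to be withdrawn rather than a new prefix to be propagated. The sketch also shows the limitation: a network that sets a higher local preference on the prepended path will still prefer it, since local preference is compared before AS path length.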

Are there any other techniques or best practices I am missing that would help reduce downtime in the event of a router or BGP session failure?

Appreciate any insights you can offer, and hope this proves to be a useful and interesting discussion for others.

Thanks!

Rhys Hanrahan
Chief Information Officer
Nexus One Pty Ltd

E: support at nexusone.com.au
P: +61 2 9191 0606
W: http://www.nexusone.com.au/
M: PO Box 127, Royal Exchange NSW 1225
A: Level 10 307 Pitt St, Sydney NSW 2000
