[AusNOG] VPLS OSPF question

Tue Apr 23 07:20:53 EST 2013

----- Original Message -----
> From: Johann Lo <Johann.Lo at aptel.com.au>
> To: Mark Smith <markzzzsmith at yahoo.com.au>; "ausnog at lists.ausnog.net" <ausnog at lists.ausnog.net>
> Cc: 
> Sent: Thursday, 18 April 2013 8:13 AM
> Subject: RE: [AusNOG] VPLS OSPF question
> 
> I'll bite.
> 
> Ability to summarise per adjacency/interface i.e. exactly where you want it not 
> just on an ABR. <--- BIG ONE in certain scenarios

While not disagreeing that could be useful, I'm curious about some example scenarios. I haven't really come across any where having to summarise 'anywhere' within your topology isn't a sign that there might be other, deeper problems that are being papered over. Sometimes papering over problems makes them much bigger ones in the future.

> Ability to designate spoke sites (instead of using stub/NSSA which is only 
> possible if bordering area 0).

Reading a bit into your statement, a router by itself could be a single area, and areas don't have to be directly adjacent to the backbone area: they can be logically connected to the backbone via an OSPF virtual link (and forwarding is still optimal, it is only the LSAs that are tunnelled to the backbone area via the virtual link). Some vendors have implemented "totally stubby areas", which means the area just uses a default route to reach routes in other areas as well as external routes. If your implementation doesn't support totally stubby areas, it may be possible to use the ABR route filtering capability to achieve the same thing.

> Smaller routing tables / overhead on spoke sites (no need for full LSA database 
> of entire area). Though this is somewhat moot given current hardware, a lot of 
> OSPF design literature seems to be 10 years old and assuming C2600-era capacity.

Actually, the paper quoted by the OP (50 routers, 200 links in an area) is a Cisco 2500 era paper. For those who aren't aware, Cisco 2500s typically had 0.5MB of RAM and a 20MHz 68020 CPU, the same as in the first Apple Macintosh IIRC. As a bit of trivia, IIRC it was written by Bassam Halabi of "Internet Routing Architectures" fame.

> Ability to adjust or filter per neighbour metrics without the downsides of 
> explicitly specifying OSPF NBMA mode (which incidentally eliminates DR/BDR so 
> requires full mesh neighbour adjacencies, do you want 200 sites in a full 
> neighbour mesh, er probably not).
> 

That would be a current limitation of most if not all OSPF implementations. RFC6845, "OSPF Hybrid Broadcast and Point-to-Multipoint Interface Type" (January 2013) addresses that limitation in OSPF, but of course whether you can use it depends on what your implementation supports.

What is a bit interesting is that in the context of the VPLS scenario, the VPLS is hiding the actual costs of sending traffic to VPLS sites, by making them all appear to be equally reachable. I'd argue that you shouldn't have to fiddle with metrics in this scenario, because by using a VPLS, you are in effect paying the carrier who is providing that service to you to both take care of and hide from you the actual details of the network's underlying topology and its utilisation. Unfortunately you start getting exposed to some of this complexity if you have redundant links off of the VPLS to your sites and they're of different capacities. A topology you construct yourself using point-to-point VLLs or physical links would better expose the network topology information and allow easier setting of metrics.

> Not saying EIGRP would necessarily be the best solution (but intuitively in an 
> all cisco shop, if you're basically running a star topology then EIGRP stub 
> makes great sense for all the little spoke sites).
> 
> All the guys suggesting really complex solutions, do you really think this is 
> the right fit for the scenario described (ENTERPRISE environment).
> 

There is another drawback to using a standard OSPF broadcast multi-access model for the VPLS. From each of the routers' points of view, for a destination behind (only) one of the other routers, the remaining 198 routers each provide a backup path to that destination. For example, Router A wants a path to a destination behind (only) Router Z. Router C also provides a path to that destination, because Router A can forward to Router C and then Router C can forward to Router Z, and the same applies to Routers D, E, F, G etc. In other words, from Router A's point of view, the paths it has available to the destination behind Z are (A -> Z, A -> C -> Z, A -> D -> Z, A -> E -> Z etc.), i.e. 199 of them. SPF will ensure that, as A -> Z is the shortest path, it will be picked. However, on each SPF run it will need to evaluate (I think, my memory of how SPF works is getting a bit rusty) an average of half of the candidate paths before it finds that A -> Z is the best. Remembering that if there is a topology change, all of the 200 routers do an SPF run, that may result in a reasonable processing load on the routers' control planes, and depending on how powerful they are, it might impact reconvergence time.
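
The counting argument above can be checked with a rough Python sketch. It is an illustration only, under stated assumptions: the 200 routers are modelled as a logical full mesh with an equal cost of 1 (a real OSPF broadcast segment is represented in the LSDB through the DR's network LSA, not as a full mesh), and the names full_mesh, candidate_paths and dijkstra are made up for this example.

```python
import heapq

def full_mesh(n, cost=1):
    """Adjacency map for n routers that all reach each other at the same cost."""
    return {u: {v: cost for v in range(n) if v != u} for u in range(n)}

def candidate_paths(graph, src, dst):
    """The direct path plus every two-hop path via one other router."""
    paths = [(src, dst)]
    paths += [(src, mid, dst) for mid in graph[src] if mid != dst]
    return paths

def dijkstra(graph, src):
    """Plain Dijkstra; returns (distances, number of links examined)."""
    dist = {src: 0}
    done = set()
    heap = [(0, src)]
    examined = 0
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u].items():
            examined += 1
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist, examined

# Router A is node 0, Router Z is node 199 (hypothetical numbering).
mesh = full_mesh(200)
paths = candidate_paths(mesh, 0, 199)
dist, examined = dijkstra(mesh, 0)

print(len(paths))   # 199 candidate paths from A to Z
print(dist[199])    # 1, i.e. the direct A -> Z path wins
print(examined)     # 39800 links examined in one run (200 * 199)
```

The link count is for a single run on a single router under this simplified model; it says nothing about how a particular vendor's SPF implementation orders or prunes its work.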

If some of your VPLS sites had redundant access links to the VPLS, you now have the possibility of Equal Cost Multiple Paths, and that further increases the SPF run time, because SPF doesn't just stop when it has found the first shortest path (or shortest path first ;-) ). As supporting ECMP is a default on Ciscos (and likely in other vendors' implementations of OSPF), the SPF runs for this 200 site VPLS are going to be relatively long. On a Cisco you can change that by specifying 'maximum-paths 1' under router ospf.
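
A small Python sketch of the ECMP point, again as an illustration under assumptions: the topology is a made-up four-router example in which site Z has two routers (Z1, Z2) on the VPLS in front of one LAN (Zlan), all link costs are 1, and the max_paths argument only models a cap on how many equal-cost next hops get installed. It makes no claim about how any given implementation's SPF run time changes.

```python
import heapq

def spf_next_hops(graph, src, max_paths=None):
    """Dijkstra that records every equal-cost first hop per destination.

    max_paths caps how many first hops are kept (None keeps them all).
    """
    dist = {src: 0}
    hops = {src: set()}
    done = set()
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u].items():
            nd = d + w
            # A neighbour of src is its own first hop; otherwise inherit from u.
            via = {v} if u == src else hops[u]
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                hops[v] = set(via)
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v] and v not in done:
                hops[v] |= via  # equal-cost path found: remember it too
    if max_paths is not None:
        hops = {k: set(sorted(v)[:max_paths]) for k, v in hops.items()}
    return dist, hops

# Hypothetical topology: A, C, Z1 and Z2 share the VPLS; Zlan sits behind Z1 and Z2.
graph = {
    "A":    {"C": 1, "Z1": 1, "Z2": 1},
    "C":    {"A": 1, "Z1": 1, "Z2": 1},
    "Z1":   {"A": 1, "C": 1, "Z2": 1, "Zlan": 1},
    "Z2":   {"A": 1, "C": 1, "Z1": 1, "Zlan": 1},
    "Zlan": {"Z1": 1, "Z2": 1},
}

dist, hops = spf_next_hops(graph, "A")
_, capped = spf_next_hops(graph, "A", max_paths=1)

print(dist["Zlan"])            # 2
print(sorted(hops["Zlan"]))    # ['Z1', 'Z2'], two equal-cost next hops
print(sorted(capped["Zlan"]))  # ['Z1'], only one kept when capped at 1
```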

Regards,
Mark.