[AusNOG] Optus downtime chat + affecting SMS verification to Telstra?
Reuben Farrelly
reuben-ausnog at reub.net
Tue Nov 14 13:35:43 AEDT 2023
There is sometimes the option of having the session reset and then
automatically restarted at a specified interval after the limit has been
triggered. Vendor dependent of course, but the option exists in IOS XE at
least, and most likely in other vendors' implementations too. This allows
for an automatic recovery after too many prefixes have been received.
That probably would have saved a lot of site visits for Optus once the
root cause of the prefixes was fixed at the edge.
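
To make the three behaviours being discussed concrete (tear the session
down, warn only, or tear down and restart after an interval), here is a
rough Python sketch. It is a toy model only, not any vendor's
implementation: the class name MaxPrefixPeer, its fields and the returned
strings are all made up for illustration, and time is just an integer
number of seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MaxPrefixPeer:
    """Toy model of a BGP peer with a maximum-prefix limit (hypothetical)."""
    limit: int                            # maximum prefixes accepted from the peer
    warning_only: bool = False            # log a warning instead of tearing down
    restart_after: Optional[int] = None   # seconds; None = stay down until cleared
    established: bool = True
    prefixes: int = 0
    down_since: Optional[int] = None

    def receive(self, count: int, now: int) -> str:
        """Process `count` newly received prefixes at time `now`."""
        if not self.established:
            return "ignored (session down)"
        self.prefixes += count
        if self.prefixes <= self.limit:
            return "accepted"
        if self.warning_only:
            return "accepted with warning"
        # Default behaviour: drop the session and everything learned over it.
        self.established = False
        self.prefixes = 0
        self.down_since = now
        return "session torn down"

    def tick(self, now: int) -> bool:
        """Bring the session back if the restart interval has elapsed.

        Returns True if the session is up after this check.
        """
        if (not self.established
                and self.restart_after is not None
                and self.down_since is not None
                and now - self.down_since >= self.restart_after):
            self.established = True
            self.down_since = None
        return self.established
```

With restart_after set, the peer comes back by itself once the interval
has passed; with it left as None, somebody has to clear the session by
hand, which is where the site visits come in.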
Then there is the unanswered question of where an Out Of Band management
network fitted into this picture, which likely would also have provided a
get-out-of-jail-free card much earlier in the day.
Reuben
On 14/11/2023 1:27 pm, John Edwards wrote:
> The default behaviour of the "maximum prefix" BGP feature is to bring
> down the BGP session with the peer.
>
> The alternate behaviour is to log a warning and keep accepting prefixes.
>
> I am not aware of an implementation that just allows "Accept up to X
> routes and then don't accept any more".
>
> That sounds logical but in reality would lead to inconsistent behaviour
> that is more readily addressed with existing routing policy tools.
>
> It appears that a failure of routing policy was a major contributor to
> an Optus outage, where that policy had an assumption of
> trusting internal peers and the fault was exacerbated by some mechanism
> where a policy failure was able to impact other logical networks on the
> same device (assuming there is/was more than 1 logical network).
>
> Or maybe someone just leaked full routes into OSPF 🫠
>
> John
>
>
> _______________________________________________
> AusNOG mailing list
> AusNOG at lists.ausnog.net
> https://lists.ausnog.net/mailman/listinfo/ausnog