[AusNOG] Telstra Network Down

Mark Smith markzzzsmith at gmail.com
Thu Feb 2 18:50:59 EST 2017


On 2 Feb. 2017 4:30 pm, "Chad Kelly" <chad at cpkws.com.au> wrote:

On 2/2/2017 3:19 PM, ausnog-request at lists.ausnog.net wrote:

> Of course when people say we have 2 core data centers, this should imply no
> data center is allowed to run over 50% capacity. It's odd/strange that 3
> active core data centers should sound so unorthodox, yet this is the only
> way to assure you can run your DCs at 65% and handle a DC going black. Begs
> the question why 4 active core DCs isn't standard architecture for core
> national infrastructure (which would assure high availability under 75%
> load), and 2x efficient in idle infrastructure.
>

I like your idea in theory.


It's not theory. At one of the ISPs I've worked for we scaled out BRASes
this way. As you add units of capacity, the spare capacity each of the
other units must hold in reserve to cover a single unit failure shrinks.
It works when you can divide your problem up into smaller sub-problems and
distribute them across a pool.
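
A rough sketch of the arithmetic behind this, assuming equal-sized units and
a failed unit's load spreading evenly across the survivors (neither is
guaranteed in a real network). The function names are made up for
illustration. With N units and one tolerated failure, each unit can run at
most (N - 1) / N of its capacity:

```python
from fractions import Fraction


def max_safe_utilisation(units: int, failures: int = 1) -> Fraction:
    """Highest per-unit load at which the surviving units can still carry
    everything after `failures` units are lost.

    Assumes identical units and perfectly even redistribution of load.
    """
    if units <= failures:
        raise ValueError("need more units than tolerated failures")
    # Total load n*u must fit into the (n - failures) surviving units.
    return Fraction(units - failures, units)


def spare_fraction(units: int, failures: int = 1) -> Fraction:
    """Share of each unit's capacity that has to sit idle as redundancy."""
    return 1 - max_safe_utilisation(units, failures)


if __name__ == "__main__":
    for n in (2, 3, 4, 8):
        u = max_safe_utilisation(n)
        print(f"{n} units: run each at <= {float(u):.1%}, "
              f"idle reserve {float(spare_fraction(n)):.1%}")
```

Under those assumptions, 2 units gives 50%, 3 gives about 66.7% and 4 gives
75%, which lines up with the figures quoted above.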

The argument sometimes used against it is that there are more devices to
manage. True, however that is tractable with config templates, automation
and device management platforms ("software defined networks"). The problem
of managing many devices is not a new one if you've spent any time
managing fleets of desktop PCs.







But building data centres costs money and a significant amount of it.




You get what you pay for. If you need high availability, you need to be
prepared to pay the price of it. If you can't afford the price, then it is
likely your availability requirements are greater than they really need to
be. Put a dollar cost against the consequence of a failure, and you might
find you really do need to pay the price of the HA you want.
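
One way to do that comparison, as a minimal sketch only. Every figure
passed in is a placeholder you would have to supply yourself, and the
function names are hypothetical. It treats outage cost as a simple
frequency x duration x hourly-cost product, which ignores reputational
damage, SLA penalties and the like:

```python
def expected_annual_outage_cost(outages_per_year: float,
                                hours_per_outage: float,
                                cost_per_hour: float) -> float:
    """Expected yearly dollar cost of outages (simple product model)."""
    return outages_per_year * hours_per_outage * cost_per_hour


def ha_is_justified(annual_ha_cost: float,
                    outage_cost_without_ha: float,
                    outage_cost_with_ha: float) -> bool:
    """True when the outage cost avoided exceeds what the HA costs per year."""
    avoided = outage_cost_without_ha - outage_cost_with_ha
    return avoided > annual_ha_cost


if __name__ == "__main__":
    # Placeholder inputs, not real data: one 8-hour outage every two years
    # at $100k/hour without HA, versus a 24-minute outage with HA.
    without_ha = expected_annual_outage_cost(0.5, 8, 100_000)
    with_ha = expected_annual_outage_cost(0.5, 0.4, 100_000)
    print(ha_is_justified(250_000, without_ha, with_ha))
```

If the answer comes out False with honest inputs, that suggests the
availability target is higher than the business actually needs.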

If you can't afford to build DCs, you rent space in other people's to meet
your availability goals.



I remember when the Warrnambool exchange fire occurred, a discussion was
had around fire suppression and the lack of it in a critical exchange for
regional Victoria.

Which raises the question: did they have appropriate levels of fire
suppression equipment installed?

No good having multiple lots of equipment if it's not being protected from
fire properly.



A better architecture is one where a facility fire has a far smaller impact.

Your unit of expansion is your potential unit of failure. Larger units of
expansion, larger consequences of failure.

Regards,
Mark.





Regards Chad.




-- 
Chad Kelly
Manager
CPK Web Services
web www.cpkws.com.au
phone 03 9013 4853

