[AusNOG] SPAM-MED: Re: Vocus international service outage

Skeeve Stevens skeeve+ausnog at eintellegonetworks.com
Wed Aug 27 00:23:42 EST 2014


Actually, no. Telstra has had issues like this.  Dodo was dead for many
more hours the other day... was there much traffic here about it? Nope.

Global problems are a fact of life as a service provider.

I've always said SLAs are meaningless marketing documents.... get back a
few bucks - maybe...   You need to plan your own networks to survive the
regular problems that occur on this Internet thing.  Failing to plan means
you are planning to fail.

You seem to have suffered, I acknowledge that... and you want answers....
but those answers are unlikely to change anything.  You are not entitled to
be privy to the network engineering issues of any provider apart from
yourself.

There was a screw-up.. Are people curious? Sure... but those of us who have
been around for a long time actually assume that someone has had their ass
kicked (if it was a person or design fault), or if it was just a cascading
error, then the situation will be noted, and if the risks of it happening
again are high, they will plan for it.  If it is a perfect storm, then ..
deal with it... get over it.. move on.  I wouldn't expect anyone, not even
Telstra, to go spending up big to deal with a random set of events.  We
learn from these things.

If you have an issue with Vocus, talk to your account manager... go to a
different provider, but I am not sure how wandering around with a placard
in public asking what happened, after the fact, helps in any way.  Shit
happened... everyone knows it..


...Skeeve

*Skeeve Stevens - *eintellego Networks Pty Ltd
skeeve at eintellegonetworks.com ; www.eintellegonetworks.com

Phone: 1300 239 038; Cell +61 (0)414 753 383 ; skype://skeeve

facebook.com/eintellegonetworks ;  <http://twitter.com/networkceoau>
linkedin.com/in/skeeve

twitter.com/theispguy ; blog: www.theispguy.com


The Experts Who The Experts Call
Juniper - Cisco - Cloud - Consulting - IPv4 Brokering


On 26 August 2014 23:58, Wolfgang Nagele (AusRegistry) <
wolfgang.nagele at ausregistry.com.au> wrote:

>  Hi,
>
>  This is your choice. Not mine and I am sure there are others here that
> do not agree with your idea of brushing issues like this off the table. If
> this were Telstra you would be striking a different tone.
>
>  It’s irrelevant if you have multiple upstreams - there are minimum
> requirements that we put on suppliers in the year 2014. Having Vocus suffer
> a 4 hour degradation due to a single DC fault in SJC is not something that
> we accept in 2014. There is redundancy via LA (and Singapore) which didn’t
> work. I would like to know why. Not why it didn’t fail over automatically;
> there can be many reasons for that. Vocus engineers should have been able to
> re-route via LA and completely take SJC out of the equation. There are
> questions to be answered here. As well as delays in notifications - we have
> not received a notification for over an hour. Again not acceptable for an
> incident of that magnitude.
>
>  We all learn based on mistakes - ignoring them gains nothing.
>
>  As for multiple upstreams, yes we have them and yes we routed around the
> issue. To me that’s irrelevant to the issue at hand. If you are happy with
> a supplier that has 4 hour degradation due to a single DC fault on its
> main international backhaul - yes - move along, nothing to see here.
>
>  Cheers,
> Wolfgang
>
>   On 8/26/14, 11:40 PM, "Skeeve Stevens" <
> skeeve+ausnog at eintellegonetworks.com> wrote:
>
>   Yup.. move along, nothing to see here.
>
>  Once an outage is fixed, those who dwell on a cause that they can do
> nothing about are focusing in the wrong place.
>
>  If your only transit was through a single upstream, that is where you
> should be focusing, not the provider.
>
>
> ...Skeeve
>
> On 26 August 2014 23:19, Kristoffer Sheather @ CloudCentral <
> kristoffer.sheather at cloudcentral.com.au> wrote:
>
>>  Shit broke, they fixed.
>>
>> <EOM />
>>
>> ------------------------------
>> *From*: "Wolfgang Nagele (AusRegistry)" <
>> wolfgang.nagele at ausregistry.com.au>
>> *Sent*: Tuesday, August 26, 2014 11:17 PM
>> *To*: "James Spenceley" <james at iroute.org>
>> *Cc*: "Ausnog at ausnog.net" <ausnog at ausnog.net>
>> *Subject*: SPAM-MED: Re: [AusNOG] Vocus international service outage
>>
>> Hi James,
>>
>> Still waiting for the RfO on this whole thing with the details. Haven't
>> seen one here or as a follow-up to customer notifications.
>>
>> Cheers,
>> Wolfgang
>>
>>  On 8/24/14, 12:38 AM, "Wolfgang Nagele (AusRegistry)" <
>> wolfgang.nagele at ausregistry.com.au> wrote:
>>
>>
>>  Hi James,
>>
>> Hmm - can understand that but would have expected that there is
>> sufficient redundancy in the LA landing of your network. I would have
>> expected that a re-route and taking SJC largely out of the equation would
>> be possible. Surprised to say the least …
>>
>> Cheers,
>> Wolfgang
>>
>>  On 8/24/14, 12:10 AM, "James Spenceley" <james at iroute.org> wrote:
>>
>>
>>  Early mail is that a power surge in a US DC has damaged both core routers.
>> Wouldn't surprise me if transport from other providers out of that building
>> is having similar issues.
>>
>> Circuits are being moved directly to borders as we speak.
>>
>>
>>
>> Sent from my iPhone
>>
>> On 24 Aug 2014, at 0:00, Jared Hirst <jared.hirst at serversaustralia.com.au>
>> wrote:
>>
>>
>>   WOW.... It has taken 2 hours to get remote hands to the DC with what
>> seems to be a device with no redundancy?
>>
>>   2014/08/23 13:55 UTC
>>
>> Engineers are currently awaiting remote support in the US. Links will be
>> physically moved from the failed device in order to restore services on an
>> alternate device.
>>
>> On Sat, Aug 23, 2014 at 11:29 PM, Andrew Yager <andrew at rwts.com.au>
>> wrote:
>>>
>>>  [hijacking the thread…]
>>>
>>> They say there is a big rewrite coming on the way rpd and sampled
>>> interact in 14.2; and the slow convergence issues have been fixed in more
>>> releases than I care to remember right now… but people say they are pretty
>>> good in the 12.3r6 train. Our MX80s are slated for upgrade to that at some
>>> stage in the next few months.
>>>
>>>  Some noise about this on j-nsp again today.
>>>
>>>  Andrew
>>>
>>>
>>>
>>> On 23 August 2014 23:23, Jonathan Thorpe <jthorpe at conexim.com.au> wrote:
>>>>
>>>>  True, but they otherwise work exceptionally well.
>>>>
>>>>
>>>>
>>>> I’m not sure what kind of PowerPC CPU is doing all the work on an
>>>> MX80’s RE, but I do sometimes wonder if the CPU on a <$40 Raspberry Pi
>>>> might be more up to the job :-P
>>>>
>>>>
>>>>
>>>> *From:* Tony Wicks [mailto:tony at wicks.co.nz]
>>>> *Sent:* Saturday, 23 August 2014 11:13 PM
>>>> *To:* Jonathan Thorpe
>>>> *Cc:* 'Ausnog at ausnog.net'
>>>> *Subject:* RE: [AusNOG] Vocus international service outage
>>>>
>>>>
>>>>
>>>> Well, if you buy the big chassis boxes…..
>>>>
>>>>
>>>>
>>>> *From:* AusNOG [mailto:ausnog-bounces at lists.ausnog.net
>>>> <ausnog-bounces at lists.ausnog.net>] *On Behalf Of *Jonathan Thorpe
>>>> *Sent:* Sunday, 24 August 2014 1:10 a.m.
>>>> *To:* Andrew Yager; Jared Hirst
>>>> *Cc:* Ausnog at ausnog.net
>>>> *Subject:* Re: [AusNOG] Vocus international service outage
>>>>
>>>>
>>>>
>>>> Glad I’m not the only one holding my breath on our MXs :)
>>>>
>>>>
>>>>
>>>> *From:* AusNOG [mailto:ausnog-bounces at lists.ausnog.net
>>>> <ausnog-bounces at lists.ausnog.net>] *On Behalf Of *Andrew Yager
>>>> *Sent:* Saturday, 23 August 2014 10:55 PM
>>>> *To:* Jared Hirst
>>>> *Cc:* Ausnog at ausnog.net
>>>> *Subject:* Re: [AusNOG] Vocus international service outage
>>>>
>>>>
>>>>
>>>> We've done the same (about 20 minutes ago).
>>>>
>>>>
>>>>
>>>> Right now I hate how long Juniper MXs take to stabilise their routing
>>>> table with sampling on.
>>>>
>>>>
>>>>
>>>> Andrew
>>>>
>>>>
>>>>
>>>> On 23 August 2014 22:41, Jared Hirst <
>>>> jared.hirst at serversaustralia.com.au> wrote:
>>>>
>>>>  We have just turned Vocus off. Using other providers for now, as the
>>>> flapping is causing it to go up and down.
>>>>
>>>>
>>>>
>>>> On Sat, Aug 23, 2014 at 10:37 PM, Daniel Watson <Daniel at glovine.com.au>
>>>> wrote:
>>>>
>>>>  Indeed seeing some big drops in gaming traffic at present. Normally
>>>> we see above 60 Mbit/s on weekend evenings, but we're not even seeing
>>>> 40 Mbit/s at present :S
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Daniel Watson
>>>>
>>>> Network Administrator / Network Operations Manager
>>>>
>>>>
>>>>
>>>> E Daniel at GloVine.com.au
>>>>
>>>> W www.GloVine.com.au
>>>>
>>>>
>>>>
>>>> *From:* AusNOG [mailto:ausnog-bounces at lists.ausnog.net] *On Behalf Of *Jared Hirst
>>>> *Sent:* Saturday, 23 August 2014 10:35 PM
>>>> *To:* Andrew Cox
>>>> *Subject:* Re: [AusNOG] Vocus international service outage
>>>>
>>>>
>>>>
>>>> Yeah we are seeing this! Everything running via them is flapping, they
>>>> claim to have 'routed around it' but that's not the case. Very frustrating
>>>> as it's been an hour and no one there seems to know what's going on....
>>>>
>>>>
>>>>
>>>> On Sat, Aug 23, 2014 at 10:29 PM, Andrew Cox <andrew.cox at bigair.net.au>
>>>> wrote:
>>>>
>>>>  Hey All,
>>>>
>>>> Just saw the dashboards light up with connectivity issues
>>>> internationally for Vocus services and thought I'd make others aware.
>>>>
>>>> Vocus outage report is saying: "core network link between 59 Doody
>>>> Street, Alexandria and 55 South Market Street, San Jose has failed" which
>>>> hopefully isn't a Southern Cross fault!
>>>>
>>>> Anyone else seeing this or have more info?
>>>>
>>>>
>>>>
>>>> Cheers,
>>>>
>>>> Andrew
>>>>
>>>>
>>>> _______________________________________________
>>>> AusNOG mailing list
>>>> AusNOG at lists.ausnog.net
>>>> http://lists.ausnog.net/mailman/listinfo/ausnog
>>>> --
>>>> *Andrew Yager, Managing Director*   *MACS (Snr) CP BCompSc MCP*
>>>> Real World Technology Solutions Pty Ltd - IT people you can trust
>>>> ph: 1300 798 718 or (02) 9037 0500
>>>> fax: (02) 9037 0591
>>>> http://www.rwts.com.au/
>>>>
>