[AusNOG] SPAM-MED: Re: Vocus international service outage

Kristoffer Sheather @ CloudCentral kristoffer.sheather at cloudcentral.com.au
Wed Aug 27 00:29:18 EST 2014


Yep :)
  

----------------------------------------
 From: "Skeeve Stevens" <skeeve+ausnog at eintellegonetworks.com>
Sent: Wednesday, August 27, 2014 12:24 AM
To: "Wolfgang Nagele (AusRegistry)" <wolfgang.nagele at ausregistry.com.au>
Cc: "Kristoffer Sheather" <kris at cloudcentral.com.au>, "James Spenceley" <james at iroute.org>, "Ausnog at ausnog.net" <ausnog at ausnog.net>
Subject: Re: [AusNOG] SPAM-MED: Re: Vocus international service outage   
 Actually no.. Telstra has had issues like this.  Dodo was dead for many more hours the other day... much traffic here? Nope.  
 Global problems are a fact of life as a service provider.
  
 I've always said SLAs are meaningless marketing documents.... get back a few bucks - maybe...   You need to plan your own networks to survive the regular problems that occur on this Internet thing.  Failing to plan means you are planning to fail.
  
 You seem to have suffered, I acknowledge that... and you want answers.... but those answers are unlikely to change anything.  You are not entitled to be privy to the network engineering issues of any provider apart from yourself.
  
 There was a screw-up.. Are people curious? Sure... but those of us who have been around for a long time, actually assume that someone has had their ass kicked (if it was a person or design fault), or if it was just a cascading error, then the situation will be noted, and if the risks of it happening again are high, they will plan for it.  If it is a perfect storm, then .. deal with it... get over it.. move on.  I wouldn't expect anyone, not even Telstra to go spending up big to deal with a random set of events.  We learn from these things.
  
 If you have an issue with Vocus, talk to your account manager... go to a different provider, but I am not sure how wandering around with a placard in public asking what happened - after the fact - helps in any way.  Shit happened... everyone knows it..

     
...Skeeve
  
  Skeeve Stevens - eintellego Networks Pty Ltd
  skeeve at eintellegonetworks.com ; www.eintellegonetworks.com  

Phone: 1300 239 038; Cell +61 (0)414 753 383 ; skype://skeeve   

facebook.com/eintellegonetworks ; linkedin.com/in/skeeve       

twitter.com/theispguy ; blog: www.theispguy.com   

The Experts Who The Experts Call    
 Juniper - Cisco - Cloud - Consulting - IPv4 Brokering

   On 26 August 2014 23:58, Wolfgang Nagele (AusRegistry) <wolfgang.nagele at ausregistry.com.au> wrote:

 Hi,
  
 This is your choice. Not mine and I am sure there are others here that do not agree with your idea of brushing issues like this off the table. If this were Telstra you would be striking a different tone.
  
 It's irrelevant if you have multiple upstreams - there are minimum requirements that we put on suppliers in the year 2014. Having Vocus suffer a 4-hour degradation due to a single DC fault in SJC is not something that we accept in 2014. There is redundancy via LA (and Singapore) which didn't work. I would like to know why - not why it didn't fail over automatically, as there can be many reasons for that. Vocus engineers should have been able to re-route via LA and completely take SJC out of the equation. There are questions to be answered here, as well as about delays in notifications - we did not receive a notification for over an hour. Again, not acceptable for an incident of that magnitude.
  
 We all learn based on mistakes - ignoring them gains nothing.
  
 As for multiple upstreams, yes we have them and yes we routed around the issue. To me that's irrelevant to the issue at hand. If you are happy with a supplier that has a 4-hour degradation due to a single DC fault on its main international backhaul - yes - move along, nothing to see here.
  
 Cheers,
 Wolfgang
  
  On 8/26/14, 11:40 PM, "Skeeve Stevens" <skeeve+ausnog at eintellegonetworks.com> wrote:

  
    Yup.. move along, nothing to see here.    
   Once an outage is fixed, those who dwell on the cause that they can do nothing about, are focusing in the wrong place.
    
   If your only transit was through a single upstream, that is where you should be focusing, not the provider.

     
...Skeeve
  

   On 26 August 2014 23:19, Kristoffer Sheather @ CloudCentral <kristoffer.sheather at cloudcentral.com.au> wrote:

 Shit broke, they fixed.
  
 <EOM />
  

----------------------------------------
 From: "Wolfgang Nagele (AusRegistry)" <wolfgang.nagele at ausregistry.com.au>
Sent: Tuesday, August 26, 2014 11:17 PM
To: "James Spenceley" <james at iroute.org>
Cc: "Ausnog at ausnog.net" <ausnog at ausnog.net>
Subject: SPAM-MED: Re: [AusNOG] Vocus international service outage    
   Hi James,
    
   Still waiting for the RfO on this whole thing with the details. Haven't seen one here, nor as a follow-up to customer notifications.
    
   Cheers,
   Wolfgang
    
    On 8/24/14, 12:38 AM, "Wolfgang Nagele (AusRegistry)" <wolfgang.nagele at ausregistry.com.au> wrote:

    
      Hi James,
  
 Hmm - can understand that, but I would have expected that there is sufficient redundancy in the LA landing of your network. I would have expected that a re-route and taking SJC largely out of the equation would be possible. Surprised, to say the least.
  
 Cheers,
 Wolfgang
  
  On 8/24/14, 12:10 AM, "James Spenceley" <james at iroute.org> wrote:

  
    Early mail is that a power surge in a US DC has damaged both core routers. Wouldn't surprise me if transport from other providers out of that building is having similar issues.
  
 Circuits are being moved directly to borders as we speak.
  

Sent from my iPhone

On 24 Aug 2014, at 0:00, Jared Hirst <jared.hirst at serversaustralia.com.au> wrote:
 
    WOW.... It has taken 2 hours to get remote hands to the DC with what seems to be a device with no redundancy?
  

2014/08/23 13:55 UTC

Engineers are currently awaiting remote support in the US. Links will be physically moved from the failed device in order to restore services on an alternate device.

     On Sat, Aug 23, 2014 at 11:29 PM, Andrew Yager <andrew at rwts.com.au> wrote:

[hijacking the thread.]
  
They say there is a big rewrite coming to the way rpd and sampled interact in 14.2, and the slow convergence issues have been fixed in more releases than I care to remember right now, but people say they are pretty good in the 12.3r6 train. Our MX80s are slated for upgrade to that at some stage in the next few months.
   Some noise about this on j-nsp again today.    
   Andrew      
  

       On 23 August 2014 23:23, Jonathan Thorpe  <jthorpe at conexim.com.au> wrote:      

True, but they otherwise work exceptionally well.  

   

I'm not sure what kind of PowerPC CPU is doing all the work on an MX80's RE, but I do sometimes wonder if the CPU on a <$40 Raspberry Pi might be more up to the job :-P  

     

From: Tony Wicks [mailto:tony at wicks.co.nz]
Sent: Saturday, 23 August 2014 11:13 PM
To: Jonathan Thorpe
Cc: 'Ausnog at ausnog.net'
Subject: RE: [AusNOG] Vocus international service outage 

   

Well, if you buy the big chassis boxes...  

     

From: AusNOG [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of Jonathan Thorpe
Sent: Sunday, 24 August 2014 1:10 a.m.
To: Andrew Yager; Jared Hirst
Cc: Ausnog at ausnog.net
Subject: Re: [AusNOG] Vocus international service outage 

   

Glad I'm not the only one holding my breath on our MXs :)  

   

From: AusNOG [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of Andrew Yager
Sent: Saturday, 23 August 2014 10:55 PM
To: Jared Hirst
Cc: Ausnog at ausnog.net
Subject: Re: [AusNOG] Vocus international service outage  

    

We've done the same (about 20 minutes ago).   

  

Right now I hate how long Juniper MXs take to stabilise their routing table with sampling on.

  

Andrew 

    

On 23 August 2014 22:41, Jared Hirst <jared.hirst at serversaustralia.com.au> wrote:     

We have just turned Vocus off. Using other providers for now, as the flapping is causing it to go up and down. 

    

On Sat, Aug 23, 2014 at 10:37 PM, Daniel Watson <Daniel at glovine.com.au> wrote:     

Indeed seeing some big drops in gaming traffic at present. Normally we see above 60 Mbit/s on weekend evenings, but we're not even seeing 40 Mbit/s at present :S

   

   

Regards,  

Daniel Watson  

Network Administrator / Network Operations Manager  

   

E Daniel at GloVine.com.au  

W www.GloVine.com.au  

   

From: AusNOG [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of Jared Hirst
Sent: Saturday, 23 August 2014 10:35 PM   

To: Andrew Cox
Cc: Ausnog at ausnog.net
Subject: Re: [AusNOG] Vocus international service outage 

     

Yeah we are seeing this! Everything running via them is flapping. They claim to have 'routed around it' but that's not the case. Very frustrating as it's been an hour and no one there seems to know what's going on....

    

On Sat, Aug 23, 2014 at 10:29 PM, Andrew Cox <andrew.cox at bigair.net.au> wrote:    

Hey All,

Just saw the dashboards light up with connectivity issues internationally for Vocus services and thought I'd make others aware.

Vocus outage report is saying: "core network link between 59 Doody Street, Alexandria and 55 South Market Street, San Jose has failed" which hopefully isn't a Southern Cross fault!

Anyone else seeing this or have more info?   

  

Cheers, 

Andrew 

_______________________________________________
AusNOG mailing list
AusNOG at lists.ausnog.net
http://lists.ausnog.net/mailman/listinfo/ausnog  

    

  


    

  

--
Andrew Yager, Managing Director   MACS (Snr) CP BCompSc MCP
Real World Technology Solutions Pty Ltd - IT people you can trust
ph: 1300 798 718 or (02) 9037 0500
fax: (02) 9037 0591
http://www.rwts.com.au/ 

   
      
      

   

