[AusNOG] Telstra mobile down "nationwide"

James Morgan james.morgan at vernet.com.au
Wed Feb 10 14:43:54 EST 2016


Many internally are wondering why he would have done that during the day, given that this is what is expected to happen.

From: AusNOG [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of Shane Short
Sent: Wednesday, 10 February 2016 2:37 PM
To: Clay Quinn <cquinn at mrv.com>
Cc: ausnog at lists.ausnog.net
Subject: Re: [AusNOG] Telstra mobile down "nationwide"

So from what I've been reading, apparently an engineer restarted an MME without manually migrating the customers to another MME first, which I assume then booted a heap of them off the network? Apparently the registration storm from that then overloaded the remaining MMEs, which then caused all of the things to break?

I hear they're blaming the engineer because he didn't migrate customers over to another MME, but what if they had an actual failure in the MME? It sounds like the network isn't designed to handle the resulting authentication (over)load and completely falls over-- so as much as the engineer triggered the fault, I'm not sure it's entirely his fault that it then went to complete shit?
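
As a rough illustration of the cascade described above, here is a toy model in Python. Everything in it is an assumption made for the sake of the sketch: the function name simulate_restart, the even spread of re-attach attempts, and the attach_rate and overload_limit figures are invented, and real MME pools have backoff timers and overload controls that this ignores.

```python
def simulate_restart(mme_count, ues_per_mme, attach_rate, overload_limit,
                     max_rounds=1000):
    """Toy model: restart one MME without draining it first.

    Returns (surviving_mme_count, rounds_simulated). All parameters are
    hypothetical; none come from a real network.
    """
    attached = [ues_per_mme] * mme_count  # UEs currently served per MME
    alive = [True] * mme_count

    # The restarted MME drops every UE it was serving.
    alive[0] = False
    pending = attached[0]
    attached[0] = 0

    rounds = 0
    while pending > 0 and any(alive) and rounds < max_rounds:
        rounds += 1
        survivors = [i for i, up in enumerate(alive) if up]
        # Re-attach attempts are spread evenly over the surviving MMEs.
        offered = -(-pending // len(survivors))  # ceiling division
        if offered > overload_limit:
            # Overloaded MMEs fall over and dump their own UEs into the storm.
            for i in survivors:
                alive[i] = False
                pending += attached[i]
                attached[i] = 0
        else:
            # Each survivor admits at most attach_rate UEs per round.
            for i in survivors:
                served = min(offered, attach_rate, pending)
                attached[i] += served
                pending -= served
    return sum(alive), rounds
```

With four MMEs of 1000 UEs each, the same restart is absorbed when the survivors tolerate 500 attach attempts per round and takes the whole pool down when they tolerate only 300, which is the point being made: the trigger matters less than the headroom.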

(I'll admit I'm very green on the UTRAN/E-UTRA stuff, but it's something I have immense curiosity about. If someone can correct me, or point me in the right direction if I'm wrong here, I'd be most appreciative.)

-Shane

Clay Quinn wrote:

Can confirm it was indeed an MME failure…

Cheers
Clay

From: AusNOG [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of Narelle
Sent: Tuesday, 9 February 2016 6:23 PM
To: Joe Saxton <Joe.Saxton at workforce.com.au>
Cc: ausnog at lists.ausnog.net
Subject: Re: [AusNOG] Telstra mobile down "nationwide"


Well, it is possible for a 4G HSS (HLR in the 3G world) to go down. That report however makes me think it might have been an MME... Given it was national, however, I'm thinking HSS. Though with millions of devices disconnecting and re-registering, the traffic load cascades phenomenally and all sorts of fault behaviour will appear.

Given the person in the interview didn't identify the "node", it still isn't clear exactly what went wrong. http://servicestatus.telstra.com/ doesn't really give enough clues either.

In the old 3G networks the RNC couldn't be set up in a redundant configuration, so if you didn't have enough, or one failed, you couldn't redirect traffic from all the connected base stations. Now you can with MMEs, and that would be consistent with this description.

But - it sounds more like a call server issue (CSCF) if it is affecting some fixed networks. Also it is national. Call servers you deploy more centrally.

Then again, the spokesperson also says "one of the nodes used to manage voice and data traffic between devices and the network started to malfunction" - so again I'm thinking MME...


Narelle Clark


PS - 000 is a mapping rather than something embedded in the system. You condition your network for that sort of local feature. Personally I'd always go for 112 on a mobile.


On Tue, Feb 9, 2016 at 5:18 PM, Joe Saxton <Joe.Saxton at workforce.com.au> wrote:
No doubt this would have been human error. With all the redundant systems in place, an outage like this, lasting this long, is more likely to be human error.

From: AusNOG [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of Cameron Murray
Sent: Tuesday, 9 February 2016 4:45 PM
To: James Gray <james at gray.net.au>
Cc: ausnog at lists.ausnog.net
Subject: Re: [AusNOG] Telstra mobile down "nationwide"

http://www.9news.com.au/technology/2016/02/09/13/11/reports-of-telstra-mobile-services-outage

Keep backing up the bus...

On Tue, Feb 9, 2016 at 2:54 PM, James Gray <james at gray.net.au> wrote:
In addition to the traditional swiss-army knife known as "telnet", getting "curl" to dump just the HTTP headers is also a handy one to keep up your sleeve:
0:>curl -I http://triplezero.com.au
HTTP/1.1 200 OK
Connection: close
Date: Tue, 09 Feb 2016 04:45:18 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Content-Type: text/html; charset=UTF-8

Also, telnet won't work with SSL sites (i.e. https), but if you have openssl installed, you can do this instead:
0:>openssl s_client -quiet -connect www.google.com:443

The openssl method also works on other SSL-enabled services like POP3S and IMAPS etc. I have it aliased in my shell config:
alias stelnet="openssl s_client -quiet -connect"

...then all I need to do is: "stelnet host:port"

Just good to have tucked away in case you need to break out the command-line hammer.
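
For anyone who would rather script these checks than type them, the curl -I and stelnet ideas above can be sketched in Python using only the standard library's http.client. The function names split_target and head are invented for this example, the 5 second timeout is an arbitrary assumption, and IPv6 literals are not handled.

```python
import http.client


def split_target(target, default_port=443):
    """Split 'host:port' (the form the stelnet alias takes) into (host, port).

    Falls back to default_port when no port is given. Hypothetical helper;
    does not handle IPv6 literals such as [::1]:443.
    """
    host, sep, port = target.rpartition(":")
    if not sep:
        return target, default_port
    return host, int(port)


def head(target, use_tls=True, path="/", timeout=5.0):
    """Send a HEAD request, roughly what `curl -I` does.

    Returns (status, reason, headers). With use_tls=True the certificate
    is verified using Python's default settings.
    """
    host, port = split_target(target, 443 if use_tls else 80)
    if use_tls:
        conn = http.client.HTTPSConnection(host, port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.reason, resp.getheaders()
    finally:
        conn.close()
```

Usage would be along the lines of head("triplezero.com.au", use_tls=False) for plain HTTP or head("www.google.com:443") for HTTPS. Unlike openssl s_client, this only speaks HTTP, so it is no help for POP3S or IMAPS.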

Cheers,

James

On 9 February 2016 at 15:15, Ross Wheeler <ausnog at rossw.net> wrote:


On Tue, 9 Feb 2016, Shane Chrisp wrote:
Yep, that was just my fat fingers. I am trying to get to the triplezero.com.au but no go.

traceroute to www.triplezero.gov.au (115.178.104.72), 30 hops max, 60 byte
...
11  bundle-ether2.civ.core2.canberra.telstra.net (203.50.6.82)  71.834 ms  71.845 ms  70.450 ms
12  Bundle-Ethernet1.civ-edge901.canberra.telstra.net (203.50.8.35)  69.126 ms  69.138 ms  69.042 ms
13  telstr1248.lnk.telstra.net (165.228.21.206)  68.991 ms  68.981 ms  68.926 ms
14  * * *
15  * * *
16  * * *


Traceroute is only one tool in a toolbox, and frequently not as helpful as you might hope.

 5  bundle-ether2.chw-edge902.sydney.telstra.net (203.50.11.105)  2.002 ms
 6  bundle-ether2.dkn-core1.canberra.telstra.net (203.50.6.129)  8.918 ms
 7  bundle-ether2.civ.core2.canberra.telstra.net (203.50.6.82)  9.334 ms
 8  Bundle-Ethernet1.civ-edge901.canberra.telstra.net (203.50.8.35)  7.822 ms
 9  telstr1248.lnk.telstra.net (165.228.21.206)  9.189 ms
10  *
11  *

Yes, it appears not to be reachable....

However, using another tool, it clearly IS working....

# telnet triplezero.gov.au 80
Trying 2403:d500::48...
telnet: connect to address 2403:d500::48: No route to host
Trying 115.178.104.72...
Connected to triplezero.gov.au.
Escape character is '^]'.
GET / http/1.0

HTTP/1.1 403 Forbidden
Cache-Control: no-cache



There's also a hint there....
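
The telnet check above can be reproduced with a few lines of Python, as a minimal sketch. The function names raw_get and parse_status_line are invented here, and the timeout is an arbitrary assumption. Note two deliberate differences from the session as typed: the sketch sends an upper-case "HTTP/1.0" and adds a Host header, so a server may answer it differently.

```python
import socket


def parse_status_line(line):
    """Split an HTTP status line into (version, status_code, reason)."""
    parts = line.strip().split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return parts[0], int(parts[1]), reason


def raw_get(host, port=80, path="/", timeout=5.0):
    """Open a TCP connection, send a minimal GET, return the parsed status line.

    A reply at all shows the service is reachable on that port, even when
    traceroute probes towards the same address go unanswered.
    """
    # create_connection tries each resolved address in turn, much as telnet
    # fell back from the IPv6 address to the IPv4 one in the session above.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        data = b""
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    if not data:
        raise ConnectionError("connection closed before any response arrived")
    first_line = data.split(b"\r\n", 1)[0].decode("iso-8859-1")
    return parse_status_line(first_line)
```

Calling raw_get("triplezero.gov.au") would be the equivalent of the session above; what it returns depends on the server at the time.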

_______________________________________________
AusNOG mailing list
AusNOG at lists.ausnog.net
http://lists.ausnog.net/mailman/listinfo/ausnog





--


Narelle
narellec at gmail.com
