[AusNOG] Telstra mobile down "nationwide"

TWIG Solutions support at twig.com.au
Thu Feb 11 10:55:05 EST 2016


Reminds me of the old DataTac network. Same thing occurred there: if an 
RNG failed and the switch-over to the standby also failed for some reason, 
the resulting registration storm caused the network to fall over for 18 
hours.  I ended up writing an app to command userland devices to 
stagger/hold off registration based on (last digit of serial number x 
time in seconds).

Solved the issue nicely.
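
For anyone curious, the stagger rule above can be sketched roughly like 
this in Python. This is not the original app, just an illustration: the 
function names, the sample serial format and the 30-second slot are all 
made up here, and the real time multiplier may well have been different.

```python
from collections import Counter

DIGITS = "0123456789"


def registration_delay(serial: str, slot_seconds: int = 30) -> int:
    """Seconds a device should hold off before re-registering.

    Rule from the post: last digit of the serial number x a time in seconds.
    slot_seconds is a hypothetical value chosen only for illustration.
    """
    digits = [ch for ch in serial if ch in DIGITS]
    if not digits:
        raise ValueError(f"serial {serial!r} contains no digit")
    return int(digits[-1]) * slot_seconds


def registration_schedule(serials, slot_seconds: int = 30) -> Counter:
    """Count how many devices would attempt registration at each delay."""
    return Counter(registration_delay(s, slot_seconds) for s in serials)


if __name__ == "__main__":
    # Hypothetical fleet of 1000 devices with sequential serial numbers.
    fleet = [f"DT{n:06d}" for n in range(1000)]
    for delay, count in sorted(registration_schedule(fleet).items()):
        print(f"t+{delay:>3}s: {count} devices")
```

With evenly distributed serials this splits the fleet into ten groups, so 
the network only sees about a tenth of the registrations in any one slot 
instead of everything at once.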



On 10/02/2016 2:37 PM, Shane Short wrote:
> So from what I've been reading apparently an engineer restarted an MME 
> without migrating the customers manually to another MME, which I 
> assume then booted a heap of them off the network? Apparently the 
> registration storm from that then overloaded the remaining MMEs which 
> then caused all of the things to break?
>
> I hear they're blaming the engineer because he didn't migrate 
> customers over to another MME, but what if they had an actual failure 
> in the MME? It sounds like the network isn't designed to handle the 
> resulting authentication (over)load and completely falls over-- so as 
> much as the engineer triggered the fault, I'm not sure it's entirely 
> his fault that it then went to complete shit?
>
> (I'll admit I'm very green on the UTRAN/E-UTRA stuff, but it's 
> something I have immense curiosity about, if someone can correct me, 
> or point me in the right direction if I'm wrong here, I'd be much 
> appreciative)
>
> -Shane
>
> Clay Quinn wrote:
>>
>> Can confirm it was indeed an MME failure…
>>
>> Cheers
>>
>> Clay
>>
>> From: AusNOG [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of 
>> Narelle
>> Sent: Tuesday, 9 February 2016 6:23 PM
>> To: Joe Saxton <Joe.Saxton at workforce.com.au>
>> Cc: ausnog at lists.ausnog.net
>> Subject: Re: [AusNOG] Telstra mobile down "nationwide"
>>
>> Well, it is possible for a 4G HSS (HLR in the 3G world) to go down. 
>> That report however makes me think it might have been an MME... Given 
>> it was national, however, I'm thinking HSS. Though with millions of 
>> devices disconnecting and re-registering the traffic load cascades 
>> phenomenally and all sorts of fault behaviour will appear.
>>
>> Given the person in the interview didn't identify the "node", it 
>> still isn't clear exactly what went wrong at all. 
>> http://servicestatus.telstra.com/ doesn't really give enough clues at 
>> all.
>>
>> In the old 3G networks the RNC couldn't be set up in a redundant 
>> configuration, so if you didn't have enough, or one failed, you 
>> couldn't redirect traffic from all the connected base stations. Now 
>> you can with MMEs, and that would be consistent with this description.
>>
>> But - it sounds more like a call server issue (CSCF) if it is 
>> affecting some fixed networks. Also it is national. Call servers you 
>> deploy more centrally.
>>
>> Then again, the spokesperson also says "one of the nodes used to 
>> manage voice and data traffic between devices and the network started 
>> to malfunction" - so again I'm thinking MME...
>>
>> Narelle Clark
>>
>> PS - 000 is a mapping rather than something embedded in the system. 
>> You condition your network for that sort of local feature. Personally 
>> I'd always go for 112 on a mobile.
>>
>> On Tue, Feb 9, 2016 at 5:18 PM, Joe Saxton 
>> <Joe.Saxton at workforce.com.au> wrote:
>>
>>     No doubt this would have been human error. With all the
>>     redundant systems in place, an outage like this, and this
>>     long, is more likely human error.
>>
>>     From: AusNOG [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of
>>     Cameron Murray
>>     Sent: Tuesday, 9 February 2016 4:45 PM
>>     To: James Gray <james at gray.net.au>
>>     Cc: ausnog at lists.ausnog.net
>>     Subject: Re: [AusNOG] Telstra mobile down "nationwide"
>>
>>     http://www.9news.com.au/technology/2016/02/09/13/11/reports-of-telstra-mobile-services-outage
>>
>>     Keep backing up the bus...
>>
>>     On Tue, Feb 9, 2016 at 2:54 PM, James Gray <james at gray.net.au>
>>     wrote:
>>
>>         In addition to the traditional swiss-army knife known as
>>         "telnet", getting "curl" to dump just the HTTP headers is
>>         also a handy one to keep up your sleeve:
>>
>>         0:>curl -I http://triplezero.com.au
>>         HTTP/1.1 200 OK
>>         Connection: close
>>         Date: Tue, 09 Feb 2016 04:45:18 GMT
>>         Server: Microsoft-IIS/6.0
>>         X-Powered-By: ASP.NET
>>         Content-Type: text/html; charset=UTF-8
>>
>>         Also, telnet won't work with SSL sites (i.e. https), but if you
>>         have openssl installed, you can do this instead:
>>
>>         0:>openssl s_client -quiet -connect www.google.com:443
>>
>>         The openssl method also works on other SSL-enabled services
>>         like POP3S and IMAPS etc. I have it aliased in my shell config:
>>         alias stelnet="openssl s_client -quiet -connect"
>>
>>         ...then all I need to do is: "stelnet host:port"
>>
>>         Just good to have tucked away in case you need to break out
>>         the command-line hammer.
>>
>>         Cheers,
>>
>>         James
>>
>>         On 9 February 2016 at 15:15, Ross Wheeler <ausnog at rossw.net>
>>         wrote:
>>
>>
>>
>>             On Tue, 9 Feb 2016, Shane Chrisp wrote:
>>
>>                 Yep, that was just my fat fingers. I am trying to get
>>                 to the triplezero.com.au but no go.
>>
>>                 traceroute to www.triplezero.gov.au (115.178.104.72), 30
>>                 hops max, 60 byte
>>
>>             ...
>>
>>                 11 bundle-ether2.civ.core2.canberra.telstra.net (203.50.6.82)  71.834 ms 71.845 ms  70.450 ms
>>                 12 Bundle-Ethernet1.civ-edge901.canberra.telstra.net (203.50.8.35) 69.126 ms 69.138 ms  69.042 ms
>>                 13 telstr1248.lnk.telstra.net (165.228.21.206)  68.991 ms 68.981 ms 68.926 ms
>>                 14  * * *
>>                 15  * * *
>>                 16  * * *
>>
>>
>>
>>             Traceroute is only one tool in a toolbox, and frequently
>>             not as helpful as you might hope.
>>
>>              5 bundle-ether2.chw-edge902.sydney.telstra.net (203.50.11.105)  2.002 ms
>>              6 bundle-ether2.dkn-core1.canberra.telstra.net (203.50.6.129)  8.918 ms
>>              7 bundle-ether2.civ.core2.canberra.telstra.net (203.50.6.82)  9.334 ms
>>              8 Bundle-Ethernet1.civ-edge901.canberra.telstra.net (203.50.8.35)  7.822 ms
>>              9 telstr1248.lnk.telstra.net (165.228.21.206)  9.189 ms
>>             10  *
>>             11  *
>>
>>             Yes, it appears to not be reachable....
>>
>>             However using another tool  it clearly IS working....
>>
>>             # telnet triplezero.gov.au 80
>>             Trying 2403:d500::48...
>>             telnet: connect to address 2403:d500::48: No route to host
>>             Trying 115.178.104.72...
>>             Connected to triplezero.gov.au.
>>             Escape character is '^]'.
>>             GET / http/1.0
>>
>>             HTTP/1.1 403 Forbidden
>>             Cache-Control: no-cache
>>
>>
>>
>>             There's also a hint there....
>>
>>
>>             _______________________________________________
>>             AusNOG mailing list
>>             AusNOG at lists.ausnog.net
>>             http://lists.ausnog.net/mailman/listinfo/ausnog
>>
>>
>>
>>
>>
>>
>>
>>
>> -- 
>>
>>
>>
>> Narelle
>> narellec at gmail.com
>>
>>
>
>
>
