[AusNOG] Telstra mobile down "nationwide"

Damien Gardner Jnr rendrag at rendrag.net
Wed Feb 10 14:56:34 EST 2016


That would make sense, as my personal phone and the EFTPOS terminal in the
store I was in at the time both lost Telstra, whereas my work phone was
working fine.  When I got back to work, some folks had service and some
didn't, and after rebooting my personal phone, it got service back as well.
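
To put rough numbers on the "dead server in a load-balanced cluster"
explanation quoted below, here is a minimal Python sketch. The pool sizes and
the even round-robin spread are assumptions for illustration only; nothing in
this thread says how many MMEs Telstra runs or how devices are distributed
across them.

```python
def affected_fraction(pool_size, faulty_nodes=1):
    """Fraction of devices landing on a faulty node, assuming an even spread."""
    if pool_size <= 0 or not 0 <= faulty_nodes <= pool_size:
        raise ValueError("need pool_size > 0 and 0 <= faulty_nodes <= pool_size")
    return faulty_nodes / pool_size


def simulate(devices, pool_size, faulty=frozenset({0})):
    """Assign devices round-robin to nodes; return the share stuck on a faulty node."""
    stuck = sum(1 for d in range(devices) if d % pool_size in faulty)
    return stuck / devices


if __name__ == "__main__":
    # Hypothetical pool sizes, chosen only to show the trend.
    for n in (4, 8, 10):
        print(n, "nodes:", f"{affected_fraction(n):.1%} of devices affected")
```

Under these assumptions, one bad node in a pool of 7 to 10 leaves roughly
10-15% of devices without service, the same order as the figure in the press
releases. That is a consistency check on the explanation, not evidence of the
real pool size.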

On 10 February 2016 at 14:54, Tim Raphael <raphael.timothy at gmail.com> wrote:

> I would agree with this,
>
> Various releases noted that 10-15% of customers were affected, as it was
> not a complete network outage. This sounds a lot like the behaviour you
> would expect from a "faulty node in the cluster" type scenario.
>
> - Tim
>
> On Wed, Feb 10, 2016 at 11:44 AM, Clay Quinn <cquinn at mrv.com> wrote:
>
>> Apparently customers were still being routed through the faulty node,
>> essentially the equivalent of having a dead server in a load-balanced
>> cluster (which results in a percentage of connections not establishing).
>> So it sounds like the faulty MME was brought back into service
>> (accidentally).  I’m not an expert; this is just the explanation offered to
>> me by a Telstra employee familiar with the situation…
>>
>>
>>
>> *From:* Shane Short [mailto:shane at short.id.au]
>> *Sent:* Wednesday, 10 February 2016 2:37 PM
>> *To:* Clay Quinn <cquinn at mrv.com>
>> *Cc:* ausnog at lists.ausnog.net
>> *Subject:* Re: [AusNOG] Telstra mobile down "nationwide"
>>
>>
>>
>> So from what I've been reading apparently an engineer restarted an MME
>> without migrating the customers manually to another MME, which I assume
>> then booted a heap of them off the network? Apparently the registration
>> storm from that then overloaded the remaining MMEs, which then caused all
>> of the things to break?
>>
>> I hear they're blaming the engineer because he didn't migrate customers
>> over to another MME, but what if they had an actual failure in the MME? It
>> sounds like the network isn't designed to handle the resulting
>> authentication (over)load and completely falls over-- so as much as the
>> engineer triggered the fault, I'm not sure it's entirely his fault that it
>> then went to complete shit?
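
Shane's question can be framed as a headroom check: if one node goes away and
its subscribers land on the rest, do the survivors stay under capacity? Below
is a toy Python sketch with made-up numbers. The function name and the
per-node load and capacity figures are hypothetical, and real MME signalling
load during a re-attach storm is bursty and far more complicated than a
single steady figure.

```python
def survivors_after_loss(nodes, load_per_node, capacity_per_node, lost=1):
    """Return how many nodes are still in service once the load settles.

    Toy model: the total load is spread evenly over whatever nodes remain.
    While the per-node share exceeds capacity, one more node drops out,
    which only raises the share for the rest.
    """
    if nodes <= 0 or not 0 <= lost <= nodes:
        raise ValueError("need nodes > 0 and 0 <= lost <= nodes")
    total_load = nodes * load_per_node
    remaining = nodes - lost
    while remaining > 0 and total_load / remaining > capacity_per_node:
        remaining -= 1  # an overloaded node fails and sheds its load
    return remaining


if __name__ == "__main__":
    # 8 nodes at 70% of capacity: losing one leaves the other 7 at 80%.
    print(survivors_after_loss(8, 70, 100))
    # 8 nodes at 90% of capacity: losing one pushes the rest past 100%.
    print(survivors_after_loss(8, 90, 100))
```

In this model a pool either absorbs the loss or collapses completely, because
every further failure increases the load on the remaining nodes. If the real
network behaves anything like that, the trigger (restart or genuine fault)
matters less than how much headroom the pool had.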
>>
>> (I'll admit I'm very green on the UTRAN/E-UTRA stuff, but it's something
>> I have immense curiosity about. If someone can correct me, or point me in
>> the right direction if I'm wrong here, I'd be much appreciative.)
>>
>> -Shane
>>
>> Clay Quinn wrote:
>>
>> Can confirm it was indeed an MME failure…
>>
>>
>>
>> Cheers
>>
>> Clay
>>
>>
>>
>> *From:* AusNOG [mailto:ausnog-bounces at lists.ausnog.net] *On Behalf Of *Narelle
>> *Sent:* Tuesday, 9 February 2016 6:23 PM
>> *To:* Joe Saxton <Joe.Saxton at workforce.com.au>
>> *Cc:* ausnog at lists.ausnog.net
>> *Subject:* Re: [AusNOG] Telstra mobile down "nationwide"
>>
>>
>>
>>
>>
>> Well, it is possible for a 4G HSS (HLR in the 3G world) to go down. That
>> report however makes me think it might have been an MME... Given it was
>> national, however, I'm thinking HSS. Though with millions of devices
>> disconnecting and re-registering, the traffic load cascades phenomenally
>> and all sorts of fault behaviour will appear.
>>
>>
>>
>> Given the person in the interview didn't identify the "node", it still
>> isn't clear exactly what went wrong at all.
>> http://servicestatus.telstra.com/ doesn't really give enough clues at
>> all.
>>
>>
>>
>> In the old 3G networks the RNC couldn't be set up in a redundant
>> configuration, so if you didn't have enough, or one failed, you couldn't
>> redirect traffic from all the connected base stations. Now you can with
>> MMEs, and that would be consistent with this description.
>>
>>
>>
>> But - it sounds more like a call server issue (CSCF) if it is affecting
>> some fixed networks. Also it is national. Call servers you deploy more
>> centrally.
>>
>>
>>
>> Then again, the spokesperson also says "one of the nodes used to manage
>> voice and data traffic between devices and the network started to
>> malfunction" - so again I'm thinking MME...
>>
>>
>>
>>
>>
>> Narelle Clark
>>
>>
>>
>>
>>
>> PS - 000 is a mapping rather than something embedded in the system. You
>> condition your network for that sort of local feature. Personally I'd
>> always go for 112 on a mobile.
>>
>>
>>
>>
>>
>> On Tue, Feb 9, 2016 at 5:18 PM, Joe Saxton <Joe.Saxton at workforce.com.au>
>> wrote:
>>
>> No doubt this would have been human error. With all the redundant
>> systems in place, an outage like this, lasting this long, is more likely
>> to be human error.
>>
>>
>>
>> *From:* AusNOG [mailto:ausnog-bounces at lists.ausnog.net] *On Behalf Of *Cameron
>> Murray
>> *Sent:* Tuesday, 9 February 2016 4:45 PM
>> *To:* James Gray <james at gray.net.au>
>> *Cc:* ausnog at lists.ausnog.net
>> *Subject:* Re: [AusNOG] Telstra mobile down "nationwide"
>>
>>
>>
>>
>> http://www.9news.com.au/technology/2016/02/09/13/11/reports-of-telstra-mobile-services-outage
>>
>>
>>
>> Keep backing up the bus...
>>
>>
>>
>> On Tue, Feb 9, 2016 at 2:54 PM, James Gray <james at gray.net.au> wrote:
>>
>> In addition to the traditional swiss-army knife known as "telnet",
>> getting "curl" to dump just the HTTP headers is also a handy one to keep up
>> your sleeve:
>>
>> 0:> curl -I http://triplezero.com.au
>> HTTP/1.1 200 OK
>> Connection: close
>> Date: Tue, 09 Feb 2016 04:45:18 GMT
>> Server: Microsoft-IIS/6.0
>> X-Powered-By: ASP.NET
>> Content-Type: text/html; charset=UTF-8
>>
>>
>>
>> Also, telnet won't work with SSL sites (i.e., https), but if you have
>> openssl installed, you can do this instead:
>>
>> 0:> openssl s_client -quiet -connect www.google.com:443
>>
>>
>>
>> The openssl method also works on other SSL-enabled services like POP3S and
>> IMAPS etc. I have it aliased in my shell config:
>> alias stelnet="openssl s_client -quiet -connect"
>>
>>
>>
>> ...then all I need to do is: "stelnet host:port"
>>
>>
>>
>> Just good to have tucked away in case you need to break out the
>> command-line hammer.
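
The same header check can be scripted when you want it in a monitoring job
rather than at a prompt. This is a minimal Python sketch using the standard
library's http.client; the function name fetch_headers is hypothetical, and
it is a rough equivalent of "curl -I", not a replacement for it.

```python
import http.client


def fetch_headers(host, port=None, path="/", use_tls=False, timeout=5.0):
    """Send a HEAD request and return (status, reason, headers), with no body.

    port=None lets http.client pick the default (80, or 443 when use_tls=True).
    """
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        # getheaders() returns a list of (name, value) tuples.
        return resp.status, resp.reason, resp.getheaders()
    finally:
        conn.close()


# Example (needs network access, so it is left as a comment):
#   status, reason, headers = fetch_headers("triplezero.com.au")
#   print(status, reason)
#   for name, value in headers:
#       print(f"{name}: {value}")
```

Passing use_tls=True covers the https case, much as the openssl alias does for
the interactive one.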
>>
>>
>>
>> Cheers,
>>
>>
>>
>> James
>>
>>
>>
>> On 9 February 2016 at 15:15, Ross Wheeler <ausnog at rossw.net> wrote:
>>
>>
>>
>> On Tue, 9 Feb 2016, Shane Chrisp wrote:
>>
>> Yep, that was just my fat fingers. I am trying to get to the
>> triplezero.com.au but no go.
>>
>>
>>
>> traceroute to www.triplezero.gov.au (115.178.104.72), 30 hops max, 60 byte
>>
>> ...
>>
>> 11  bundle-ether2.civ.core2.canberra.telstra.net (203.50.6.82)  71.834 ms  71.845 ms  70.450 ms
>> 12  Bundle-Ethernet1.civ-edge901.canberra.telstra.net (203.50.8.35)  69.126 ms  69.138 ms  69.042 ms
>> 13  telstr1248.lnk.telstra.net (165.228.21.206)  68.991 ms  68.981 ms  68.926 ms
>> 14  * * *
>> 15  * * *
>> 16  * * *
>>
>>
>>
>> Traceroute is only one tool in a toolbox, and frequently not as helpful
>> as you might hope.
>>
>>  5  bundle-ether2.chw-edge902.sydney.telstra.net (203.50.11.105)  2.002 ms
>>  6  bundle-ether2.dkn-core1.canberra.telstra.net (203.50.6.129)  8.918 ms
>>  7  bundle-ether2.civ.core2.canberra.telstra.net (203.50.6.82)  9.334 ms
>>  8  Bundle-Ethernet1.civ-edge901.canberra.telstra.net (203.50.8.35)  7.822 ms
>>  9  telstr1248.lnk.telstra.net (165.228.21.206)  9.189 ms
>> 10  *
>> 11  *
>>
>> Yes, it appears not to be reachable....
>>
>> However, using another tool, it clearly IS working....
>>
>> # telnet triplezero.gov.au 80
>> Trying 2403:d500::48...
>> telnet: connect to address 2403:d500::48: No route to host
>> Trying 115.178.104.72...
>> Connected to triplezero.gov.au.
>> Escape character is '^]'.
>> GET / http/1.0
>>
>> HTTP/1.1 403 Forbidden
>> Cache-Control: no-cache
>>
>>
>>
>> There's also a hint there....
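
The telnet session above tries each address the name resolves to in turn
(IPv6 first, then IPv4), which is more telling than a traceroute that dies at
a filtering hop. Here is a minimal Python sketch of the same per-address TCP
check; the function name tcp_check is hypothetical, and it only tests whether
the port accepts a connection, not what the service then says.

```python
import socket


def tcp_check(host, port, timeout=3.0):
    """Try a TCP connect to every resolved address; return [(address, result)].

    Raises socket.gaierror if the name does not resolve at all.
    """
    results = []
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results.append((sockaddr[0], "connected"))
        except OSError as exc:
            # Covers "No route to host", refusals, and timeouts alike.
            results.append((sockaddr[0], f"failed: {exc}"))
    return results


# Example (needs network access, so it is left as a comment):
#   for address, result in tcp_check("triplezero.gov.au", 80):
#       print(address, result)
```

A mixed result, with one address family failing and the other connecting, is
exactly the pattern in the telnet output.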
>>
>>
>> _______________________________________________
>> AusNOG mailing list
>> AusNOG at lists.ausnog.net
>> http://lists.ausnog.net/mailman/listinfo/ausnog
>>
>>
>> ------------------------------
>>
>> *Note:*
>>
>> This message is for the named person's use only.  It may contain
>> confidential, proprietary or legally privileged information.  No
>> confidentiality or privilege is waived or lost by any mistransmission.  If
>> you receive this message in error, please immediately delete it and all
>> copies of it from your system, destroy any hard copies of it and notify the
>> sender.  You must not, directly or indirectly, use, disclose, distribute,
>> print, or copy any part of this message if you are not the intended
>> recipient. *Workforce International Pty Ltd* and any of its subsidiaries
>> each reserve the right to monitor all e-mail communications through its
>> networks. Any views expressed in this message are those of the individual
>> sender, except where the message states otherwise and the sender is
>> authorised to state them to be the views of any such entity.
>>
>>
>> ------------------------------
>>
>>
>>
>>
>>
>> --
>>
>>
>>
>> Narelle
>> narellec at gmail.com
>>
>> [image: E-Banner] <http://www.mrv.com/blog>
>>
>>
>> MRV Communications is a global supplier of packet and optical solutions
>> that power the world’s largest networks. Our products combine innovative
>> hardware with intelligent software to make networks smarter, faster and
>> more efficient.
>>
>>
>>
>> The contents of this message, together with any attachments, are intended
>> only for the use of the person(s) to whom they are addressed and may
>> contain confidential and/or privileged information. If you are not the
>> intended recipient, immediately advise the sender, delete this message and
>> any attachments and note that any distribution, or copying of this message,
>> or any attachment, is prohibited.
>>
>


-- 

Damien Gardner Jnr
VK2TDG. Dip EE. GradIEAust
rendrag at rendrag.net -  http://www.rendrag.net/
--
We rode on the winds of the rising storm,
 We ran to the sounds of thunder.
We danced among the lightning bolts,
 and tore the world asunder