[AusNOG] Telstra mobile down "nationwide"

Tim Raphael raphael.timothy at gmail.com
Wed Feb 10 14:54:31 EST 2016


I would agree with this,

Various releases noted that 10-15% of customers were affected, as it was not
a complete network outage. This sounds a lot like the behaviour you would
expect from a "faulty node in the cluster" type scenario.
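
As a rough illustration of that reasoning, here is a minimal sketch. The pool
size of eight and the plain round-robin policy are assumptions picked for the
example, not details of Telstra's network. One dead node left in an evenly
loaded pool fails roughly 1/N of connection attempts:

```python
from __future__ import annotations

from itertools import cycle


def failed_fraction(pool_size: int, dead_nodes: set[int], attempts: int) -> float:
    """Fraction of attempts that land on a dead node under plain round-robin."""
    nodes = cycle(range(pool_size))
    failures = sum(1 for _ in range(attempts) if next(nodes) in dead_nodes)
    return failures / attempts


if __name__ == "__main__":
    # Hypothetical pool of 8 nodes, with node 3 faulty but still in rotation.
    print(f"{failed_fraction(8, {3}, 8000):.1%}")  # prints 12.5%
```

With eight nodes that works out to 12.5%, which happens to sit inside the
reported 10-15% range, but the real pool size and distribution policy are
unknown here.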

- Tim

On Wed, Feb 10, 2016 at 11:44 AM, Clay Quinn <cquinn at mrv.com> wrote:

> Apparently customers were still being routed through the faulty node,
> essentially the equivalent of having a dead server in a load-balanced
> cluster (which results in a percentage of connections not establishing).
> So it sounds like the faulty MME was brought back into service
> (accidentally).  I’m not an expert; this is just the explanation offered to
> me by a Telstra employee familiar with the situation…
>
>
>
> *From:* Shane Short [mailto:shane at short.id.au]
> *Sent:* Wednesday, 10 February 2016 2:37 PM
> *To:* Clay Quinn <cquinn at mrv.com>
> *Cc:* ausnog at lists.ausnog.net
> *Subject:* Re: [AusNOG] Telstra mobile down "nationwide"
>
>
>
> So from what I've been reading apparently an engineer restarted an MME
> without migrating the customers manually to another MME, which I assume
> then booted a heap of them off the network? Apparently the registration
> storm from that then overloaded the remaining MMEs, which then caused all
> of the things to break?
>
> I hear they're blaming the engineer because he didn't migrate customers
> over to another MME, but what if they had an actual failure in the MME? It
> sounds like the network isn't designed to handle the resulting
> authentication (over)load and completely falls over-- so as much as the
> engineer triggered the fault, I'm not sure it's entirely his fault that it
> then went to complete shit?
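
The cascade described above can be sketched as a toy model. Everything in it
is an assumption for illustration: the subscriber counts, the per-MME
"headroom" figure (how many re-registrations an MME can absorb at once), and
the even spreading of displaced subscribers across survivors. It is not how
any real MME pool is dimensioned.

```python
from __future__ import annotations


def simulate_storm(subscribers: list[int], headroom: list[int], restarted: int) -> set[int]:
    """Return the set of MME indexes that end up down.

    subscribers[i]: subscribers attached to MME i (hypothetical figures).
    headroom[i]:    re-registrations MME i can absorb at once (hypothetical).
    restarted:      index of the MME restarted without draining it first.
    """
    down = {restarted}
    while True:
        alive = [i for i in range(len(subscribers)) if i not in down]
        if not alive:
            return down  # total outage
        # Everyone attached to a failed MME tries to re-register elsewhere.
        displaced = sum(subscribers[i] for i in down)
        share = displaced / len(alive)
        overloaded = {i for i in alive if share > headroom[i]}
        if not overloaded:
            return down  # survivors absorb the storm
        down |= overloaded  # overloaded MMEs fall over and add to the storm


if __name__ == "__main__":
    print(simulate_storm([100] * 4, [50] * 4, restarted=0))
    print(simulate_storm([100] * 4, [50, 30, 60, 60], restarted=0))
```

In this model the outcome depends entirely on headroom, not on whether the
first MME was restarted by hand or failed on its own, which is the point
being made above.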
>
> (I'll admit I'm very green on the UTRAN/E-UTRA stuff, but it's something I
> have immense curiosity about, if someone can correct me, or point me in the
> right direction if I'm wrong here, I'd be much appreciative)
>
> -Shane
>
> Clay Quinn wrote:
>
> Can confirm it was indeed an MME failure…
>
>
>
> Cheers
>
> Clay
>
>
>
> *From:* AusNOG [mailto:ausnog-bounces at lists.ausnog.net] *On Behalf Of *Narelle
> *Sent:* Tuesday, 9 February 2016 6:23 PM
> *To:* Joe Saxton <Joe.Saxton at workforce.com.au>
> *Cc:* ausnog at lists.ausnog.net
> *Subject:* Re: [AusNOG] Telstra mobile down "nationwide"
>
>
>
>
>
> Well, it is possible for a 4G HSS (HLR in the 3G world) to go down. That
> report however makes me think it might have been an MME... Given it was
> national, however, I'm thinking HSS. Though with millions of devices
> disconnecting and re-registering the traffic load cascades phenomenally and
> all sorts of fault behaviour will appear.
>
>
>
> Given the person in the interview didn't identify the "node", it still
> isn't clear exactly what went wrong at all.
> http://servicestatus.telstra.com/ doesn't really give enough clues at all.
>
>
>
> In the old 3G networks the RNC couldn't be set up in a redundant
> configuration, so if you didn't have enough, or one failed, you couldn't
> redirect traffic from all the connected base stations. Now you can with
> MMEs, and that would be consistent with this description.
>
>
>
> But - it sounds more like a call server issue (CSCF) if it is affecting
> some fixed networks. Also it is national. Call servers you deploy more
> centrally.
>
>
>
> Then again, the spokesperson also says "one of the nodes used to manage
> voice and data traffic between devices and the network started to
> malfunction" - so again I'm thinking MME...
>
>
>
>
>
> Narelle Clark
>
>
>
>
>
> PS - 000 is a mapping rather than something embedded in the system. You
> condition your network for that sort of local feature. Personally I'd
> always go for 112 on a mobile.
>
>
>
>
>
> On Tue, Feb 9, 2016 at 5:18 PM, Joe Saxton <Joe.Saxton at workforce.com.au>
> wrote:
>
> No doubt this would have been human error. With all the redundant
> systems in place, an outage this widespread and this long is more
> likely to be human error.
>
>
>
> *From:* AusNOG [mailto:ausnog-bounces at lists.ausnog.net] *On Behalf Of *Cameron
> Murray
> *Sent:* Tuesday, 9 February 2016 4:45 PM
> *To:* James Gray <james at gray.net.au>
> *Cc:* ausnog at lists.ausnog.net
> *Subject:* Re: [AusNOG] Telstra mobile down "nationwide"
>
>
>
>
> http://www.9news.com.au/technology/2016/02/09/13/11/reports-of-telstra-mobile-services-outage
>
>
>
> Keep backing up the bus...
>
>
>
> On Tue, Feb 9, 2016 at 2:54 PM, James Gray <james at gray.net.au> wrote:
>
> In addition to the traditional swiss-army knife known as "telnet", getting
> "curl" to dump just the HTTP headers is also a handy one to keep up your
> sleeve:
>
> 0:> curl -I http://triplezero.com.au
>
> HTTP/1.1 200 OK
> Connection: close
> Date: Tue, 09 Feb 2016 04:45:18 GMT
> Server: Microsoft-IIS/6.0
> X-Powered-By: ASP.NET
> Content-Type: text/html; charset=UTF-8
>
>
>
> Also, telnet won't work with SSL sites (i.e., https), but if you have openssl
> installed, you can do this instead:
>
> 0:> openssl s_client -quiet -connect www.google.com:443
>
>
>
> The openssl method also works on other SSL-enabled services like POP3S and
> IMAPS etc. I have it aliased in my shell config:
> alias stelnet="openssl s_client -quiet -connect"
>
>
>
> ...then all I need to do is: "stelnet host:port"
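
Where openssl is not installed, roughly the same check can be scripted. This
is a minimal sketch using only the Python standard library; the helper names
`split_hostport` and `tls_head` are made up for this example. Unlike
`openssl s_client`, Python's default context verifies the certificate and
will raise on a bad one rather than carry on.

```python
from __future__ import annotations

import socket
import ssl


def split_hostport(target: str) -> tuple[str, int]:
    """Split a 'host:port' string the way the stelnet alias is invoked."""
    host, _, port = target.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {target!r}")
    return host, int(port)


def tls_head(target: str, timeout: float = 5.0) -> str:
    """Open a TLS connection, send an HTTP HEAD request, return the headers."""
    host, port = split_hostport(target)
    context = ssl.create_default_context()  # verifies certificate and hostname
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls.sendall(request.encode("ascii"))
            response = b""
            while chunk := tls.recv(4096):
                response += chunk
    # Keep only the status line and headers.
    return response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")


# Usage (needs network access): print(tls_head("www.google.com:443"))
```

This only speaks HTTP; for POP3S or IMAPS the request line would need to be
replaced with that protocol's own commands.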
>
>
>
> Just good to have tucked away in case you need to break out the
> command-line hammer.
>
>
>
> Cheers,
>
>
>
> James
>
>
>
> On 9 February 2016 at 15:15, Ross Wheeler <ausnog at rossw.net> wrote:
>
>
>
> On Tue, 9 Feb 2016, Shane Chrisp wrote:
>
> Yep, that was just my fat fingers. I am trying to get to the
> triplezero.com.au but no go.
>
>
>
> traceroute to www.triplezero.gov.au (115.178.104.72), 30 hops max, 60 byte
>
> ...
>
> 11  bundle-ether2.civ.core2.canberra.telstra.net (203.50.6.82)  71.834 ms  71.845 ms  70.450 ms
> 12  Bundle-Ethernet1.civ-edge901.canberra.telstra.net (203.50.8.35)  69.126 ms  69.138 ms  69.042 ms
> 13  telstr1248.lnk.telstra.net (165.228.21.206)  68.991 ms  68.981 ms  68.926 ms
> 14  * * *
> 15  * * *
> 16  * * *
>
>
>
> Traceroute is only one tool in a toolbox, and frequently not as helpful as
> you might hope.
>
>  5  bundle-ether2.chw-edge902.sydney.telstra.net (203.50.11.105)  2.002 ms
>  6  bundle-ether2.dkn-core1.canberra.telstra.net (203.50.6.129)  8.918 ms
>  7  bundle-ether2.civ.core2.canberra.telstra.net (203.50.6.82)  9.334 ms
>  8  Bundle-Ethernet1.civ-edge901.canberra.telstra.net (203.50.8.35)  7.822 ms
>  9  telstr1248.lnk.telstra.net (165.228.21.206)  9.189 ms
> 10  *
> 11  *
>
> Yes, it appears not to be reachable....
>
> However, using another tool, it clearly IS working....
>
> # telnet triplezero.gov.au 80
> Trying 2403:d500::48...
> telnet: connect to address 2403:d500::48: No route to host
> Trying 115.178.104.72...
> Connected to triplezero.gov.au.
> Escape character is '^]'.
> GET / http/1.0
>
> HTTP/1.1 403 Forbidden
> Cache-Control: no-cache
>
>
>
> There's also a hint there....
>
>
> _______________________________________________
> AusNOG mailing list
> AusNOG at lists.ausnog.net
> http://lists.ausnog.net/mailman/listinfo/ausnog
>
>
>
>
>
>
> ------------------------------
>
> *Note:*
>
> This message is for the named person's use only.  It may contain
> confidential, proprietary or legally privileged information.  No
> confidentiality or privilege is waived or lost by any mistransmission.  If
> you receive this message in error, please immediately delete it and all
> copies of it from your system, destroy any hard copies of it and notify the
> sender.  You must not, directly or indirectly, use, disclose, distribute,
> print, or copy any part of this message if you are not the intended
> recipient. *Workforce International Pty Ltd* and any of its subsidiaries
> each reserve the right to monitor all e-mail communications through its
> networks. Any views expressed in this message are those of the individual
> sender, except where the message states otherwise and the sender is
> authorised to state them to be the views of any such entity.
>
>
> ------------------------------
>
>
>
>
>
>
>
> --
>
>
>
> Narelle
> narellec at gmail.com
>
>
>
> MRV Communications is a global supplier of packet and optical solutions
> that power the world’s largest networks. Our products combine innovative
> hardware with intelligent software to make networks smarter, faster and
> more efficient.
>
>
>
> The contents of this message, together with any attachments, are intended
> only for the use of the person(s) to whom they are addressed and may
> contain confidential and/or privileged information. If you are not the
> intended recipient, immediately advise the sender, delete this message and
> any attachments and note that any distribution, or copying of this message,
> or any attachment, is prohibited.
>
>
>
>
>