[AusNOG] International link issue

Thomas Sulkiewicz TSulkiewicz at Toshiba-TAP.com
Fri Feb 24 11:46:37 EST 2012


It’s difficult to litigate against gremlins. Well, at least, I’ve never seen one take the stand.
The same can’t be said for corporations... especially if they say sorry.


From: ausnog-bounces at lists.ausnog.net [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of Tom Storey
Sent: Friday, 24 February 2012 11:26 AM
To: Aaron Swayn
Cc: ausnog at ausnog.net
Subject: Re: [AusNOG] International link issue

Maybe I expect too much, but...

I wish the news would report more along the lines of what Aaron said, instead of saying that Telstra's international link "stopped processing requests." Maybe not quite as technically detailed, but at least an accurate account.

Regarding a *hardware* failure on a *Cisco* border router somehow causing the breakage... How does a *hardware* failure cause this (unless it just happens to be a perfect storm of configuration corruption), and what does dropping a vendor name achieve? Do vendors really sit back and let the reputation of their gear be tarnished by some epic-fail disaster PR?

Is it really necessary to dumb it down to the lowest conceivable level? To me it just seems that if journos report this kind of stuff, the general public then thinks that's what *actually* happened (some probably literally), and you haven't really informed anyone or made anyone any wiser or smarter.

I guess it really is just easier to blame it on some gremlins than to admit some people cocked up (big time) and there is actually someone to blame, and I do expect too much. :-)

On 23 February 2012 23:58, Aaron Swayn <aaron at swayn.com> wrote:
From what I understand, the BGP session between Telstra AUS (AS1221) and Reach, aka Telstra worldwide (AS4637), went down because…

Dodo advertised 390k prefixes to Telstra, which Telstra accepted.
Telstra then advertised the 390k prefixes to Reach.
Reach, correctly assuming that Telstra should never have this many routes, shut down BGP because ‘max-prefixes’ was breached.
This caused a lot of route flapping, and some ISPs with route dampening enabled dampened AS1221 prefixes to prevent CPU overload (Telstra normally advertises something like 800+ prefixes).

Telstra should have had a max-prefix in place on the Dodo peer to protect against this (the session should also be filtered correctly to protect Telstra’s own customer base, though to what level is debatable). As Dodo is not a Tier 1 carrier, I don’t think Telstra should be that relaxed in the peering configuration IMHO. Only 3 carriers are Tier 1 in Australia, and only they should be treated so loosely that all prefixes are allowed. It seems Reach doesn’t trust Telstra, though.
Reach did the right thing, as they are one of the few true global Tier 1 peering providers. Hence, Reach never expects to see the whole internet come from Telstra, only domestic routes from the networks Telstra peers with.
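
For anyone following along, here is a rough Python sketch of the max-prefix behaviour described above. It is a toy model, not router code: the class name, the limit of 2000 and the placeholder prefix labels are all assumptions made for illustration. The thread only gives "about 800" as Telstra's normal count and 390k as the leaked count.

```python
from dataclasses import dataclass, field


@dataclass
class BgpSession:
    """Toy model of one BGP session with a maximum-prefix limit."""
    peer_name: str
    max_prefixes: int
    established: bool = True
    received: set = field(default_factory=set)

    def receive(self, prefixes):
        """Accept announcements; tear the session down if the limit is exceeded."""
        if not self.established:
            return
        self.received.update(prefixes)
        if len(self.received) > self.max_prefixes:
            # Limit breached: drop the session and discard everything learned on it.
            self.established = False
            self.received.clear()


def fake_prefixes(count):
    """Generate `count` distinct placeholder prefix labels (not real routes)."""
    return {f"prefix-{i}" for i in range(count)}


# Hypothetical limit, chosen only to sit comfortably above the ~800 normal prefixes.
reach_side = BgpSession(peer_name="AS1221", max_prefixes=2000)

reach_side.receive(fake_prefixes(800))       # a normal day
normal_state = reach_side.established

reach_side.receive(fake_prefixes(390_000))   # the leaked table gets re-advertised
leak_state = reach_side.established

print(normal_state, leak_state)
```

The point of the sketch is that the limit is a blunt instrument: once it trips, the receiving side loses the legitimate routes along with the leaked ones, which matches the session teardown and flapping described above.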

I’m sure the instructor-led training courses for CCNP and BGP will talk about this incident for the next 20 years as an example of what not to do. I seem to recall one comment: “You don’t want to become famous, so always check what you’re doing before you interface with the internet”.

From: ausnog-bounces at lists.ausnog.net [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of Will Tardy
Sent: Friday, 24 February 2012 10:30 AM
To: ausnog at ausnog.net
Subject: Re: [AusNOG] International link issue

Telstra claims they had an international link down:

http://www.zdnet.com.au/telstra-hit-by-nationwide-data-outage-339332310.htm

If that happened at the same time as DODO incorrectly sending Telstra the full BGP table, could that explain why Telstra black-holed all routes and pumped all of its own traffic via DODO?
On 24 February 2012 10:02, Wade Millican <Wade.Millican at echoent.com.au> wrote:
Hi All,

What I'm yet to understand about this outage is why DODO's AS_PATH was seen as shorter than anything Telstra already had.

An earlier posted look at routes (below), thanks Gavin, shows all routes from Telstra taking hops to DODO, then Optus or PIPE before moving to the destination. Surely Telstra would have had better routes than pushing all traffic 2 hops out of its way.

AS_PATH does not explain how Telstra accepted these as the active routes. Even if all routes were accepted, Telstra still has better routes.

Can anyone explain what BGP Metric was modified/used that pushed traffic over longer AS_PATHs?






*> 1.22.161.0/24    165.228.157.73         100     80      0 1221 38285 7474 7473 55410 45528 i
*> 1.22.162.0/24    165.228.157.73         100     80      0 1221 38285 7474 7473 55410 45528 i
*> 1.22.163.0/24    165.228.157.73         100     80      0 1221 38285 7474 7473 55410 45528 i
*> 1.22.167.0/24    165.228.157.73         100     80      0 1221 38285 7474 7473 6453 4755 45528 i
*> 1.22.168.0/24    165.228.157.73         100     80      0 1221 38285 7474 7473 6453 4755 45528 i
..
*  14.201.64.0/24   165.228.157.73         100     80      0 1221 38285 18398 7545 7545 i
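
One possible answer to the question above, not confirmed anywhere in this thread: BGP best-path selection compares LOCAL_PREF before it looks at AS_PATH length, and many networks give customer-learned routes a higher LOCAL_PREF than routes learned from peers or upstreams. If Telstra did that, a leaked route from a customer would win even with a longer path. The Python sketch below shows only those first two tie-breakers. The LOCAL_PREF values and the upstream AS_PATH are made up, and the customer path is the table entry above with the leading 1221 removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    learned_from: str
    local_pref: int
    as_path: tuple


def best_path(candidates):
    """Pick the preferred route using only the first two tie-breakers:
    highest LOCAL_PREF, then shortest AS_PATH. Later steps are omitted."""
    return min(candidates, key=lambda r: (-r.local_pref, len(r.as_path)))


# Hypothetical values: both local_pref numbers and the upstream path are invented.
via_customer = Route("1.22.161.0/24", "customer", 200,
                     (38285, 7474, 7473, 55410, 45528))
via_upstream = Route("1.22.161.0/24", "upstream", 100,
                     (4637, 55410, 45528))

chosen = best_path([via_customer, via_upstream])
print(chosen.learned_from, len(chosen.as_path))
```

Under these assumed values the customer route is selected despite the longer AS_PATH, so a shorter path never gets a chance to matter.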

Thanks,

Wade
--
Wade Millican
Technical Consultant Team Lead
Hemisphere Infrastructure Support
Information Technology
Echo Entertainment Group Limited

2 Edward St
Pyrmont NSW 2009

T: +61 2 9657 7460
M: +61 (0) 400 192 485
wade.millican at echoent.com.au
www.echoentertainment.com.au
From: "Ramsay, Paul" <pramsay at uecomm.com.au>
Date: Wed, 22 Feb 2012 22:20:41 -0800
To: "ausnog at ausnog.net" <ausnog at ausnog.net>
Subject: Re: [AusNOG] International link issue

Yes, this reinforces the Rule of Trust: don’t trust your BGP peers, and ensure your filters are in place, configured correctly and working. You can’t transfer blame.
It can cost you big $$ and pain if you inadvertently turn yourself into a transit peer, because your upstreams may prefer to send traffic where they can make $$.

From: ausnog-bounces at lists.ausnog.net [mailto:ausnog-bounces at lists.ausnog.net] On Behalf Of Sean K. Finn
Sent: Thursday, 23 February 2012 5:09 PM
To: 'ausnog at ausnog.net'
Subject: Re: [AusNOG] International link issue

It’s easy to describe for all the media types watching..
(And I’m not sure why it’s not being put out there in layman’s terms).

From the routes seen at various points, and reported on the WAIX mailing list earlier..



Dodo told Telstra that Dodo was the rest of the Internet.

Telstra Believed Dodo.

Telstra’s entire system tried to use Dodo as its ISP instead of everyone else Telstra is connected to.

Needless to say, this didn’t work; the pipes got jammed.

Telstra should have filtered the announcement from Dodo, but didn’t.

Filtering is in place as a form of control (which is used instead of trust).

Filtering obviously wasn’t in place, or didn’t work, so anything that Dodo told Telstra about where to find the Internet, Telstra believed.
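
To make "filtering as a form of control" a bit more concrete, here is a minimal Python sketch of an inbound prefix filter. The allowed list uses documentation prefixes as stand-ins and is entirely hypothetical. Nothing here reflects Telstra's or Dodo's real configuration, and real routers express this with prefix lists or route policies rather than Python.

```python
import ipaddress

# Hypothetical list of prefixes the customer is registered to announce.
ALLOWED = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def accept(announcement, allowed=ALLOWED):
    """Return True only if the announced prefix is covered by an allowed prefix."""
    net = ipaddress.ip_network(announcement)
    return any(net.subnet_of(a) for a in allowed)


# One legitimate announcement, one leaked third-party route, one default route.
announced = ["203.0.113.0/24", "1.22.161.0/24", "0.0.0.0/0"]
accepted = [p for p in announced if accept(p)]
print(accepted)
```

With a filter like this on the customer session, the leaked routes are dropped at the edge and never reach best-path selection, regardless of what the customer claims to be able to reach.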

This happens quite often; I’ve heard of it happening on peering exchanges within Australia, too. Just never at an organization as big as Telstra.

Over and Out.




_______________________________________________
AusNOG mailing list
AusNOG at lists.ausnog.net
http://lists.ausnog.net/mailman/listinfo/ausnog



