[AusNOG] Office Link Needed (Fibre or alike) Sydney

Bevan Slattery bevan at slattery.net.au
Fri Oct 23 05:56:07 EST 2015


#gold

[b]

> On 23 Oct 2015, at 3:31 AM, Christopher Pollock <cpollock at twitch.tv> wrote:
> 
> I wrote this down so as to not forget it, about my visit to 30 Ross.  I call it "Reason For Outage: Wolverine"
> 
> There are a lot of public datacentres that people use in Australia. Most of them have an affiliation or ownership in some way with a major telco, because what better way to connect with customers than to make it easy and available?
> It was a normal day, started like any other. Walked through the office, grateful that there was no water anywhere. Checked the airconditioning temperatures, and nothing was blowing up. It looked like the start of a good day.
> The NSW IX was in Sydney, nearly 1000 km from me, but the state itself was unmanned. Nothing /really/ ever went wrong there so we made do. Until today.
> I noticed an alert for a customer being offline in the public colo DC. Unusual, but not unheard of. Customers reboot routers all the time. When it didn’t come back, I decided they had disconnected the session for a reason, and sent them an email ‘hey do you know your peering is offline?’. They wrote back saying they thought it was us, so I checked the switch they were connected to, and sure enough the port was down. Hmm. Okay, probably a cabling fault or dislodged cable. I’d schedule in a time to go down and check it out, when I had more work lined up. Then another one dropped off. And a third. By the time we were five peers down we were in full panic station mode. Was our switch dying? God I hope not.
> 15 minutes later I was on a train to the airport carrying only what I had on me, hopped on a plane, and two hours later I was in Sydney. By the time I got there, another 10 peers had dropped offline and the phones were running hot. I turned mine off so as to stop getting unhelpful calls and diverted it all back to reception.
> Now, to explain a little about how public datacentres often work, generally the colo provider would charge you an exorbitant amount to install cabling between racks or to run patch leads, in the thousands. However, anyone with a carrier license & cabling license and the right tools could run up their own in 15 minutes. This happened many times. Thousands of times. I would not be exaggerating to say that there were at least 5,000 unregulated, unregistered cables in that datacentre floor.
> When I finally rocked up to the DC, 15 of our 20 or so peers were offline. I ran over to our rack and checked the switch. It was fine, no errors. I ran TDR testing on the ports to check for cable lengths, connectivity, shorts, any kind of Layer 1 or 2 problem. All the cables registered as an open pair, meaning they were not connected at the other end. This was thoroughly confusing. So I checked the actual lengths on these TDR traces and they were actually showing as only 15m away. What the hell? Most of these cable runs were 50 - 80m - why did they stop at 15m?
> I walked out to about 15m and walked a circumference around the rack. When I rounded the corner, the blood drained from my face (as it so often does in these situations). I knew exactly what had happened.
> A new tech for the colo provider was not aware of a little thing called the Telecommunications Act which allows you to run these kinds of cables. So he’d gone through all the locally-paid patches, which were done in a specific colour, and figured out that anything not bright yellow must have been ‘illegal’. He had four floor tiles removed, and was standing over the cable pits, dual-wielding side-cutters, one in each hand. Cutting anything the wrong colour, like a boxer pounding away with left-right combos over and over. Slashing away at our infrastructure like Wolverine berserker style. There was a pile of cables next to him that, I shit you not, was the size of a small car.
> Yelling and sprinting over, I demanded that he stop what he was doing. I was about to say ‘..and put them back the way they were’ when I realised he must have been at it for 6+ hours and reconnecting them all was going to be impossible. He’d destroyed the infrastructure for god knows how many businesses. Now, I’m pretty calm most of the time, even in the face of danger, but this .. this made me lose my shit.
> Me: WHAT ARE YOU DOING STOP
> DC Tech: I’m removing the inactive and unauthorised patches. I have an order from management to do it.
> Me: ARE YOU A F**KING IDIOT? DO YOU REALISE THAT THESE ARE ACTIVE TELECOMMUNICATIONS SERVICES AND THAT INTERFERING WITH OR DISCONNECTING THEM IS A FEDERAL OFFENSE UNDER THE TELECOMMUNICATIONS ACT 1997!? YOU CAN GO TO FUCKING JAIL FOR 10 YEARS FOR ONE AND YOU’VE DONE LIKE TWO HUNDRED NOW STOP BEFORE YOU RUIN ANYONE ELSE’S BUSINESS
> Finally, it was someone else’s face going pale. He agreed to stop, and I ran to the nearest supply store, bought a few boxes of cables and supplies and set to furiously running new cables to all our customers. He helped me re-run cabling for all of our customers, and within maybe two hours they were all back online.
> We sent the colo provider an invoice for the expenses incurred during troubleshooting / rectification and they grudgingly agreed to pay for it.
> I still can’t get the image of that giant ball of cables out of my head. It was a horrible hybrid of a giant aborted fetus and an ugly medusa, thousands of RJ45 heads pointing in all directions.
> Heading back to the airport, I sat with my head in my hands, regretting a lost day’s work, and trying to figure out how I would word this Post Incident Report.
> Reason for outage: AAPT is the worst.
> Eventually I handed the PIR job over to someone else, as I’d long since lost the ability to be civil about it. There was only one thing left to do.
> Go to the pub, and cleanse the day with purifying beer.
> 
> 
> --
> Christopher Pollock  |   Twitch.tv   |   Network Development Engineer   |   415-361-3042   |   Skype–christopherpollock   |   Twitter–chhopsky
> 
> 
>> On Thu, Oct 22, 2015 at 1:28 AM, Purdon, Bob <bobp at purdon.id.au> wrote:
>>> On 22 October 2015 at 13:54, Bevan Slattery <bevan at slattery.net.au> wrote:
>>> OMG.  Only to be topped by 530 Collins.  I remember PIPE taking over the Ausbone rack there.  I remember Steve or Bob saying they couldn't get the floor tiles to sit on the raised structure meaning the tiles were effectively "floating" on the cables :). The cabling in that rack was possibly the worst in living memory.
>> 
>> I know I said that once, and Steve may well have also.  IIRC, the raised floor was 300mm above the slab and the cables were pressed up against the underside of the floor tiles.  I've probably got a photo somewhere.  Had to stand and/or gently jump on one particular tile to get it to sit down on the stringers again.
>> 
>> In a move that obviously went against the convention at that site, I did in fact remove quite a few cables as part of cleaning that rack up, which in turn helped that tile fit just a little bit better.
>> 
>> I also recall running a cable or two in there before I was at PIPE and despite there being trays (which I actually used), the general cable routing convention seemed to be just go point A to point B, and if that means going diagonally across the room then that's what happened.
>> 
>> Wonder how many tonnes of cable there were under that floor? :-)
>> 
> 