[AusNOG] Solar flare

Scott Howard scott at doc.net.au
Thu Mar 8 17:03:28 EST 2012


On Wed, Mar 7, 2012 at 9:47 PM, Peter Childs <PChilds at internode.com.au> wrote:

> <quote>
> > As with all computer and networking devices, the AS5400 is susceptible to
> > the rare occurrence of parity errors in processor memory.  Parity errors
> > may
>

This excuse may seem far-fetched, and I used to get a lot of disbelieving
looks when I gave it back when I was an engineer at Sun - but the simple
fact is that it IS true.

And by true, I mean that cosmic rays can and do cause bits to flip,
especially in the type of memory (SRAM) used in CPU caches.  What may or may
not be true is whether any specific occurrence was due to cosmic rays -
because obviously there's no way to prove that one way or the other!

Sun learnt this the hard way when they released their "UltraSPARC"
processors many years ago.  During the design phase they chose to take the
cheaper/faster path of using parity for the on-chip cache, which means that
errors could be detected, but not corrected.  This approach had worked
perfectly in previous models, but in the UltraSPARCs the faster, larger and
higher-density cache became very susceptible to bit flips as a result of a
number of factors - with cosmic rays being a suspected cause of many such
errors.
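
To make the "detected, but not corrected" point concrete, here is a minimal
sketch in Python.  It assumes a single even-parity bit protecting one 8-bit
word; the word value and the function names are made up for illustration and
are not taken from any real cache design.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of set bits."""
    return bin(word).count("1") % 2


def flip_bit(word: int, position: int) -> int:
    """Simulate a single-event upset by inverting one bit."""
    return word ^ (1 << position)


# Hypothetical 8-bit value sitting in a parity-protected cache line.
stored_word = 0b10110010
stored_parity = parity_bit(stored_word)

# One bit flips: the recomputed parity no longer matches, so we notice.
corrupted = flip_bit(stored_word, 5)
error_detected = parity_bit(corrupted) != stored_parity

# But every possible single-bit flip produces the same symptom, so parity
# cannot say WHICH bit is wrong, and therefore cannot repair it.
symptoms = {
    parity_bit(flip_bit(stored_word, pos)) != stored_parity
    for pos in range(8)
}

# Two flips in the same word cancel out and go completely unnoticed.
double_flip = flip_bit(flip_bit(stored_word, 1), 6)
double_detected = parity_bit(double_flip) != stored_parity

print(error_detected, symptoms, double_detected)
```

All the hardware can do with a parity mismatch is raise an error, which is
why a flipped bit in a parity-only cache tends to end in a panic or a reset.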

At the end of the day, the fault in these cases is in the vendor's choice to
use parity memory rather than ECC memory.  The "cosmic rays" defense is
really just them admitting that their hardware can't handle normal
environmental circumstances.  (Sun moved to mirrored caches and/or ECC to
avoid such issues!)
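
For contrast, here is a rough sketch of how an error-correcting code recovers
from the same kind of flip.  It uses the textbook Hamming(7,4) code purely as
an illustration; real ECC memory uses wider codes, and nothing here is meant
to describe what Sun actually implemented.  The function names are made up.

```python
from typing import List, Tuple


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword: List[int]) -> Tuple[List[int], int]:
    """Return (data bits, syndrome); a single flipped bit is repaired."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the bad bit, or 0 if clean.
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


original = [1, 0, 1, 1]
codeword = hamming74_encode(original)

# Flip each of the 7 stored bits in turn; the data always comes back intact.
for position in range(7):
    damaged = list(codeword)
    damaged[position] ^= 1
    recovered, syndrome = hamming74_decode(damaged)
    print(position + 1, syndrome, recovered == original)
```

The extra check bits are what let the decoder locate the bad bit rather than
merely notice it, which is the difference between logging a corrected error
and crashing.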

Even PC manufacturers learnt the error of their ways with parity memory
many, many years ago.  Of course they took the opposite approach and just
removed the parity.  You can't get parity errors if you don't have
parity - and you can always just blame the resulting crash on Microsoft!!

So if your computer crashes in the next day or two, blame the manufacturer,
not Microsoft...  (If you're using a Mac.. well..)

  Scott