[AusNOG] NIC Packets of death
Mark Doorey
SBS.User at netmark.net.au
Mon Feb 11 10:51:22 EST 2013
Here is some information that might help.
Testing http://www.kriskinc.com/intel-pod
Intel Packet of Death Testing:
As described in my blog post here
<http://blog.krisk.org/2013/02/packets-of-death.html> I experienced an
issue with certain Intel ethernet controllers. Here's how to see if your
controllers are affected.
For this simplified test you'll need two machines (one to replay the
packet and one to receive it) and you'll need to be on the same ethernet
segment. No routers or VLAN-aware switches should be in the mix (but
dumb switches/hubs should be fine).
1. On the replay machine install tcpreplay <http://tcpreplay.synfin.net/>.
2. Connect the receiving machine to the network and bring the interface
up (IP address doesn't matter).
3. Replay one (or all) of the packets attached to this post from the
replay machine:
    sudo tcpreplay -v -i [transmitting interface] [pcap name]
Example:
    sudo tcpreplay -v -i eth1 pod-icmp-ping.pcap
If your controllers are affected the ethernet interface will lose link.
In many circumstances the only way to get the controller to work again
is to physically power off the machine and power it back on.
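To watch for the link drop on the receiving machine, something like the following might help. It is only a sketch: it assumes a Linux receiver that exposes carrier state under /sys/class/net, and link_state is a made-up helper name, not part of any tool mentioned in this post.

```shell
#!/bin/sh
# link_state IFACE [SYSFS_ROOT]
# Hypothetical helper: prints "up" if the kernel reports carrier on IFACE,
# "down" otherwise (including when the interface does not exist or is
# administratively down). SYSFS_ROOT defaults to /sys/class/net and is
# only a parameter so the function can be tried against a fake tree.
link_state() {
    iface=$1
    root=${2:-/sys/class/net}
    carrier=$(cat "$root/$iface/carrier" 2>/dev/null)
    if [ "$carrier" = "1" ]; then
        echo up
    else
        echo down
    fi
}

# Example use on the receiving machine (eth0 is a placeholder):
#   while sleep 1; do echo "$(date +%T) $(link_state eth0)"; done
```

If the state flips from "up" to "down" right after the replay and stays there, that matches the symptom described above.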
NOTE: These packets will be sent to the ethernet broadcast address (to
simplify testing). If you are affected by this issue it will take down
all of the ethernet interfaces on the connected network. If that is of
concern you should use tcpreplay-edit to set a specific destination
ethernet address:
    sudo tcpreplay-edit --enet-dmac=00:11:22:33:44:55 -v -i eth1 pod-icmp-ping.pcap
Where "00:11:22:33:44:55" is the MAC address of the machine you'd like
to test.
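Since a typo in the destination address is an easy way to hit the wrong machine, a small guard around the targeted replay may be worth having. The following is a sketch under assumptions: is_mac and targeted_replay_cmd are hypothetical helper names, and the function only prints the tcpreplay-edit command from the example above instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# is_mac ADDR
# Succeeds only if ADDR looks like six colon-separated hex octets.
is_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# targeted_replay_cmd MAC IFACE PCAP
# Hypothetical helper: refuses malformed or broadcast destinations,
# otherwise prints the tcpreplay-edit command line (a dry run).
targeted_replay_cmd() {
    mac=$1
    iface=$2
    pcap=$3
    if ! is_mac "$mac"; then
        echo "not a MAC address: $mac" >&2
        return 1
    fi
    # Compare case-insensitively against the ethernet broadcast address.
    lower=$(printf '%s' "$mac" | tr 'A-F' 'a-f')
    if [ "$lower" = "ff:ff:ff:ff:ff:ff" ]; then
        echo "refusing broadcast destination" >&2
        return 1
    fi
    printf 'sudo tcpreplay-edit --enet-dmac=%s -v -i %s %s\n' \
        "$mac" "$iface" "$pcap"
}

# Example:
#   targeted_replay_cmd 00:11:22:33:44:55 eth1 pod-icmp-ping.pcap
```

Printing the command rather than executing it is a deliberate choice here, given that a mistake can take down every affected interface on the segment.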
Fixing:
As news of this issue spreads further, it appears some controllers are
affected and some aren't. That's more or less what I expected. Here's
what I know about fixing this.
It has been my understanding that Intel provides at least two EEPROM
versions for this chip: one with BMC enabled and one without. My
controllers do not have BMC enabled, therefore my fix only applies to
non-BMC-enabled controllers. This is unfortunate because the BMC-enabled
controllers seem to be much more widely used. Even with that, other than
the very basics (MAC address and checksum) I don't know the meaning of
these values. Another reason not to reprogram the EEPROM on your NIC
based on what some guy on the internet told you.
With that being said, here is a diff between an affected EEPROM and a good
EEPROM:
Offset   Values
-0x0010: ff ff ff ff 6b 02 00 00 86 80 d3 10 ff ff 5a c0
+0x0010: 01 01 ff ff 6b 02 d3 10 d9 15 d3 10 ff ff 58 85
-0x0030: c9 6c 50 31 3e 07 0b 46 84 2d 40 01 00 f0 06 07
+0x0030: c9 6c 50 21 3e 07 0b 46 84 2d 40 01 00 f0 06 07
-0x0060: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
+0x0060: 20 01 00 40 16 13 ff ff ff ff ff ff ff ff ff ff
Where the "-" lines were the bad EEPROM and the "+" lines were the good
EEPROM.
Under Linux you can view these values with ethtool:
    # ethtool -e [interface]
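To compare your own dump against the diff above without reading the whole EEPROM by eye, you could filter out just the three rows in question. This is a sketch: eeprom_rows is a hypothetical helper, and it assumes the dump has the offset (for example "0x0010:") as the first field on each line, as in the diff above. It only prints the rows; it does not decide whether a controller is affected.

```shell
#!/bin/sh
# eeprom_rows
# Hypothetical helper: reads an EEPROM dump on stdin and prints only the
# rows at offsets 0x0010, 0x0030 and 0x0060, with runs of tabs/spaces
# collapsed to single spaces so they line up with the diff in this post.
eeprom_rows() {
    awk '
        $1 == "0x0010:" || $1 == "0x0030:" || $1 == "0x0060:" {
            $1 = $1    # reassigning a field makes awk rebuild the line with single spaces
            print
        }
    '
}

# Example (as root; eth0 is a placeholder):
#   ethtool -e eth0 | eeprom_rows
```

Keep in mind the caveat above: the diff comes from non-BMC controllers, so rows that differ from the "good" lines are a hint to look closer, not a verdict.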
And here is Intel's official statement:
Recently there were a few stories published, based on a blog post by an
end-user, suggesting specific network packets may cause the Intel®
82574L Gigabit Ethernet Controller to become unresponsive until
corrected by a full platform power cycle.
Intel was made aware of this issue in September 2012 by the blog's
author. Intel worked with the author as well as the original motherboard
manufacturer to investigate and determine root cause. Intel root caused
the issue to the specific vendor's mother board design where an
incorrect EEPROM image was programmed during manufacturing. We
communicated the findings and recommended corrections to the motherboard
manufacturer.
*It is Intel's belief that this is an implementation issue isolated to a
specific manufacturer, not a design problem with the Intel 82574L
Gigabit Ethernet controller.* Intel has not observed this issue with any
implementations which follow Intel's published design guidelines. Intel
recommends contacting your motherboard manufacturer if you have
continued concerns or questions whether your products are impacted.
Mark
On 10/02/2013 2:42 PM, Daniel O'Connor wrote:
> On 09/02/2013, at 20:56, Edwin Groothuis <edwin at mavetju.org> wrote:
>
>> On 7/02/13 10:37 , Heinz N wrote:
>>> Seems that certain packets can completely bring down certain Intel chipset network controllers.
>>>
>>> http://blog.krisk.org/2013/02/packets-of-death.html
>>>
>> http://communities.intel.com/community/wired/blog/2013/02/07/intel-82574l-gigabit-ethernet-controller-statement
>>
>> Intel blames it on a faulty EEPROM, but they don't say which mother board manufacturer.
>
> They don't provide any tools or instructions for testing the problem either.
>
> Hopeless :-/
>
> --
> Daniel O'Connor software and network engineer
> for Genesis Software - http://www.gsoft.com.au
> "The nice thing about standards is that there
> are so many of them to choose from."
> -- Andrew Tanenbaum
> GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
>
> _______________________________________________
> AusNOG mailing list
> AusNOG at lists.ausnog.net
> http://lists.ausnog.net/mailman/listinfo/ausnog
>