[AusNOG] Buffers (was Re: Switching Recommendations)

Beeson, Ayden ABeeson at csu.edu.au
Mon Jul 15 11:12:27 EST 2013


LOL sorry Paul the guys are working on the phones now :-)

Paul Gear <ausnog at libertysys.com.au> wrote:



On 07/15/2013 10:03 AM, Lincoln Dale wrote:
On Mon, Jul 15, 2013 at 9:29 AM, Paul Gear <ausnog at libertysys.com.au> wrote:
...
On a recent Packet Pushers show where Arista were talking about their new switches, they pointed out that their buffers seemed overly large, but at the high bandwidths they were serving, this was only 250 ms or thereabouts (my memory is a bit hazy, but i think it was about 512 MB per 10 Gbps port).

The podcast you're talking about was about switches that have ~125MB / 10G port, and 4x/10x that for 40G/100G ports. In all cases, if you actually had something consuming all those buffers, it's ~12msec.
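
As a rough check on figures like these, the time a full buffer takes to drain at line rate is the buffer size in bits divided by the port speed. The sketch below is only illustrative: the function name `drain_time_ms` is made up, and it assumes decimal megabytes and a single port draining at full line rate with nothing else contending for the buffer.

```python
MB = 1_000_000          # decimal megabytes (assumption)
GBPS = 1_000_000_000    # bits per second


def drain_time_ms(buffer_bytes: float, port_bps: float) -> float:
    """Milliseconds needed to drain a completely full buffer at line rate."""
    return buffer_bytes * 8 * 1000 / port_bps


if __name__ == "__main__":
    # Buffer sizes mentioned in the thread, both on a 10 Gbps port.
    for label, size in [("512 MB", 512 * MB), ("125 MB", 125 * MB)]:
        print(f"{label} at 10 Gbps: {drain_time_ms(size, 10 * GBPS):.1f} ms")
```

Taken literally as megabytes, 125 MB at 10 Gbps works out to about 100 ms, and 512 MB to about 410 ms. A figure near 12 ms corresponds to roughly 125 megabits (about 15.6 MB) per port, so the units behind the quoted numbers are worth checking against the vendor data sheet.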

The reality is that it's not quite that simple: the switches in question are VoQ-based, so the buffer that physically sits on ingress represents queueing for the output port and is in fact distributed queueing.
Switch buffers never achieve 100% effective utilization either.  (Silicon stores packets in 'cells', and those cells are fixed-size, not variable-sized.)
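
To illustrate why fixed-size cells cost usable buffer, here is a small sketch that rounds each packet up to a whole number of cells. The 256-byte cell size is a hypothetical value chosen for the example (real silicon varies), and per-packet metadata overhead is ignored.

```python
CELL_BYTES = 256  # hypothetical cell size, not taken from any data sheet


def cells_needed(packet_bytes: int, cell_bytes: int = CELL_BYTES) -> int:
    """Number of fixed-size cells a packet occupies (rounded up)."""
    return (packet_bytes + cell_bytes - 1) // cell_bytes


def buffer_efficiency(packet_bytes: int, cell_bytes: int = CELL_BYTES) -> float:
    """Fraction of the occupied cell memory that holds real packet data."""
    return packet_bytes / (cells_needed(packet_bytes, cell_bytes) * cell_bytes)


if __name__ == "__main__":
    for size in (64, 257, 1500, 9000):
        print(f"{size:5d} B packet: {cells_needed(size):3d} cells, "
              f"{buffer_efficiency(size):.1%} efficient")
```

Under these assumptions a 64-byte packet uses only a quarter of its cell, and a packet one byte over the cell size wastes almost half of the memory it occupies, so a traffic mix heavy in small packets fills the buffer well before the nominal byte count is reached.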

Pardon my ignorance, but VoQ == Virtual Output Queuing?  Is the Wikipedia description more or less accurate? https://en.wikipedia.org/wiki/Virtual_Output_Queues

I could talk for days on this topic, having done a lot of analysis and simulation on the 'right' amount of buffer on switches, but suffice to say that what is 'right' depends on the place in the network, the # of simultaneous TCP flows going through the box, and the degree of incast/oversubscription.
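
As background on why the flow count matters, two estimates are commonly cited in the router-buffer-sizing literature: the classic rule of thumb of one bandwidth-delay product (link rate x RTT), and a later refinement for many desynchronized long-lived TCP flows that divides that figure by the square root of the number of flows. The sketch below is not the method used in the analysis described above; the function names, RTT and flow counts are made-up inputs, and neither estimate accounts for incast.

```python
import math


def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Classic rule of thumb: one bandwidth-delay product, in bytes."""
    return link_bps * rtt_s / 8


def many_flows_bytes(link_bps: float, rtt_s: float, flows: int) -> float:
    """Refinement for many desynchronized long-lived flows: BDP / sqrt(N)."""
    return bdp_bytes(link_bps, rtt_s) / math.sqrt(flows)


if __name__ == "__main__":
    link = 10e9   # 10 Gbps port
    rtt = 0.001   # 1 ms round trip (illustrative data-centre value)
    print(f"BDP: {bdp_bytes(link, rtt) / 1e6:.2f} MB")
    for n in (1, 100, 10_000):
        print(f"{n:6d} flows: {many_flows_bytes(link, rtt, n) / 1e3:.1f} kB")
```

The spread between the single-flow and many-flow results is one reason a single number rarely fits every place in the network.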

Feel free to wax lyrical.  Today's my study day and i'm trying to learn something while i wait for CSU to fix their phones. :-)

In the case of the company I work for, yes, we have done a lot of analysis on this, both from telemetry data of actual buffer queue depths in production environments and from testing various workflows of modern applications and traffic flows.
It's how we determined (for example) to use 2Gbit DDR3-2166 rather than 1Gbit DDR3-2166 parts when building said switch.

What sort of testing process/software did you use for those sorts of test flows?  I'm guessing the licensing costs and the time spent would vastly outweigh the budget for switching of a project like the one for which Joseph started this thread.

If you are interested in theory/simulation/practice on this, it's actually something I gave a talk on at CAIA (http://caia.swin.edu.au/) last year. More than happy to share the slides/content if there is no video recording of it.

It appears they don't keep much at all: http://caia.swin.edu.au/seminars/details/120223A.html  Share away! :-)


How does one determine the optimal buffer size (and hence switch selection) for a particular environment?  Is there a rule of thumb based on the bandwidth of the link and the maximum time one wants a packet to queue?  (And how would one even determine what this maximum might be?  I would think that it varies depending upon the application.)  I guess this paragraph's questions are mostly directed to Greg & Lincoln - in the cases you've mentioned, how did you know that small buffers were the problem?

Unfortunately, most Ethernet switches simply haven't provided the telemetry to indicate how close they are to running out of buffers until it's too late and you've fallen off the cliff and have large numbers of drops going on.
(It's actually worse than this: many switches don't even provide accurate drop counters when they are dropping packets.)
Historically many switches had 'dedicated' buffers/port and didn't have overflow/shared pools for dealing with

Even if accurate drop counters are available, how many people actually monitor those?
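
Where a switch does expose usable counters, watching them can be as simple as comparing two polls of the same port. This is a minimal sketch: the `Sample` fields are hypothetical names rather than any vendor's API, `tx_packets` is assumed to count only packets actually sent (drops excluded), and the modular subtraction assumes the counter wrapped at most once between polls.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    tx_packets: int  # packets sent on the port (assumed to exclude drops)
    tx_drops: int    # packets dropped on egress


def counter_delta(old: int, new: int, bits: int = 64) -> int:
    """Difference between two counter readings, tolerating one wrap."""
    return (new - old) % (1 << bits)


def drop_ratio(prev: Sample, curr: Sample, bits: int = 64) -> float:
    """Fraction of offered packets dropped between two polls."""
    sent = counter_delta(prev.tx_packets, curr.tx_packets, bits)
    dropped = counter_delta(prev.tx_drops, curr.tx_drops, bits)
    offered = sent + dropped
    return dropped / offered if offered else 0.0


if __name__ == "__main__":
    before = Sample(tx_packets=1_000, tx_drops=0)
    after = Sample(tx_packets=1_990, tx_drops=10)
    print(f"drop ratio: {drop_ratio(before, after):.2%}")
```

Alerting on a sustained non-zero ratio is cheap, though it only helps if the counters themselves are accurate, which the message above notes is not always the case.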

The thing about TCP is that it still works even if you have drops. It's just that for many environments "working" isn't good enough; they want "working well."  The DataSift blog I pointed to is a good example.

My fear with all of this is that for most environments working is good enough, and working well costs more than they are prepared to spend.

The short answer is that there is no single 'right' answer for what is appropriate.  It depends on the traffic load, distribution and what the underlying apps/hosts/servers are doing.  Which may change over time too.

Are there any quick wins available to the budget end of town?  Are there rules of thumb or quick estimates that can help determine if it's an issue?

Paul

