<div dir="ltr"><div>The maximum dropout a particular application can withstand without its session being torn down (which is implementation dependent) and the maximum dropout you can experience without _any_ applications being affected are different things. If a TCP session is closing and you pull the plug, the reset may be left wandering the network. If the network returns after TIME_WAIT has expired, there may be issues.<br><br></div>Paul Wilkins<br></div><div class="gmail_extra"><br><div class="gmail_quote">On 1 July 2015 at 17:38, Mark Smith <span dir="ltr"><<a href="mailto:markzzzsmith@gmail.com" target="_blank">markzzzsmith@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 1 July 2015 at 16:40, Paul Wilkins <<a href="mailto:paulwilkins369@gmail.com">paulwilkins369@gmail.com</a>> wrote:<br>
> Mark,<br>
> It's implementation specific (depends what options you pass to<br>
> setsockopt/sockstream).<br>
><br>
> There are problems resulting from having TIME_WAIT too long, with wandering<br>
> duplicates. Supposedly RFC 1337 says TIME_WAIT should be at least 2 minutes,<br>
> but on my Linux box, I just timed a dropped socket, and it timed out after<br>
> one minute.<br>
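As a rough illustration of the "options you pass to setsockopt" point, here is a minimal Python sketch.<br>
The helper name make_tolerant_socket and the 30000 ms value are hypothetical, chosen only for illustration.<br>
TCP_USER_TIMEOUT is Linux-specific, so the sketch only sets it where the platform exposes it.<br>

```python
import socket


def make_tolerant_socket(user_timeout_ms=30000):
    """Create a TCP socket with explicit dropout-related options (a sketch).

    The helper name and the default timeout are illustrative assumptions.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Enable keepalive probes so a dead peer on an idle connection is
    # eventually detected instead of the session hanging forever.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux only: limit how long transmitted data may remain unacknowledged
    # before the kernel gives up and closes the connection.
    if hasattr(socket, "TCP_USER_TIMEOUT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT,
                        user_timeout_ms)
    return sock
```

How long a given application actually survives a dropout still depends on its own timers on top of whatever the socket layer does.<br>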
><br>
<br>
</span>So I just checked Stevens Volume 1, which is where I read about this<br>
(back in 1998 or earlier IIRC). The timer that triggers retransmission<br>
is the Retransmission Timeout or RTO, which is derived from the measured<br>
round-trip time and updated for the TCP connection (as, for example, the<br>
topology of the network could change while the TCP connection is active).<br>
Once the RTO expires, the retransmission intervals I mentioned occur.<br>
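<br>
To make the "measured and updated" part concrete, here is a minimal sketch of how an RTO can be tracked from round-trip samples.<br>
It assumes the smoothing constants from RFC 6298 (alpha 1/8, beta 1/4, K 4, a 1 second floor) and ignores clock granularity.<br>
The function name rto_sequence and the 60 second ceiling are illustrative choices, not something taken from Stevens or from any particular stack.<br>

```python
def rto_sequence(rtt_samples, alpha=1 / 8, beta=1 / 4, k=4,
                 min_rto=1.0, max_rto=60.0):
    """Return the RTO in seconds after each RTT sample (RFC 6298 style sketch)."""
    srtt = None    # smoothed round-trip time
    rttvar = None  # round-trip time variation
    rtos = []
    for r in rtt_samples:
        if srtt is None:
            # First measurement seeds both estimators.
            srtt = r
            rttvar = r / 2
        else:
            # The variation is updated first, using the previous SRTT.
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - r)
            srtt = (1 - alpha) * srtt + alpha * r
        rto = srtt + k * rttvar
        # Clamp to the assumed floor and ceiling.
        rtos.append(min(max(rto, min_rto), max_rto))
    return rtos
```

For example, rto_sequence([2.0, 2.0]) gives [6.0, 5.0]: the estimate tightens as the samples stay consistent, and it would grow again if the path changed and the samples started to vary.<br>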
<br>
The TIME_WAIT timer you're describing is the one used after the TCP<br>
connection has closed, and it is there to ensure any TCP segments that<br>
belong to the closed TCP connection that might still be floating<br>
around the network expire.<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
<br>
> Paul Wilkins<br>
><br>
> On 1 July 2015 at 15:17, Mark Smith <<a href="mailto:markzzzsmith@gmail.com">markzzzsmith@gmail.com</a>> wrote:<br>
>><br>
>> On 1 July 2015 at 15:11, Mark Smith <<a href="mailto:markzzzsmith@gmail.com">markzzzsmith@gmail.com</a>> wrote:<br>
>> > On 1 July 2015 at 14:56, Mark Smith <<a href="mailto:markzzzsmith@gmail.com">markzzzsmith@gmail.com</a>> wrote:<br>
>> >> On 1 July 2015 at 12:33, Ross Wheeler <<a href="mailto:ausnog@rossw.net">ausnog@rossw.net</a>> wrote:<br>
>> >>><br>
>> >>><br>
>> >>> I had several links go down at 10:00 (give or take a few seconds) -<br>
>> >>> well,<br>
>> >>> not mine so much as my upstream - and it's been blamed on this issue.<br>
>> >>><br>
>> >><br>
>> >> So from a little bit of Human Computer Interaction (HCI) I studied<br>
>> >> many years ago, I remember that humans will wait for some sort of<br>
>> >> response for between 3 and 5 seconds. So if the period of your packet<br>
>> >> loss and the retransmission to recover from it is short enough, the<br>
>> >> humans affected may notice a slight delay, but they won't take any<br>
>> >> remedial actions themselves (i.e., they won't push the submit button<br>
>> >> again, and won't complain about it).<br>
>> >><br>
>> ><br>
>> > This can also be particularly useful to know when cutting a set of<br>
>> > links over from an old piece of equipment to a new one. If 3 to 5 seconds<br>
>> > is a bit tight to move the link, you can push people's response<br>
>> > expectations out in the outage notice (e.g., "between 7 and 8 am, we<br>
>> > will be conducting network maintenance. During this period, you may<br>
>> > encounter system delays of up to 5 to 10 seconds"). I think asking<br>
>> > people to wait any longer than 10 seconds means this is a service<br>
>> > impacting outage and should be scheduled out of normal operating<br>
>> > hours.<br>
>> ><br>
>> > Also make sure that anything/any protocols that may cause the new<br>
>> > equipment to take longer than 3 to 5 seconds to bring up the link are<br>
>> > temporarily or permanently switched off. Traditional STP would be a<br>
>> > prime example (make sure there isn't a loop in the network topology at<br>
>> > all, or at least during the cut-over window if you're going to switch<br>
>> > STP back on later). Bear in mind that your window from<br>
>> > "working-to-working" is the 5 to 10 seconds (or 3 to 5 normally), so<br>
>> > e.g., BGP sessions might come up within a few seconds, but if<br>
>> > downloading the full route table, resolving the routes and putting<br>
>> > them into the FIB is going to take more than 10 seconds, you'll have<br>
>> > to do a proper service impacting outage at an appropriate time.<br>
>> ><br>
>> > Finally, remember that UDP and DCCP don't do recovery from packet<br>
>> > loss, so if your apps are using them, they'll either have to be<br>
>> > tolerant of packet loss of up to 10 (or 3 to 5) seconds, do recovery<br>
>> > themselves, or should be rewritten to use TCP or SCTP.<br>
>> ><br>
>> > <snip><br>
>><br>
>> One last thing, you also need to know the characteristics of your<br>
>> reliable protocols and how persistently they attempt to recover from<br>
>> packet loss. If your reliable protocol gives up within the 3 to 5 or 5<br>
>> to 10 second window, your customers/users will suffer an outage. TCP,<br>
>> for example, doesn't give up easily. If I recall correctly, it will<br>
>> try for up to around 9 minutes, and tries at doubling intervals up<br>
>> until 64 seconds and then each 64 seconds i.e., attempts at 1, 2, 4,<br>
>> 8, 16, 32, 64, 64, 64, ... seconds.<br>
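<br>
The doubling schedule described above can be sketched in a few lines of Python.<br>
This treats the numbers as the waits between successive attempts and takes "around 9 minutes" as a 540 second budget; both are assumptions based on the recollection above.<br>
The function name backoff_schedule is hypothetical, and real stacks apply their own limits.<br>

```python
def backoff_schedule(initial=1, cap=64, give_up_after=540):
    """List the waits (in seconds) between retransmission attempts.

    The wait doubles after each attempt until it reaches `cap`, then stays
    there. Attempts stop once the next wait would exceed the total budget.
    """
    intervals = []
    elapsed = 0
    interval = initial
    while elapsed + interval <= give_up_after:
        intervals.append(interval)
        elapsed += interval
        interval = min(interval * 2, cap)
    return intervals


schedule = backoff_schedule()
print(schedule)       # waits between attempts
print(sum(schedule))  # total seconds spent before giving up
```

Under these assumptions the sender makes 13 attempts over 511 seconds, so a cut-over of 5 to 10 seconds costs a TCP session only the first three or four retransmissions.<br>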
>> _______________________________________________<br>
>> AusNOG mailing list<br>
>> <a href="mailto:AusNOG@lists.ausnog.net">AusNOG@lists.ausnog.net</a><br>
>> <a href="http://lists.ausnog.net/mailman/listinfo/ausnog" rel="noreferrer" target="_blank">http://lists.ausnog.net/mailman/listinfo/ausnog</a><br>
><br>
><br>
><br>
><br>
</div></div></blockquote></div><br></div>