[AusNOG] "stateless TCP" for DNS

Mattia Rossi mrossi at swin.edu.au
Mon Nov 15 11:49:58 EST 2010


Hi Terry,

thanks for your feedback. I substantially agree with your view of the 
issue. See further comments inline.

On 15/11/2010 03:12, Terry Manderson wrote:
[..]
>
> The 2009 OARC DITL analysis showed that, at the time, 65% of resolvers used a message size of 4096.
> At a meeting at IETF in Beijing last week someone suggested that they see 70% of queries, I don't have any data to support this but it seems plausible that there is a growth in EDNS support. Further, in reading the paper "Improving DNS performance ... in FreeBSD" I got stuck on the suggestion in the paper which says "Most DNS servers are configured to allow only a maximum UDP packet size of 512 bytes", I assume you mean RFC1035 section 2.3.4 "will not output a packet longer than 512 bytes long". However we do have EDNS (RFC2671) which is not mentioned in the paper. The current default for edns-udp-size in bind is 4096. And surely as DNSSEC is deployed, requiring new(er) DNS server releases, server response size capability will be on an incline.

One of the references in the paper points to a RIPE study which also 
shows an increase in EDNS-capable servers. In our tests, however, we 
set up a BIND 9 DNS server on FreeBSD and EDNS was not enabled by 
default, which limited UDP responses to a maximum of 512 bytes.

We also point out that DNS response sizes can be increased to 4096 
bytes, but you're right, we don't mention EDNS explicitly.
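
To make that concrete, here is a rough sketch using the dnspython 
library (not code from our paper; the query name is just a 
placeholder): the only on-the-wire difference between the two cases is 
the OPT pseudo-RR, which carries the advertised UDP buffer size.

# Rough sketch (dnspython): without an OPT record RFC 1035's 512-byte
# UDP limit applies; with EDNS0 the query advertises a larger buffer,
# comparable to BIND's edns-udp-size default of 4096.
import dns.message

# Classic query: no OPT record, so the 512-byte UDP ceiling applies.
plain = dns.message.make_query("example.com.", "DNSKEY")

# EDNS0 query advertising a 4096-byte UDP buffer and setting the DO bit.
edns = dns.message.make_query("example.com.", "DNSKEY",
                              use_edns=0, payload=4096, want_dnssec=True)

# Printing the messages shows the EDNS pseudosection (version and udp
# size) only for the second query.
print(plain)
print(edns)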

Additionally, we are concerned about DNS responses making it back to 
the client through the network, which is not covered by any of these 
studies: e.g. of the 70% of queries seen/answered, how many responses 
actually make it back to the client?

Or am I misunderstanding that, and the 70% refers to complete DNS 
lookups, including responses?
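
The kind of client-side measurement I have in mind is roughly the 
following sketch (dnspython again; the resolver address and query name 
are placeholders, and the three outcomes are of course a 
simplification):

# Rough sketch: send a large-answer query over UDP with EDNS0 and
# classify what the client actually experiences.
import dns.exception
import dns.flags
import dns.message
import dns.query

def classify_lookup(name, resolver, timeout=3.0):
    q = dns.message.make_query(name, "DNSKEY", use_edns=0, payload=4096,
                               want_dnssec=True)
    try:
        resp = dns.query.udp(q, resolver, timeout=timeout)
    except dns.exception.Timeout:
        # The response (or its fragments) never made it back to the client.
        return "udp-timeout"
    if resp.flags & dns.flags.TC:
        # Truncated: the standard fallback is to retry the query over TCP.
        try:
            dns.query.tcp(q, resolver, timeout=timeout)
            return "truncated-but-tcp-ok"
        except (dns.exception.Timeout, OSError):
            return "truncated-and-tcp-failed"
    return "udp-ok"

print(classify_lookup("example.com.", "192.0.2.1"))  # placeholder resolver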

>
> That obviously doesn't address the MTU fragmentation concerns that are highlighted in the paper, but how much should we protect entities with poor performing network equipment from real development that is intended to improve the security of the DNS naming system?

This is exactly the problem. I definitely agree that it _shouldn't_ be 
the DNS server's job to solve the problems of a poorly performing network.

What we fear, though, is that a major OS vendor might decide to "help" 
all the people in such a problematic network situation by changing 
their DNS clients to use TCP exclusively.

Now, if the 70% rate above refers to complete DNS lookups, including 
responses arriving at the client, then something like that might never 
happen.

>
> Don't get me wrong. I think it is kinda cute to create a TCP-lightweight proxy to attempt to bypass the on path MTU-isms. But is that enough of an incentive to mask the problem instead of fixing it? Maybe, maybe not.

I agree. As Grenville already said, we might be solving a non-problem. 
Actually, let's hope so :-)
>
> I would also suggest that it is an overstatement that your DNS server would melt. The default tcp-clients value in bind9 is 100 (simultaneous). I think simply this represents a client problem, as the nameserver configured out of the box will hardline TCP clients at 100 and care not, so probably not so much a server problem as I see it.

Interesting. Didn't know that.
>
>> The project is at http://caia.swin.edu.au/ngen/statelesstcp/,
>> including a tarball of patches to FreeBSD 9.
>
> In regard to the first conclusion in the paper.
>
> This really isn't a problem with DNS. It is a MTU problem in middleware that DNS triggers which has ramifications to DNS clients on the internet who might be stuck behind firewalls and other boxes that do the wrong thing. The real issue here is about service to a DNS client. Unless a DNS client uses TCP from the outset, the query will be done on UDP, some DNS dancing (retry/timeout etc) to get a TCP response. I would posit that by the time this has happened the resource concern on a DNS server is minor.

Heh, you're right. We should have worded that differently and replaced 
"current DNS system" with "current Internet". We agree that it's a 
middlebox problem, or, as you say, a problem of service to the client.
As said, what we fear is that bad service to the client forces changes 
at the client which could affect DNS servers: if 90% of the clients in 
the world start to use TCP only, how will that affect DNS servers?

Limiting the servers to 100 simultaneous TCP clients doesn't seem like 
a nice move to me...
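
Just to make concrete what "TCP only" costs per lookup, here is a rough 
sketch over a plain socket (resolver address and name are placeholders): 
a full three-way handshake, the two-byte length prefix from RFC 1035 in 
each direction, and one of the server's TCP client slots held until the 
connection is torn down.

# Sketch of a TCP-from-the-outset lookup: no UDP attempt at all.
import socket
import struct
import dns.message

def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-response")
        buf += chunk
    return buf

def tcp_only_lookup(name, resolver, timeout=3.0):
    wire = dns.message.make_query(name, "A").to_wire()
    with socket.create_connection((resolver, 53), timeout=timeout) as s:
        s.sendall(struct.pack("!H", len(wire)) + wire)  # 2-byte length prefix
        length = struct.unpack("!H", recv_exact(s, 2))[0]
        return dns.message.from_wire(recv_exact(s, length))

print(tcp_only_lookup("example.com.", "192.0.2.1"))  # placeholder resolver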

But again... I might have misunderstood the results of the IETF talk in 
the first part of your message, and that's sort of crucial :-)

Otherwise we really would need to do some more research into that, as 
Roland suggested.

>
> There is also a DNS specific discussion on this in rfc3226

Cool, that RFC might solve the problem straight away - if middlebox 
producers stick to it.
>
> It might also be worthwhile reading the paper from 2003 suggesting transactional tcp for dns based on motivations to limit security (ddos) events. http://www.ne.jp/asahi/bdx/info/depot/IPSJ-JNL4408024-k2r.pdf

Hmm, thanks for that, I missed it completely. It's an interesting 
approach, but different from ours, where we try to apply a quick fix 
that doesn't require any changes on the client side.
>
> The next thought I have, is along the lines of the good old SYN flood attack.. or other security facets... :-)

Well, although our idea was less focused on security, we think the 
solution provides at least the same level of security as plain UDP 
queries.
SYN floods are not a real issue, as we do not allocate nearly as many 
resources as real TCP would. If a TCP SYN arrives, we just store the 
necessary information in the syncache and send a SYN/ACK, without 
allocating a full connection yet. After a timeout the syncache entry 
is deleted.
It should also be possible to enable SYN cookies, although we didn't 
test that.
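
For illustration only, here is a toy sketch of the idea in Python (not 
our FreeBSD kernel code): a SYN only ever costs a small, capped cache 
entry, and entries are either promoted by the final ACK or expire.

# Toy sketch of the syncache idea, not the actual implementation.
import time
from collections import OrderedDict

SYNCACHE_TTL = 30.0    # seconds before an unanswered SYN is forgotten
SYNCACHE_MAX = 15000   # hard cap, so a flood cannot grow the cache unboundedly

syncache = OrderedDict()   # (src_ip, src_port, dst_port) -> (iss, timestamp)

def on_syn(src_ip, src_port, dst_port, iss):
    """Record minimal handshake state and (conceptually) send a SYN/ACK."""
    now = time.monotonic()
    for key, (_, ts) in list(syncache.items()):   # expire stale entries
        if now - ts > SYNCACHE_TTL:
            del syncache[key]
    if len(syncache) >= SYNCACHE_MAX:
        syncache.popitem(last=False)              # evict the oldest entry
    syncache[(src_ip, src_port, dst_port)] = (iss, now)
    # ... a SYN/ACK would go out here; no full connection state is allocated yet

def on_final_ack(src_ip, src_port, dst_port):
    """Only a matching final ACK promotes the entry to a real connection."""
    return syncache.pop((src_ip, src_port, dst_port), None) is not None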


Cheers,
Mat




