[AusNOG] QoS on Internet traffic

Paul Vinton paul at vintek.net
Tue Aug 15 14:12:26 EST 2017


Hi Tony,

I'd suggest a FortiGate. The right model depends on the speed you need, but look at the 81 series or 91 series.

Thanks,


Paul



Paul Vinton
Director
Email: paul at vintek.net

Visit us at www.vintek.net
Level 1, 108 King William Street, Adelaide, South Australia 5000
Phone: 1300 001 337  Fax: (08) 8125 2985
'Get it right first time'

On 15 Aug 2017, at 2:07 pm, Tony Miles <tmiles42 at gmail.com> wrote:

Hi all,


I'm not sure if anyone else is having this issue, but we are receiving an increasing number of requests to give priority/preference to specific Internet traffic.

Apologies in advance for the lengthy post.

The typical example might be a customer that has five sites, each with a 20Mbps private WAN tail that we provide, and a centralised hosted firewall that all sites access the Internet via. The speed on the central firewall might be capped to something like 50M (all abbreviations using "M" refer to "Mbps" hereafter). The WAN we provide supports QoS so that if a client has an application that is important to them it can be tagged, put in an appropriate queue and treated accordingly. Examples of this might be that they have an RDP server at the head office site or they have VoIP PBX gear at each location. The central Internet access is oversubscribed 2:1 in this example (100M of WAN tails on 50M of Internet). At this point I think this is all fairly standard stuff that a lot of the people on this list would be familiar with (hopefully?). This is just an example; it is of course multiplied by the number of clients we have, who are all broadly similar, but each with different specific details (different speeds, different things they consider important).
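
To make the arithmetic in the example concrete, here is a minimal sketch of the oversubscription calculation. The function name is made up for illustration and is not from any real tooling; the numbers are the ones from the example above.

```python
def oversubscription_ratio(tail_speeds_mbps, internet_cap_mbps):
    """Return total WAN tail bandwidth divided by the Internet cap."""
    if internet_cap_mbps <= 0:
        raise ValueError("internet cap must be positive")
    return sum(tail_speeds_mbps) / internet_cap_mbps


# Five sites with 20M tails behind a central firewall capped at 50M.
tails = [20] * 5
ratio = oversubscription_ratio(tails, 50)
print(f"{sum(tails)}M of tails on 50M of Internet = {ratio:g}:1")
```

Raising the cap to 100M brings the ratio to 1:1, which removes the oversubscription but does nothing about which traffic gets the bandwidth.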

With the move to cloud everything, clients are moving from hosting stuff themselves (ie. on their own servers/WAN) to things that are hosted generically on the Internet. This might be their accounting application, video conferencing, VoIP services or any number of other things that for whatever reason they have chosen to procure "as a service" rather than buying the thing and hosting it locally on premises.

When everything is running normally and there is no excess volume of traffic, nobody complains. But the first time $someone_important is on a video conference call to an interstate office and the quality is crap because Windows updates are sucking up all of the Internet bandwidth, the question becomes "please fix this, we purchased a WAN with QoS". The VC one is particularly nasty because the conference bridge is in the cloud, so a VC session between three locations that are all on the same private WAN (with potentially plenty of bandwidth) is effectively 3x VC sessions to the Internet.

Historically our answer has been "it's the Internet, there is no QoS", which has sufficed for a while, but it's gotten to the stage where EVERYTHING is now "in the cloud" and that answer is slowly losing traction. Combine this with the fact that others out there are promising (rightly or wrongly) that they can solve the problem for the client, and we continue to ignore it at our peril.

I should probably add that we DO provide on-net VoIP & VC services for clients that we can (and do) support properly with QoS but clients are free to use or not use them as they wish and there are any number of reasons why they might choose a different Internet based provider of these services (price, features, integration, historical, etc). There is also the whole range of other hosted applications that a client might want to access that we don't host internally and can't get some sort of cross connect or other arrangement in place to bring the traffic in via something other than Internet transit.

Our Internet topology is like this (arrows indicating inbound/downstream traffic flow):

[$transit_provider] ---> [border router] ---> [core router] ---> [firewall] ---> {private WAN}


Right now we shape outbound/egress on the core router towards the firewall to the speed that is purchased by the client (eg. 50M in the above example). It makes no difference what sort of policy we apply; right now it's just a plain "shape default queue to x". We COULD in theory apply a proper QoS policy that puts stuff in queues and provides the required bandwidth to those queues. The only thing preventing this is the classification of the traffic (ie. how to decide what goes in each queue). To do this effectively would (I imagine) require something that can do L7 inspection of traffic to see that something is "https://important_site.com" and apply appropriate DSCP marking to the packets. L7 classification is of course something that our core routers cannot do.
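
As a rough illustration of the classification step described above, here is a minimal sketch that maps a hostname (as an L7 device might learn it from DNS or TLS SNI) to a DSCP value. The DSCP code points are the standard ones; the policy table, the two ".example" hostnames and the choice of class for each entry are assumptions made up for illustration, and a real device would do this in its forwarding path, not in Python.

```python
# Standard DSCP code points. Which class a given site gets is an assumption.
DSCP = {"EF": 46, "AF41": 34, "AF21": 18, "BE": 0}

# Hypothetical per-client policy: domain suffix -> DSCP class name.
POLICY = {
    "important_site.com": "AF21",
    "vc-bridge.example": "AF41",
    "voip.example": "EF",
}


def classify(hostname, policy=POLICY):
    """Return the DSCP value for a hostname, matching on whole domain labels."""
    host = hostname.lower().rstrip(".")
    for suffix, cls in policy.items():
        if host == suffix or host.endswith("." + suffix):
            return DSCP[cls]
    return DSCP["BE"]  # anything unlisted stays best effort


def tos_byte(dscp):
    """DSCP occupies the top six bits of the IP ToS / Traffic Class byte."""
    return dscp << 2


for name in ("www.important_site.com", "updates.vendor.example"):
    print(name, "-> DSCP", classify(name))
```

Once packets carry a marking like this, the existing shaper on the core router (and the WAN QoS behind it) only needs to match on DSCP, which it can already do.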

Options that I've considered:

1. Continue with "Internet => no QoS" - the whole point of this post is that this position is becoming less viable as everything moves to being "cloud based" or as we like to call it "Internet hosted". We can continue this stance at our own peril, but we all know that it is 10x easier to retain existing clients than to try and find new ones, so keeping existing clients happy matters.

2. Increase bandwidth to the firewalls - in the above example the firewall bandwidth is 50M and the total of the WAN tails is 100M. We could (ignoring the screams coming from the accountants for now) simply increase the bandwidth to each firewall so that there is no longer any oversubscription (eg. 100M in my example). This wouldn't solve the problem, however, as the entirety of the bandwidth to the firewall could still be consumed, leaving not enough for the "important" things. All we've done is give the clients more Internet bandwidth without actually solving the problem. It also doesn't help if there is WAN congestion between the sites, as all Internet traffic is still going to be treated equally in the case of congestion.

3. Not shape/police to the firewall - instead use a firewall that can classify traffic and shape/queue outbound on its LAN interface (ie. towards the private WAN cloud). This seems attractive in the first instance, but there are a couple of things going against it. The first is that a lot of the firewalls are provided as managed firewalls by us and so we control them, BUT a number of clients (mostly the larger ones with their own IT resources) have their own firewall (hosted in our racks) that they manage. Telling clients that they are required to shape their firewall to <speed> and not shaping it for them (upstream) seems like a very trusting thing to do and I don't think that would go well (surely nobody would abuse it?!). The way of preventing the abuse is simply to police inbound on the core router interface that the firewall's LAN side is connected to, so that if a client doesn't shape to (eg.) 50M, it gets policed to 50M anyway and their QoS is broken by the policer.

4. Find some device to classify traffic - ideally if we could stick a device of some sort between the border routers and core routers that could do L7 classification of traffic and tag DSCP appropriately, then we could do what we need without too many other changes. Does such a "thing" exist? Can anyone point me in the direction of something that would do this?
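
To illustrate the last point in option 3 (an upstream policer breaking the client's QoS), here is a minimal sketch of a single-rate token bucket policer that ignores DSCP. The class, the burst size, the packet mix and the 100M offered load are all assumptions picked for illustration; real router policers differ in detail.

```python
class Policer:
    """Single-rate token bucket that drops anything over the rate, ignoring DSCP."""

    def __init__(self, rate_mbps, burst_bytes):
        self.rate_bytes_per_s = rate_mbps * 1_000_000 / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0

    def allow(self, now, size_bytes):
        # Refill tokens for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


def simulate(offered_mbps=100, policed_mbps=50, packets=3000, size=1500):
    """Send an unshaped stream through the policer and count drops per class."""
    policer = Policer(policed_mbps, burst_bytes=15000)
    interval = size * 8 / (offered_mbps * 1_000_000)  # seconds between packets
    drops = {"voice": 0, "bulk": 0}
    for i in range(packets):
        cls = "voice" if i % 3 == 0 else "bulk"  # every third packet is voice
        if not policer.allow(i * interval, size):
            drops[cls] += 1
    return drops


print(simulate())
```

Because the policer has no idea which packets matter, the marked "voice" packets are dropped alongside the bulk traffic as soon as the firewall sends faster than the policed rate.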


Having the traffic classified and tagged (DSCP) is the ideal solution as this then allows the QoS on the WAN portion to work as well. No point eliminating the firewall/Internet as the problem only to have the VC session be crappy because there is a file transfer happening between two sites.


Talking about firewalls, can anyone recommend a firewall that can do what is required for option #3 above? We need something that can classify traffic, tag DSCP on it and then shape/queue outbound on the LAN interface appropriately. It needs to be a VM device or something that supports proper virtualisation for separate individual clients (and can manage clients individually as well). This seems like it might be the best option if we can find an appropriate platform that does what we require and fits all of the other requirements as well.


I think that's all I've got for now. Thanks for your patience in even reading this far. Happy to discuss privately with people if you don't want to post something publicly.


Thanks again,
Tony.
