[AusNOG] QoS on Internet traffic

Tony Miles tmiles42 at gmail.com
Wed Aug 16 21:05:14 EST 2017


Hi all,

Thank you for all of the replies I received, of which the below was the only
on-list response :)

There was a fairly common theme to a lot of the replies:
* yep, I know where you're coming from
* yes, it's sucky
* no, we don't have a good solution for it either

Common suggestions on ways to improve things were:
1. Dedicated connections to content providers
2. Suggestions for firewall products that might help out
3. Trickery with firewalls/NAT and routing

To expand on each of those slightly:

1. Dedicated connections to content providers - the idea being that instead
of just purchasing transit, to set up connections (or peering) directly
with the content providers that are required. We are at peering locations
and we do have direct interconnects to some of the common providers that
our clients are accessing. We add more where it's feasible, but we can't
just add in an interconnect to every single application provider that one
of our clients wants to connect to as this is too cost intensive. It goes
some way towards solving the QoS problem though in that it makes it easier
to tag packets as they come to us, but is still not a magic bullet and
certainly has limitations as to how/when it will work when clients are free
to choose ANY of the applications that are out there.

2. Suggestions for firewall products that might help out - the list of
products suggested includes Fortigate, Barracuda, Palo Alto, Kerio,
Forcepoint & Mikrotik (hope I didn't forget any!). There are also the
suggestions for gear that could match/mark traffic such as NetEqualizer,
Exinda, Procera, Mikrotik and possibly Riverbed. All of these (with the
exception of Mikrotik) seem to be a lot more "enterprise" feature-rich, and
hence more expensive, than what we think we need. Interestingly I got almost
equal numbers of recommendations for Mikrotik and warnings to be cautious
of them ?!

3. Trickery with firewalls/NAT and routing - this includes doing things
like provisioning another port (or virtual firewall) and routing traffic
to/from only the "important" sites via this secondary path. This traffic on
the secondary path would then not be subject to the congestion that the
main traffic might be with general browsing. We actually do this already in
some cases (for some VoIP/VC stuff).
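
As a rough illustration of the path selection in suggestion 3, here is a
minimal Python sketch that decides whether a destination should use the
secondary path. The prefixes are hypothetical placeholders taken from the
documentation address ranges, not real provider ranges, and the names
PRIORITY_PREFIXES and select_path are made up for this example. In practice
the same decision would live in the router's policy routing configuration
rather than in a script.

```python
import ipaddress

# Hypothetical prefixes for the "important" sites (documentation ranges only).
PRIORITY_PREFIXES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/25"),
]


def select_path(dst: str) -> str:
    """Return 'secondary' if dst falls inside a priority prefix, else 'main'."""
    addr = ipaddress.ip_address(dst)
    if any(addr in net for net in PRIORITY_PREFIXES):
        return "secondary"
    return "main"


if __name__ == "__main__":
    for dst in ("198.51.100.20", "203.0.113.200", "192.0.2.1"):
        print(dst, "->", select_path(dst))
```

The obvious limitation is that the prefix list has to be maintained by hand,
and cloud providers can change their address ranges without notice.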


At this stage I'm probably going to focus on investigating what the
Mikrotik product can do in regard to classifying traffic (even
transparently, as one person suggested) as a place to start. The main reason
is that all of the firewalls and infrastructure are already in place and
this would be a "drop in front of the firewall" solution, which would mean
not having to touch any of the existing firewall configuration. It would
also work (if it works at all!) where the client has their own colo'd firewall
that we don't control at all. Following this path would also hopefully tie
in with the other objective of getting the important traffic tagged with
the appropriate DSCP so that the existing WAN QoS can also apply to this
traffic. If it doesn't yield the desired responses, then I have other
options of things I can try.
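
To make the classify-then-tag idea concrete, here is a small Python sketch
of the mapping involved. It assumes the classifier can see a hostname for a
flow (for example from DNS or TLS SNI), which may not always be possible.
The hostnames, class names and policy table are hypothetical. The DSCP
codepoints themselves (EF = 46, AF41 = 34, AF21 = 18) are the standard
values. This is not how any particular product implements it, only the
logic that a marking rule set would need to express.

```python
# Standard DSCP codepoints (EF from RFC 3246, AFxy from RFC 2597).
DSCP = {"EF": 46, "AF41": 34, "AF21": 18, "BE": 0}

# Hypothetical per-client policy: traffic class -> DSCP name.
CLASS_TO_DSCP = {"voip": "EF", "video-conf": "AF41", "business-app": "AF21"}

# Hypothetical hostnames a client has nominated as important.
HOST_CLASSES = {
    "vc.example.com": "video-conf",
    "accounts.example.net": "business-app",
}


def classify(hostname: str) -> str:
    """Map a hostname to a traffic class; unknown hosts fall into 'default'."""
    host = hostname.lower().rstrip(".")
    for suffix, traffic_class in HOST_CLASSES.items():
        if host == suffix or host.endswith("." + suffix):
            return traffic_class
    return "default"


def dscp_for(hostname: str) -> int:
    """Return the DSCP value to mark on packets for this hostname."""
    return DSCP[CLASS_TO_DSCP.get(classify(hostname), "BE")]


def tos_byte(dscp: int, ecn: int = 0) -> int:
    """DSCP is the upper six bits of the IPv4 ToS / IPv6 Traffic Class byte."""
    return (dscp << 2) | (ecn & 0x3)


if __name__ == "__main__":
    for name in ("bridge1.vc.example.com", "accounts.example.net", "updates.example.org"):
        d = dscp_for(name)
        print(f"{name}: DSCP {d}, ToS byte 0x{tos_byte(d):02x}")
```

Anything that does not match a nominated hostname stays best effort (DSCP
0), so the existing WAN queues would treat it the same way they do today.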


Thanks again to everyone who took the time to respond; it's certainly
given me a place to start and lots of things to think about.



regards,
Tony.




On Tue, Aug 15, 2017 at 1:13 PM, James Andrewartha <trs80 at ucc.gu.uwa.edu.au>
wrote:

> Hi Tony,
>
> Apologies for top-posting but I don't want my replies to be lost in your
> excellent statement of the problem.
>
> I'm on the EDUCAUSE NETMAN (university network admins) list and the
> discussion about bandwidth shaping comes up occasionally. Since they're
> mostly in the US, a lot of them are dumping their bandwidth shapers and
> just buying bigger pipes, which is not that feasible here. The other
> options suggested are NetEqualizer which basically dynamically shapes down
> bandwidth hogs at time of congestion, Exinda which has thousands of
> application signatures, or Procera, which I get the feeling is expensive.
> There's also Qwilt, but that's more about caching Netflix so it doesn't
> hit transit. These are all roughly option #4 solutions.
>
> For option #3, Palo Alto or Forcepoint NGFW (formerly Stonesoft) might be
> worth looking at. And if you have deep pockets there's always Riverbed.
>
> On Tue, 15 Aug 2017, Tony Miles wrote:
> > The typical example might be a customer that has five sites that we
> provide a 20Mbps private WAN tail into (per site) and then
> > we have a centralised hosted firewall that all sites access the internet
> via. The speed on the central firewall might be capped
> > to something like 50M (all abbreviations using "M" refer to "Mbps"
> hereafter). The WAN we provide supports QoS so that if a
> > client has an application that is important to them it can be tagged and
> put in an appropriate queue and treated accordingly.
> > Examples of this might be that they have an RDP server at the head
> office site or they have VoIP PBX gear at each location. The
> > central Internet access is oversubscribed 2:1 in this example (100M of
> WAN tails on 50M of Internet). At this point I think this
> > is all fairly standard stuff that a lot of the people on this list would
> be familiar with (hopefully?). When I am using this
> > example, it is just an example, this is of course multiplied by the
> number of clients we have, who are all generically fairly
> > similar, but with each one having different specific details (different
> speeds, different things they consider important).
> >
> > With the move to cloud everything clients are moving from hosting stuff
> themselves (ie. on their own servers/WAN) to things that
> > are hosted generically on the Internet. This might be their accounting
> application, might be video conferencing or voip services
> > or any number of other things that for whatever reason they have chosen
> to procure "as a service" rather than buying the thing
> > and hosting it locally on premises.
> >
> > When everything is running normally and there is no excess volume of
> traffic nobody complains, but the first time
> > $someone_important is on a video conference call to an interstate office
> and the quality is crap because Windows updates are
> > sucking all of the Internet bandwidth the question then becomes "please
> fix this, we purchase a WAN with QoS". The VC one is
> > particularly nasty because the conference bridge is in the cloud and so
> a VC session between three locations that are all on the
> > same private WAN (with potentially plenty of bandwidth) is effectively
> 3x VC sessions to the Internet.
> >
> > Historically our answer has been "it's the Internet, there is no QoS",
> which has sufficed for a while, but it's gotten to the
> > stage where EVERYTHING is now "in the cloud" and that answer is slowly
> losing traction. This combined with the fact that others
> > out there are promising (rightly or wrongly) that they can solve the
> problem for the client means we continue to ignore it at
> > our peril.
> >
> > I should probably add that we DO provide on-net VoIP & VC services for
> clients that we can (and do) support properly with QoS
> > but clients are free to use or not use them as they wish and there are
> any number of reasons why they might choose a different
> > Internet based provider of these services (price, features, integration,
> historical, etc). There is also the whole range of
> > other hosted applications that a client might want to access that we
> don't host internally and can't get some sort of cross
> > connect or other arrangement in place to bring the traffic in via
> something other than Internet transit.
> >
> > Our Internet topology is like this (arrows indicating inbound/downstream
> traffic flow):
> >
> > [$transit_provider] ---> [border router] ---> [core router] --->
> [firewall] ---> {private WAN}
> >
> >
> > Right now we shape outbound/egress on the core router towards the
> firewall to the speed that is purchased by the client (eg. in
> > above example 50M). It makes no difference what sort of policy we apply,
> right now it's just a plain "shape default queue to x".
> > We COULD in theory apply a proper QoS policy that puts stuff in queues
> and provides the required bandwidth to those queues. The
> > only thing preventing this is the classification of the traffic (ie. how
> to decide what goes in each queue). To do this
> > effectively would (I imagine) require something that can do L7
> inspection of traffic to see that something is
> > "https://important_site.com" and apply appropriate DSCP marking to the
> packets. This is of course something that our core
> > routers can not do (L7 classification).
> >
> > Options that I've considered:
> >
> > 1. Continue with "Internet => no QoS" - the whole point of this post is
> that this position is becoming less viable as everything
> > moves to being "cloud based" or as we like to call it "Internet hosted".
> We can continue this stance at our own peril, but we
> > all know that it is 10x easier to retain existing clients than try and
> find new ones, so we want to retain existing clients.
> >
> > 2. increase bandwidth to the firewalls - in the above example the
> firewall bandwidth is 50M and the total of the WAN tails is
> > 100M. We could (ignoring the screams coming from the accountants for
> now) simply increase the bandwidth to each firewall so that
> > there is no longer any oversubscription (eg. 100M in my example). This
> wouldn't solve the problem however as the entirety of the
> > bandwidth to the firewall could still be consumed and not enough left
> for the "important" things. All we've done is give the
> > clients more Internet bandwidth, but not actually solved the problem. It
> also doesn't help if there is WAN congestion between
> > the sites as all Internet traffic is still going to be treated equally
> in the case of congestion.
> >
> > 3. Not shape/police to the firewall - instead use a firewall that can
> classify traffic and shape/queue outbound on its LAN
> > interface (ie. towards the private WAN cloud). This seems attractive in
> the first instance, but there are a couple of things
> > going against it. The first is that a lot of the firewalls are provided
> as managed firewalls by us and so we control them, BUT a
> > number of clients (mostly the larger ones with their own IT resources)
> have their own firewall (hosted in our racks) that they
> > manage. Telling clients that they are required to shape their firewall
> to <speed> and not shaping it for them (upstream) seems
> > like a very trusting thing to do and I don't think that would go well
> (surely nobody would abuse it ?!). The way of preventing
> > the abuse is simply to police inbound on the core router the LAN of the
> firewall is connected to, so that if the client doesn't
> > shape to (eg.) 50M, then it gets policed to 50M anyway and their QoS
> becomes broken by the policer.
> >
> > 4. Find some device to classify traffic - ideally if we could stick a
> device of some sort between the border routers and core
> > routers that could do L7 classification of traffic and tag DSCP
> appropriately then we could do what we need without too many
> > other changes. Does such a "thing" exist ? Can anyone point me in the
> direction of something that would do this ?
> >
> > Having the traffic classified and tagged (DSCP) is the ideal solution as
> this then allows the QoS on the WAN portion to work as
> > well. No point eliminating the firewall/Internet as the problem only to
> have the VC session be crappy because there is a file
> > transfer happening between two sites.
> >
> > Talking about firewalls, can anyone recommend a firewall that does what is
> required for option #3 above. Need something that can
> > classify traffic, tag DSCP on it and then shape/queue outbound on the
> LAN interface appropriately. Needs to be a VM device or
> > something that supports proper virtualisation for separate individual
> clients properly (and can manage clients individually as
> > well). This possibly seems like it might be the best option if we can
> find the appropriate platform to do what we require that
> > fits all of the other requirements as well.
> >
> > I think that's all I've got for now. Thanks for your patience in even
> reading this far. Happy to discuss privately with people
> > if you don't want to post something publicly.
>
>
> --
> # TRS-80              trs80(a)ucc.gu.uwa.edu.au #/ "Otherwise Bub here
> will do \
> # UCC Wheel Member     http://trs80.ucc.asn.au/ #|  what squirrels do
> best     |
> [ "There's nobody getting rich writing          ]|  -- Collect and hide
> your   |
> [  software that I know of" -- Bill Gates, 1980 ]\  nuts." -- Acid Reflux
> #231 /