Yeah. User attribution and classification of traffic. I wonder how the ISPs do it with their "unmetered" content. Must be something similar?


On 27/03/2014, at 6:22 PM, Geordie Guy <elomis@gmail.com> wrote:

What's this actually for? What problem are you solving? Billing on shared platforms?

G

On 27/03/2014 5:44 PM, "Scott O'Brien" <scott@scottyob.com> wrote:
Wow, so many responses! OK, I'll try to address them all here and explain a bit further.

The system I've built does not currently use AD. Sorry, I should clarify: when I say the user-auth exchange, I'm talking about a RabbitMQ exchange, not AD. It's just the way RabbitMQ routes messages to queue(s) and consumer(s).
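
For anyone who hasn't played with RabbitMQ, here's a minimal sketch of that fanout pattern using the Python pika client. The "user-auth" exchange name is from my setup; the queue handling and callback are illustrative, not my actual code:

import pika

# Connect and declare the fanout exchange. A fanout exchange copies
# every message to all bound queues, so every consumer sees every
# login/logout event.
conn = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
ch = conn.channel()
ch.exchange_declare(exchange="user-auth", exchange_type="fanout")

# Each consumer binds its own exclusive, server-named queue.
result = ch.queue_declare(queue="", exclusive=True)
ch.queue_bind(exchange="user-auth", queue=result.method.queue)

def on_event(channel, method, properties, body):
    print("auth event:", body.decode())

ch.basic_consume(queue=result.method.queue,
                 on_message_callback=on_event, auto_ack=True)
ch.start_consuming()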

On the AD topic though: the netflow consumers look up user auth information in a database collection/table (and cache IP-to-user mappings in memory from login events, by subscribing directly to the user-auth fanout exchange). The database is fed by a different consumer that logs login/logout events from a separate queue, also bound to this user-auth exchange.
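
The lookup path for a consumer then looks roughly like this. A sketch only: I'm showing pymongo, and the collection and field names below are made up for illustration:

from pymongo import MongoClient

db = MongoClient().netflow
ip_cache = {}  # ip -> username, maintained from user-auth fanout events

def lookup_user(ip, flow_ts):
    # Fast path: the in-memory mapping built from login events.
    user = ip_cache.get(ip)
    if user is not None:
        return user
    # Slow path: the auth history in Mongo -- the latest login for
    # this IP at or before the flow's timestamp.
    event = db.auth_events.find_one(
        {"ip": ip, "ts": {"$lte": flow_ts}}, sort=[("ts", -1)])
    return event["user"] if event else None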

The system I fed user auth from works exactly as Jonathan Thorpe described. I'm trialling my tool on our "free wireless" network, which by its very nature runs 802.1X and DHCP. A log reader listens for username->MAC events and ties them to MAC->IP events from the DHCP logs before pushing a login event out to my RabbitMQ user-auth exchange. Doing 802.1X on wired would work too (difficult in some situations), but monitoring logs, captive portals, websites or user agents could equally feed this user-auth exchange. Basically, how you get the events in there is up to the environment it's deployed in.
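
The correlation itself is simple enough that a sketch gets the idea across. The event shapes and the publish() helper here are invented for illustration:

user_by_mac = {}  # MAC -> username, from the 802.1X accounting logs

def on_dot1x_event(event):
    user_by_mac[event["mac"]] = event["username"]

def on_dhcp_lease(event):
    # A DHCP lease completes the username -> MAC -> IP chain, so we
    # can push a login event out to the user-auth exchange.
    user = user_by_mac.get(event["mac"])
    if user:
        publish("user-auth", {"user": user, "ip": event["ip"],
                              "ts": event["ts"], "action": "login"})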

Someone mentioned DB size worries, and a few people have mentioned hooks/triggers in either Postgres or MySQL. I pretend to be many things in my life, but I'm afraid a DBA is not one of them! My worry with Postgres or MySQL is that they're not as easy to scale out as MongoDB (feel free to flame me down on this one though). I think as soon as I start adding hooks and pumping every single netflow entry into the database, it would quickly become my bottleneck. As for storage, I'm making use of MongoDB's ability to do most of the heavy lifting. I store a document for each of my counters (daily_counter, user_daily_counter, etc.). This way I can issue an upsert update to Mongo (that is, if the record I'm looking for doesn't exist, create it) with an increment operation, so instead of reading back the current counters and having to lock tables, I let MongoDB worry about it in one sweep and keep my usage information close to real-time. I'm storing only the counters (plus a capped 5GB collection of netflow with a user column, just for curiosity), so my storage requirements are pretty small; if this does become the bottleneck, I can easily chuck more nodes into my cluster and shard out my datasets.
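
In pymongo terms the upsert-and-increment looks roughly like this (the field names are guesses based on the counter documents I described):

from datetime import datetime, timezone
from pymongo import MongoClient

db = MongoClient().netflow
day = datetime.now(timezone.utc).strftime("%Y-%m-%d")

# One round trip per update: create the counter document if it doesn't
# exist yet, then atomically increment it. No read-then-write cycle and
# no locks held by my code.
db.user_daily_counter.update_one(
    {"user": "scotty", "day": day},
    {"$inc": {"bytes": 123456, "flows": 1}},
    upsert=True)

The upsert means the first flow of the day creates the document, and every flow after that is a single atomic $inc.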

As far as handling a lot of throughput goes, I *think* I've hit the nail on the head. The BGP/netflow collectors (the pmacct suite) can be load balanced if they become a bottleneck; the message queue just needs a lot of RAM, but there are ways to scale it out; my Mongo cluster can scale out if it starts being a bottleneck; and I can always spin up more consumers if they can't keep up with the traffic. With only a few consumers I can currently handle over a hundred Mbit/s of netflow, and I still need to optimise the code to cache some local counters and only hit the database in batches every 30s or so, so I don't think this will be a problem. I've been running it for the past two weeks and it seems very usable (still a few rough patches, but I'm just working on the interface now).
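
The batching I have in mind is nothing fancy: accumulate the $inc deltas locally and flush them in one bulk write every 30s or so. A sketch (again, the names are illustrative, not my actual code):

import time
from collections import defaultdict
from pymongo import MongoClient, UpdateOne

db = MongoClient().netflow
pending = defaultdict(int)   # (user, day) -> byte delta
last_flush = time.monotonic()

def count_flow(user, day, nbytes):
    global last_flush
    pending[(user, day)] += nbytes
    if time.monotonic() - last_flush >= 30:
        # One bulk round trip instead of one update per flow.
        ops = [UpdateOne({"user": u, "day": d},
                         {"$inc": {"bytes": delta}}, upsert=True)
               for (u, d), delta in pending.items()]
        db.user_daily_counter.bulk_write(ops, ordered=False)
        pending.clear()
        last_flush = time.monotonic()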

It's great to see so much interest in this little project! I'm going overseas for the next two weeks or so, but when I get back I'll definitely clean it up a bit and put it up on GitHub with a blog post in the next month or so. Watch this space, I guess.

Thanks again,
- Scotty O


On 27/03/2014, at 1:35 PM, Mark Currie <MCurrie@laserfast.com.au> wrote:

> There are UTMs which can associate data consumption with AD users for a standalone business (such as the Sophos UTM or Bluecoat), but I think Scotty is talking more ISP-grade?
>
> Mark Currie
>
>
> -----Original Message-----
> From: AusNOG [mailto:ausnog-bounces@lists.ausnog.net] On Behalf Of Scott O'Brien
> Sent: Thursday, 27 March 2014 11:53 AM
> To: ausnog@lists.ausnog.net
> Subject: [AusNOG] User-Aware Netflow
>
> G'Day Noggers,
>
> Long time loiterer, first time poster here. At the organisation I've been working at, we've had a requirement to attribute traffic (and the type of traffic) back to a user. Not being able to find any open source tools to do this, I decided to build my own.
>
> I've been building a tool that makes use of pmacct to put netflow and BGP attributes (namely community and AS path) into a central message queue (RabbitMQ). The tool is basically a set of consumers that listen on a user-auth message exchange and have access to auth history in my MongoDB cluster. When a flow comes in, I can add the user who held the destination IP address at the time to the netflow record before storing it in my database, and increment the appropriate counters in Mongo. I'm now working on a front-end (in Meteor) that shows traffic information and per-user usage in near real-time.
>
> There's a little bit of work now to abstract the tools I've built so they're easy for the wider community to use. I'm curious: is this style of IP-based user attribution something that people want/need? How are others tackling this problem? (I know proxies are popular.) If there's demand for it, I'll abstract it, clean it up a bit and put it up on GitHub, but only if it's an area people have found lacking. Ideas and suggestions welcome :-)
>
> Cheers,
> - Scotty O'Brien
>

_______________________________________________
AusNOG mailing list
AusNOG@lists.ausnog.net
http://lists.ausnog.net/mailman/listinfo/ausnog