• Murali

  • Network Operating System?

    I've just begun dealing with Software Defined Networks (SDN) for my Master's thesis, and I'm experimenting on top of Floodlight, an open source OpenFlow controller from Big Switch Networks. In OpenFlow, a logically centralised entity known as the controller controls the forwarding tables of a bunch of switches that speak OpenFlow. OpenFlow applications then talk to the controller using some controller-specific API to 'program' the network (manipulate forwarding tables on the switches). The high-level architecture looks something like this:

    Just like an operating system abstracts away the complexities of the underlying hardware for a user-space application, the controller abstracts away the complexities of the network for OpenFlow applications. For this reason, the controller is often referred to as a "network operating system". Applications have some API to talk to the network-OS, and it translates those APIs into OpenFlow commands that control the switches.

    For my thesis, I planned an architecture with two applications that provide different services to the network and are expected to run simultaneously. Both collect information from the OpenFlow switches, and from some framework-specific agents at the edges of the network, to make optimisation decisions. But as soon as I implemented one of the applications, it was clear that I had no straightforward way of ensuring that the two wouldn't make decisions that counteract each other. Although I really don't like the idea, the easiest way to solve this is to wrap both applications into one. And from the looks of it, this is a problem that hasn't been solved yet.

    Controllers like NOX and Onix make the assumption that only one OpenFlow application is running on a given network at any point in time. This is a reasonable assumption from a systems perspective. But what's gotten me confused is how OpenFlow applications fit into the "SDN for enterprises" picture. I was under the impression that a network operator using a particular controller could choose between different third-party OpenFlow applications to handle different complexities in the network: a load balancing application from vendor A for the edge, a routing daemon application from vendor B, and so forth. While these are relatively orthogonal applications, it looks like it's possible for two OpenFlow applications to make decisions and choices that adversely affect each other (leading to oscillations in switch state). Floodlight allows you to run multiple applications at the same time, but leaves it to the developer (or user?) to ensure that applications can safely co-exist with each other.

    So again, if my observation isn't mistaken, how do OpenFlow applications fit cleanly into the SDN ecosystem? How can I manage my network using building blocks of applications from different vendors? Will I need to rely on OneBigApplianceFromBigBadVendor per network? Does this necessitate something analogous to per-process resource allocation as in traditional operating systems? I can see that FlowVisor-style slicing is one way to go about it, but will that suffice?

    So what *should* the network operating system do here? Let the applications run wild and fight it out? Or provide some mechanism to enforce policies between applications?
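
    To make the "enforce policies between applications" option a bit more concrete, here is a minimal sketch of the kind of check a controller could run before pushing rules to a switch. Everything in it is hypothetical: FlowRule, overlaps and find_conflicts are names I made up for illustration, not part of Floodlight, NOX, Onix or the OpenFlow spec, and real flow matches (priorities, masks, wildcards) are a lot richer than a dictionary of exact values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRule:
    """Hypothetical, heavily simplified flow rule (not a real controller API)."""
    app: str      # name of the application that produced the rule
    match: dict   # header field -> exact value; a missing field is a wildcard
    action: str   # e.g. "output:2" or "drop"


def overlaps(a: FlowRule, b: FlowRule) -> bool:
    """Two rules overlap if some packet could match both of them.

    With exact-value matches, that holds when every field that both
    rules constrain is constrained to the same value.
    """
    shared = a.match.keys() & b.match.keys()
    return all(a.match[f] == b.match[f] for f in shared)


def find_conflicts(rules):
    """Return pairs of overlapping rules from different apps with different actions."""
    conflicts = []
    for i, a in enumerate(rules):
        for b in rules[i + 1:]:
            if a.app != b.app and a.action != b.action and overlaps(a, b):
                conflicts.append((a, b))
    return conflicts


# Made-up rules from two made-up applications.
rules = [
    FlowRule("load_balancer", {"ip_dst": "10.0.0.1"}, "output:2"),
    FlowRule("routing", {"ip_dst": "10.0.0.1", "tcp_dst": 80}, "output:3"),
    FlowRule("routing", {"ip_dst": "10.0.0.9"}, "drop"),
]

for a, b in find_conflicts(rules):
    print(f"{a.app} and {b.app} disagree on packets matching {a.match} / {b.match}")
```

    Detecting the overlap is the easy part. What to do about it (reject one rule, rank the applications, ask the operator) is exactly the policy question above, and the sketch deliberately leaves it open.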

    If I am indeed mistaken in my assumption, please do let me know what I'm missing here! :)

  • 11 days to WNS3 2012

    If you're not aware already, the 4th International Workshop on NS-3 (WNS3) 2012 is just around the corner. We're almost done with the organising and have a very interesting lineup of presentations. Tom Henderson and Mathieu Lacage will be giving keynote talks. We then have 12 full paper presentations spread across the day, and a shared poster session with the OMNeT++ workshop for which we have 8 poster/demo submissions from our side. Have a look at the final program here.

    If you're an ns-3 user, don't miss this chance to share ideas and learn more about what researchers from around the globe are up to with the project. Don't forget, there's also the developers' meeting the next day. Just add yourself to the wiki if you're interested in attending in person or remotely.

    Now if only I can find a damn youth hostel in Desenzano...

  • tcp_probe no workey?

    tcp_probe is a very handy utility in Linux to observe TCP flows. Implemented as a Linux kernel module, all you have to do is:

    $: modprobe tcp_probe [port=<port>]

    $: cat /proc/net/tcpprobe > logfile.out

    ...and you'll get clear statistics about whatever TCP flows go through 'port' (and of all TCP flows if you specify the port as 0).

    Earlier today, I ran into a situation where tcp_probe would stop logging flows a few seconds into an experiment I was trying (which involved iperf-ing over a wireless link), and it always seemed to happen around the 10th second. Some quick searches made it clear that others were encountering this as well, but no one really had a solution for it. Odd.

    And what do you do when Google can't help you find the answer? You look through the source code of course!

    Within a few seconds of going through tcp_probe.c, I had my answer before me. Have a look at lines 47-49 and you'll know what was wrong in my case.

    static int full __read_mostly;

    MODULE_PARM_DESC(full, "Full log (1=every ack packet received, 0=only cwnd changes)");

    module_param(full, int, 0);

    In short, I was on a wireless channel without any nodes apart from my wireless interface and the access point, and my congestion window size was getting maxed out around the 10th second. Since tcp_probe by default logs a packet only if the congestion window size changes, I couldn't see any more packets of the flow I was looking at in /proc/net/tcpprobe.
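
    Here's a tiny Python sketch of that logging rule, just to show why the log goes quiet. It only models the condition described above (log when full is set, or when the congestion window differs from the previous one); the function names and the list of window sizes are made up for illustration, and it is not a port of the kernel code.

```python
def should_log(cwnd, last_cwnd, full=0):
    """Model of the rule: log every ACK if full=1, else only on cwnd changes."""
    return bool(full) or cwnd != last_cwnd


def simulate(cwnd_per_ack, full=0):
    """Return the (ack_index, cwnd) pairs that would end up in the log."""
    logged = []
    last_cwnd = None  # nothing logged yet, so the first ACK always counts as a change
    for i, cwnd in enumerate(cwnd_per_ack):
        if should_log(cwnd, last_cwnd, full):
            logged.append((i, cwnd))
            last_cwnd = cwnd
    return logged


# Made-up trace: the window grows, then sits at its maximum.
trace = [10, 20, 40, 40, 40, 40]

print("full=0:", simulate(trace, full=0))
print("full=1:", simulate(trace, full=1))
```

    With full=0 the entries stop as soon as the window stops changing, even though ACKs keep arriving. With full=1 every ACK in the trace is logged.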

    So the solution?

    $: modprobe tcp_probe <args> full=1

    Note that you might want to look at the log buffer size parameter (bufsize) as well, because tcp_probe happily ignores packets once your log buffer is filled.

  • Fun with TCP CUBIC

    I've been working with the folks at Deutsche Telekom Laboratories for my Master's thesis, where I'm dealing with software defined networking, wireless networks, and unicorns. It feels nice to be in a place where people understand both distributed systems AND networks (I still don't understand why there isn't an overlap there, but heck). From discussions on rate adaptation around my desk, and all the way up to distributed state consistency around the coffee machine, this place has it all. I'm having a gala time here worrying about the registers on a wireless card, wading through the zillion 802.11 amendments, hacking device drivers, and pretty much analysing interactions at every layer of the network stack.

    My workflow has mostly involved taking measurements to make sure that the problem I'm trying to solve is indeed a problem, trying to reason about the resulting measurements in terms of protocol behaviour/interleaving, and exploring what I can do to improve these measurements in a manner that is practical. As part of the work, however, I needed some baseline measurements for TCP behaviour so that at the end of my thesis, I can say something like, "See? After what I did, we now have MOAR TCP AWESUM!!111"

    Of course, TCP and wireless are two things that don't really get along well. TCP sees packet loss and delays as "congestion", and depending on the congestion avoidance algorithm, will back down upon hitting such a situation. However, wireless networks suck (and will continue to suck). You can make it suck less probably, but it'll always stay above the suck-thresh. Interference in the physical channel will always be there (like memories of The Phantom Menace, which will haunt us forever). Furthermore, if there's another station on the same channel communicating over a lower bit rate, your station's performance can be degraded real bad (known as the 802.11 performance anomaly). This means that Murphy in the physical layer (only your first hop in most cases!), can end up causing TCP to believe that there is congestion at any of the many links that lead to the destination, and actually back down.

    Linux has used TCP CUBIC as the default congestion control algorithm since kernel version 2.6.19. Without going into details of how the different TCP thresholds work in CUBIC, let's look at what its congestion window (CWND) curve looks like.

    The above is a measurement of the CWND size on a per-packet basis, for a flow generated using iperf. The test network consisted of my laptop running the iperf client with an access point one hop away, bridging all traffic to a server behind it via a switched LAN. I'm sharing a wireless channel with a whole bunch of other nodes. The red line is the CWND size, and the green line is the slow-start threshold (ssthresh) value. As for some quick TCP background, the red line indicates how much data TCP is willing to keep outstanding, and the green line indicates the threshold at which TCP says, "Ok, I think I should not get too greedy now, and will back off if I feel the network is congested".
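
    If you want to produce this kind of plot yourself, a sketch like the following pulls the two curves out of a tcp_probe log. The field positions are an assumption on my part: on the kernels I've looked at, each line starts with a timestamp, the source and destination address:port, the packet length, snd_nxt and snd_una, followed by snd_cwnd and ssthresh. Check the output on your own kernel before trusting the indices. The sample line is made up.

```python
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    time: float     # seconds since the probe started
    cwnd: int       # congestion window
    ssthresh: int   # slow-start threshold


# Assumed layout of one whitespace-separated tcp_probe line:
# 0 time, 1 src, 2 dst, 3 length, 4 snd_nxt, 5 snd_una, 6 snd_cwnd, 7 ssthresh, ...
CWND_FIELD = 6
SSTHRESH_FIELD = 7


def parse_line(line: str) -> Optional[Sample]:
    """Parse one log line, returning None for lines that don't fit the layout."""
    fields = line.split()
    if len(fields) <= SSTHRESH_FIELD:
        return None
    try:
        return Sample(
            time=float(fields[0]),
            cwnd=int(fields[CWND_FIELD]),
            ssthresh=int(fields[SSTHRESH_FIELD]),
        )
    except ValueError:
        return None


def parse_log(lines):
    """Parse many lines, silently skipping the ones that can't be parsed."""
    return [s for s in (parse_line(l) for l in lines) if s is not None]


# Made-up line in the assumed layout, for illustration only.
example = "2.221032535 10.0.0.2:49152 10.0.0.1:5001 32 0x1a2b3c4d 0x1a2b0000 20 2147483647 14592 85"
print(parse_line(example))
```

    From there, plotting time against cwnd and ssthresh with your tool of choice gives the red and green lines.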

    The graph above is a typical TCP CUBIC graph. When the congestion window (red) is below the ssthresh value (green), TCP is in slow start and the CWND grows exponentially (really fast, despite the name). This can be seen in the near-vertical spikes around seconds 3, 9, 35, and 37. When the CWND value is higher than the ssthresh value, TCP CUBIC plays safe and follows its cubic curve: the window grows in a concave manner, ever more slowly, as it approaches the size at which it last saw a loss. For instance, look at how slowly the stretch between 22 and 30 is growing.

    As you can see, TCP is detecting congestion multiple times, thanks to a noisy channel, and adjusts itself accordingly.

    Next, we look at the curve in a channel where there is no contention at all (that is, no one else on the channel except my wireless interface and the access point I'm talking to).

    Clearly there is no congestion here. The CWND value crosses the slow-start threshold very early (around the 1st second), plateaus from that point, and then goes convex around the 5th second. It then keeps probing for more bandwidth, finds it, and increases the window size steadily over time.
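
    For the curious, the concave, plateau, convex shape falls straight out of the window function CUBIC is named after. Below is a small sketch of that function as written in the CUBIC specification (RFC 8312), W(t) = C(t - K)^3 + W_max, using the constants the RFC suggests (C = 0.4, beta = 0.7). This is the textbook formula, not the Linux implementation, which uses scaled integer arithmetic and differs in the details. The W_max of 100 segments is an arbitrary value picked for illustration.

```python
C = 0.4      # scaling constant suggested by RFC 8312
BETA = 0.7   # multiplicative decrease factor suggested by RFC 8312


def cubic_k(w_max: float) -> float:
    """Time (seconds) the curve takes to climb back to w_max after a loss."""
    return (w_max * (1 - BETA) / C) ** (1 / 3)


def cubic_window(t: float, w_max: float) -> float:
    """Window size t seconds after the last loss: C * (t - K)^3 + w_max."""
    return C * (t - cubic_k(w_max)) ** 3 + w_max


W_MAX = 100.0  # arbitrary window size (in segments) at the last loss
k = cubic_k(W_MAX)

for t in (0.0, k / 2, k, 1.5 * k, 2 * k):
    print(f"t = {t:5.2f} s  window = {cubic_window(t, W_MAX):7.2f}")
```

    Before t = K the curve is concave and flattens out as it nears W_max (the plateau). Past K, with no loss to stop it, it turns convex and probes for more bandwidth faster and faster, which matches the shape of the uncontended run above.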