10Gbit Intel NICs and pfSense

As 10Gbit is becoming more and more common in large networks to cope with the ever increasing amount of data moved between service providers and users, I have taken some time to look into what others have done when working with pfSense. I've limited my research to systems based on Intel 10Gbit NICs, as they're the most cost effective – high performance at low cost – and I've had really good experiences with the 1Gbit Intel NICs.

First off, the NICs have to be compatible with pfSense, which is based on FreeBSD. Looking at [1], Intel provides a FreeBSD driver (FreeBSD 7.3 and above) for NICs based on the 82598 and 82599 chips, as well as the X540 cards, so there are cards out there that provide 10Gbit capabilities for FreeBSD and therefore also pfSense. However, looking at the available configurations, it is only possible to get two interfaces on one NIC with multimode fiber or copper, not with single mode fiber. The price of the single mode fiber variants also more or less excludes them, as they're twice the price of the copper variants, even with two interfaces. On a final note, the Intel 10Gbit NICs all require PCIe 2.0 or higher as a minimum, which excludes a lot of the servers I have available, because they are based on the Intel 5000 chipset.

Looking at the pfSense forums, there have been several discussions recently on the Intel NICs, especially the X520 and X540, and from the debate it seems clear that running pfSense 2.1.x straight on the hardware doesn't work well; there is a problem with the MBUF usage of the driver for these NICs [2]. A proposed solution is to use VMware ESXi as a hypervisor, "mask" the NICs as VMXNET3 interfaces, and then install pfSense on top of that. A guide to doing this can be found in [3]. The writer of the guide reports transferring 7Gbit/s of traffic distributed over 132 VLANs and ~160 subnets without any problems [4]. From what I've been able to gather from the pfSense forum, the MBUF error won't be corrected in the 2.1.x branch, as it's based on FreeBSD 8.3, but pfSense 2.2 will address the problem, as it will be based on FreeBSD 10.0, where the driver issue has been corrected. Unfortunately pfSense 2.2 is only in Alpha at the time of writing and is therefore unlikely to reach Stable status before the end of this year.
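For completeness, the workaround usually cited in the forum threads for MBUF exhaustion on bare metal is to raise FreeBSD's mbuf cluster limit via a loader tunable. A minimal sketch for an ix(4)-based card – the values are illustrative and must be tuned to the machine's RAM:

```
# /boot/loader.conf.local -- illustrative tunables for ix(4) mbuf exhaustion.
# Raise the ceiling on mbuf clusters (the default is far lower on FreeBSD 8.x)
kern.ipc.nmbclusters="1000000"
# Optionally reduce the number of queues per port to lower mbuf demand
hw.ix.num_queues="2"
```

This only papers over the driver problem; the real fix is the updated driver in FreeBSD 10.0.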

References
 

NPF#14: Internet bandwidth load

So first let me give you some facts about the load on our internet line at NPF#13: we had an average load of 600Mbps and a peak of just over 1Gbps, which caused some latency spikes Saturday evening after the stage show. The number of participants was 1200 and we had a single pfSense routing the traffic, with a second ready as backup. With these facts in mind, and ticket sales of 2200 for NPF#14, an 83% increase from NPF#13, we would expect 2Gbps on the internet line to be sufficient. More precisely, my expectation was a load of approximately 1.1Gbps on average and a peak of about 1.8Gbps, but as we'll see in a bit this was way off the actual measured values.
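My estimate can be reproduced with a quick back-of-the-envelope calculation – a sketch assuming load scales linearly with the number of participants:

```shell
# Scale NPF#13's measured load (600Mbps average, ~1Gbps peak) by the
# growth in participants (2200 vs 1200, i.e. ~1.83x).
awk 'BEGIN {
  growth = 2200 / 1200
  printf "expected avg: %.2f Gbps, expected peak: %.2f Gbps\n", 0.6 * growth, 1.0 * growth
}'
```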

Luckily we had four 1Gbps lines. Well, due to time pressure and my underestimation of the number of crew (network technicians) needed, we only had a single line up and running at the time of opening. But my hope was that we could in time get the other lines up and running before the internet bandwidth became a problem. Unfortunately, due to other problems with the network, it wasn't until 9 o'clock that I had time to work on the second gigabit line, and by that time the first line was under pressure, especially after the stage show and at the start of the tournaments.

First WAN link load

This unfortunately forced the game admins of League of Legends and Counter-Strike: Global Offensive to postpone the tournaments until we had more bandwidth, as Riot Games (League of Legends) and Valve (Counter-Strike: Global Offensive) had released updates during that evening and people weren't able to download them fast enough. After spending the better part of 4 hours – with a lot of interruptions – I finally got the second gigabit line up and running. This is where the funny part comes: it took less than 30 seconds before the second line was fully utilized. People did however report that their speedtests on http://speedtest.net/ increased from 0.4Mbps to approximately 25Mbps.

Second WAN link load

This however didn't solve the bandwidth problem completely, so I started setting up the third and fourth pfSenses, with one major difference: we, BitNissen and I, found that it's possible to extract the configuration of one pfSense and load it into another. All that was left for us to do was to make some adjustments to the public IP values. This didn't succeed on the first attempt because we missed some things, which forced us to do some manual troubleshooting to correct the mistakes. We learned from those mistakes and the fourth pfSense was configured in less than 10 minutes. By the time we got to this point it was 4 o'clock in the morning and the load on the WAN links had dropped considerably, so I decided to get a couple of hours of sleep before connecting them to our network, ahead of the tournaments starting again (Saturday 10 o'clock). Around 9:30 the last two pfSenses were connected to our network, and from that point on we didn't have any problems with the internet bandwidth. As mentioned in another blog, NPF#14: Network for 2200 people, we ended up moving 23TB of traffic in total on the WAN interfaces of our core switches, and that in only 47 hours. Below you can see the average load on the four WAN links.

Load on the different WAN links

Finally I will just mention that on Saturday evening we measured a bandwidth load peak of 2.8Gbps, and on average we used just over 2.1-2.2Gbps during normal gaming hours, which was a lot more than originally expected. So I'm guessing that next year we'll need a 10Gbps or larger line in order not to run out of bandwidth, if we choose to expand further.

NPF#14: Network for 2200 people

So let me be the first to acknowledge that this year the network caused more problems than what is good, and much of it probably could have been caught with more preparation. The three main problems were the internet connection, DHCP snooping on the access switches, and the configuration of the SMC switches used as access switches. On that note, let me make it clear that the first two problems were solved during the first evening; the last problem wasn't solved completely, only to the best of our ability. The general network worked very well and we didn't have any problems with our Cisco hardware after Friday evening, apart from a single Layer 1 problem – where someone had disconnected the uplink cable to a switch.

The size of the network and the traffic make it equivalent to a medium sized company. We had 2200 participants with relatively high bandwidth requirements, not to mention several streamers. This puts some pressure on the distribution and core hardware. A typical construction to obtain the highest possible speed is to use what I would call a double star topology: the first star is from the core switches/routers to the distribution switches, and the second star is from the distribution switches to the access switches. This is what DreamHack uses, with some built-in redundancies. We have however chosen a different topology for our network, namely what I would call a loop-star topology. The top topology is a loop where the core and distribution switches/routers are connected in one or more loop(s), and the bottom topology is a star from the distribution switches to the access switches. The top topology does of course place some requirements on the core and distribution in terms of routing protocols, if you choose to use layer 3 between them as we did.

Double star and Loop-star topologies

The protocol we chose for our layer 3 routing was OSPF, as the original design included some HP ProCurve switches for distribution, and EIGRP is Cisco proprietary and thus not really an option. One of the remaining possibilities was RIP, but this protocol doesn't propagate fast enough in a loop topology and thus isn't really an option either. The reason for choosing the loop topology is quite simply redundancy in the distribution layer, and the idea of doing it this way is taken from the construction of the network made for SL2012 (Spejdernes Lejr 2012 – http://sl2012.dk/en), where I helped out as a network technician. The major differences from SL2012 to NPF#14 are the number of loops and the bandwidth between the distribution points – SL2012 had a single one gigabit loop and two places where the internet was connected to the loop, while we at NPF#14 had two loops, one with two gigabits for administrative purposes and one with four gigabits for the participants, and only a single place where four gigabits of internet was connected. Our core consisted of two Cisco switches in a stack, each of the participant distribution switches was a 48-port Cisco switch, and each of the administrative distribution switches was a 24-port Cisco switch. For the internet we had four 1 gigabit lines, each set up on a pfSense for NATing onto the different scopes of the internet lines, and we then used the Cisco layer 3 protocol for load-balancing across the four lines. Finally, everything from the distribution switches to the access switches is layer 2 based, and this is also how we control the number of people on the same subnet and on the same public IP, as we don't have a public IP per participant.
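To give an idea of what the layer 3 loop requires, here is a minimal sketch of the OSPF side of one distribution switch in Cisco IOS terms – the interface names and addresses are made up for illustration, not our actual configuration:

```
! Hypothetical IOS fragment: run OSPF on both loop-facing links so
! traffic can route around a broken segment of the loop.
router ospf 1
 network 10.10.0.0 0.0.255.255 area 0
!
interface Port-channel1
 description Loop link towards next distribution switch
 ip address 10.10.1.1 255.255.255.252
!
interface Port-channel2
 description Loop link towards previous distribution switch
 ip address 10.10.2.1 255.255.255.252
```

With both links in area 0, losing one segment simply makes OSPF reconverge over the other side of the loop.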

But enough about the construction of our network, let us take a look at the traffic amounts over the 47 hours the event lasted. First off, the four internet lines moved 23TB of data, of which 20.2TB was download and 2.8TB was upload. The amount of traffic moved through the core on the participant loop was 21.48TB, 12.9TB on one side and 8.58TB on the other side. The administrative loop on the other hand only moved 1.37TB of data through the core switch. With a little math we get that the average loads on the two sides of the participant loop are 625Mbps and 415Mbps respectively, while the administrative loop load average is 66.3Mbps. The average load on the internet from the core was 1.114Gbps, with a measured peak of 2.8Gbps. In the picture below the data is placed on the connections between the switches.
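The averages are straightforward to verify – a quick sketch assuming decimal terabytes and the full 47-hour window, which lands close to the quoted figures:

```shell
# Convert terabytes moved over the 47-hour event into average megabits per second.
hours=47
for tb in 12.9 8.58 1.37; do
  awk -v tb="$tb" -v h="$hours" \
    'BEGIN { printf "%.2f TB -> %.0f Mbps\n", tb, tb * 1e12 * 8 / (h * 3600) / 1e6 }'
done
```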

Traffic on different interfaces

We have learned a thing or two from this year's construction of the network: layer 3 routing between distribution points and a multi-loop topology can work for a LAN party of our size. Furthermore we might be able to reduce the bandwidth on the loops a bit, as the average loads are fairly low – but this doesn't say anything about the peak load performance we encountered Friday and Saturday evening. Therefore, for good measure, the bandwidth on the different types of loops should probably just stay the same. Depending on the growth next year it might be a reasonable idea to make two participant distribution loops instead of just one and maintain the bandwidth of 4Gbps on both loops. Also depending on growth, the internet bandwidth has to follow; we were fairly close to our limit this year.

Heartbeat service setup on Linux

In this blog I'm going to tell you how to set up a heartbeat service on your machine to keep track of whether it is up and running, or for how long it has been down. To see if your machine is running, or how long it has been since it made its last beat, you can go to http://services.bech.in/heartbeat/. In order to add your server to the list please contact me and I will provide you with an ID.

For this to work the server needs a Java Runtime Environment (JRE) of some sort, be it Oracle JRE or OpenJDK. You are going to need two files: script.sh and Beat.class. You can download them using the following commands; it is important that you place them in the correct folder, so the first change of directory needs to be right or it will not work!

cd ~/
wget http://services.bech.in/heartbeat/script.sh
wget http://services.bech.in/heartbeat/Beat.class

Next up we need to create the log-file where errors are written, if any happen. To create the log-file use the touch command:

touch heartbeat-log.txt

Now there are two things left for us to do. First we need to edit the script to use your machine's ID, so open the script.sh file with your favorite text editor and change the #Your ID# to the ID I have provided for you. If the ID is abc123, the line in the script.sh file should look like this:

java Beat abc123
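The post doesn't reproduce script.sh itself, but a minimal version might look like the sketch below – the real script's contents may differ:

```shell
#!/bin/sh
# Hypothetical sketch of script.sh: cron jobs run with a minimal
# environment, so change to the folder holding Beat.class first.
cd "$HOME" || exit 1
java Beat abc123
```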

The final thing we need to do is to add a cron job to the crontab, which is done by opening the crontab with the following command:

crontab -e

If it is the first time you use it, it will ask you to select a text editor. Now add the following line at the end of the file on a new line:

*/1 * * * * ~/script.sh | tee -a ~/heartbeat-log.txt

Save and close the file and you should get a response from cron like:

crontab: installing new crontab

Now go to http://services.bech.in/heartbeat/ to see if your machine is able to make a ping; otherwise consult the log-file. The page automatically updates every 10 seconds and the beat is made every minute with the above cron job.

Setting up network connection on NFIT cable network from Ubuntu CLI

First you need to register your MAC address to be able to use the cable network at NFIT. The simple solution is to send a mail to Michael Glad with your MAC address and NFIT username, if you already know the MAC address. However, if you want to help yourself and lighten the workload on Michael Glad, you simply set up a DHCP client on your server and register using a web browser. This post covers the case where you have installed Ubuntu Server, which has no default web browser installed, leaving you limited to text versions like Lynx. It should be noted that as long as you haven't registered your device you won't be able to reach anything outside of the LAN; this means no apt-get or the like! So you can't even install Lynx that way around. First of all you need to set up the DHCP lease and be able to ping your default gateway or Google DNS (that service is open).
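If you want to go the mail route but don't know the MAC address yet, you can read it straight from sysfs even without a working network – assuming the port is eth0:

```shell
# Print the hardware (MAC) address of the port; adjust eth0 to your port name.
cat /sys/class/net/eth0/address
```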

If there is only a single NIC this is fairly simple – the reference you need for the first part is simply "eth0". However, with multiple NICs, and maybe even multiple ports per NIC, this becomes a little more tricky. The way to solve this is to use some of the basic tools available, in this case the ip command. Depending on how many NICs and ports your machine has, it might be a good idea to pipe the output into a separate file, as it might not all fit on one screen. So first you need to connect the network cable to the machine and then execute the following command in your terminal:

ip addr > network.txt

This will take the standard output from the command and write it to a file instead of displaying it on the screen. Next up we need to find the line that says that the port is physically UP; it looks something like:

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
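Instead of scanning the saved file by eye, you can filter for the active port directly; the interface name is the second colon-separated field of the output:

```shell
# Show only ports that are physically up and print their names.
grep "state UP" network.txt | awk -F': ' '{ print $2 }'
```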

Now that we know which reference we need, e.g. "eth0", we can edit the interface settings of the system. This is done in the /etc/network/interfaces file. Use any text editor to edit the file, e.g.:

vi /etc/network/interfaces

Add the following code to the file if the reference is "eth0"; otherwise change it to the reference you found:

auto eth0
iface eth0 inet dhcp

Finally, to get a DHCP lease you need to restart the networking service of the server, which can be done with the following command:

sudo /etc/init.d/networking restart

When the restart is complete you should be able to ping the default gateway and/or Google DNS (8.8.8.8 or 8.8.4.4). Next you need to install some text based web browser like Lynx. This requires you to have an internet connection on some other machine and a USB stick. You need to download a .deb file of Lynx, which can be found at http://pkgs.org/ubuntu-12.04/ubuntu-main-amd64/lynx-cur_2.8.8dev.9-2_amd64.deb/download/

Now download the file and transfer it to your USB stick (it has to be FAT or alike for your Ubuntu Server to be able to open it). To access the USB stick on the server you first need to mount it, which requires a little work:

sudo mkdir /media/external/
sudo mount -t vfat /dev/sdb1 /media/external -o uid=1000,gid=1000,utf8,dmask=027,fmask=137

The code above assumes that the USB stick is sdb1 in /dev and that it uses a FAT filesystem; if this is not the case please consult the Ubuntu Help pages. To unmount the USB stick again use the following command:

sudo umount /media/external

Now copy the .deb file onto the server, e.g. into your home folder, and install it:

cp /media/external/lynx.deb /home/rewt/lynx.deb
cd /home/rewt/
sudo dpkg -i lynx.deb

Now that you have installed Lynx, all you need to do is start it with some random target like Google and follow the instructions on the screen from NFIT.

lynx google.dk

After the recommended reboot of the system you should now be able to go to any website with Lynx, use apt-get and other services.