My Home Network

June 26, 2016

Recently I decided it was time to rebuild my home network. It's something I've had in the back of my mind for years now, but the actual trigger was my previous router/AP becoming unstable.

One of the big reasons for doing this was paranoia. I am somewhat terrified of the Internet of Things. I can tell myself "I'll just not buy those devices", but the reality is I have family that visits, neighbors, and a wife who thinks I'm a nut job. Strange things will make it onto my network.

Requirements

Diagram and High Level Design

At a high level this is what my network looks like. I left off most of the actual devices as they cluttered the diagram.

+-------+
|       |         {{{{{{{{}}}}}}}}
| Modem |---------{{{ INTERNET }}}
|       |         {{{{{{{{}}}}}}}}
+-------+
   |
   |  vlan4
   | untagged
   |
+-------+          +---------+
|  7   1|  vlan1   |Unmanaged|----[[ file server ]]
|      2|----------|  Switch |
|       | untagged |         |----[[ workstation ]]
|       |          +---------+
|Managed|
| Switch|
|       |          +-------+
|       |  vlan2   |       |
|      3|----------| AP-1  |
|      4| untagged |       |
|       |          +-------+
|       |
|       |
|       |          +-------+
|       |  vlan3   |       |
|      5|----------| AP-2  |
|  8   6| untagged |       |
+-------+          +-------+
   |
   | all vlans
   |  tagged
   |
+-------+
|       |
| Router|
|       |
+-------+

I ended up with 4 VLANs and their use is as follows:

Some interesting bits of information

The Hardware

Switches

It was obvious I needed a managed switch. I was pleasantly surprised to find that 8-port managed switches were very affordable.

I used a Netgear GS108E, which can be had for under $50. It met the requirements: simple, gigabit, 8-port, and VLANs. If I need more ports I attach unmanaged switches to it. This makes it the "core" of my network. Everything branches off of it, which is why it's in the middle of the diagram above.

Access Points

I probably spent 10 times as long shopping for an AP as I did for any other piece of hardware. There are various reasons for that: I didn't know shit about APs, there is a ton of competition, and I didn't know what the most affordable setup would be. I could have bought a couple of "wireless routers" and put them in "wireless only" mode, but that would not have been cost effective.

What I found is there are many things known as "range extenders". The vast majority of these will also function as dumb access points. Finding this out led me to add a new requirement: the wall-wart AP. The only wire I wanted to have was the backhaul.

I eventually found the Netgear EX3700. They run about $50 as well. They're not perfect and there are definitely better models in the product family. But they serve my purpose well. I knew I would be buying two APs, so I knew the wireless load would be split.

As a side note, the double AP model is great. Streaming Netflix on one AP and doing heavy internet traffic on my laptop on the other AP has been awesome. I don't notice anything hogging the WiFi anymore.

Router

The router - the most interesting part of most networks. This is what drove the design of the network. I pretty much knew what I wanted right away: an SoC that can run vanilla Linux and/or BSD. I spent a ton of time looking at dual NIC SoCs. I didn't find any that were reasonably priced, still in production, and met my fancy.

Eventually it dawned on me. I did not need dual NICs for a router. I can just use VLANs, and traffic can go right back out the same interface it came in on. This immediately led me to some of the hobbyist SoCs: Raspberry Pi, Banana Pi, Odroid, etc.
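
For the curious, here is a minimal sketch of what the single-NIC, all-VLANs-tagged setup could look like on NetBSD. It is a dry run that only prints the commands. The parent NIC name (awge0) and the router addresses (10.169.N.1) are assumptions for illustration; only the VLAN IDs and the 10.169.N.0/24 subnets come from my setup.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
PARENT_IF="awge0"   # assumption: name of the router's single physical NIC

# Print the ifconfig commands that create one tagged VLAN interface.
# $1 = VLAN ID, $2 = optional static IPv4 address for the router on that VLAN
vlan_cmds() {
    echo "ifconfig vlan$1 create"
    echo "ifconfig vlan$1 vlan $1 vlanif $PARENT_IF"
    if [ -n "${2:-}" ]; then
        echo "ifconfig vlan$1 inet $2 netmask 255.255.255.0"
    fi
    echo "ifconfig vlan$1 up"
}

# Print everything needed to turn the box into a router on a stick.
router_cmds() {
    vlan_cmds 1 10.169.1.1   # wired (router address is an assumption)
    vlan_cmds 2 10.169.2.1   # trusted wireless (assumed address)
    vlan_cmds 3 10.169.3.1   # untrusted wireless (assumed address)
    vlan_cmds 4              # external; the address comes from DHCP
    # Without forwarding the kernel will not route between the VLANs.
    echo "sysctl -w net.inet.ip.forwarding=1"
}

router_cmds
```

Review the printed commands and run them as root, or translate them into the usual ifconfig.vlanN files so they persist across reboots.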

I chose the Odroid-C1+. It's quite a powerful device for $35 that has good support from both Linux and NetBSD. It turns out that the specs are way, way more than I needed. Even when I'm maxing out my internet connection (50 Mbps) it's mostly idle and only using a fraction of the RAM.

The main reason I chose the Odroid-C1+ over the RPi2 is that the RPi2 has notoriously bad network performance.

One big downside to the Odroid-C1+ is that it's not easy to obtain in the USA. There is an official US distributor, Ameridroid, but they want you to pay via PayPal. I really dislike PayPal and opted to spend a little extra money on one of the bundles that Ameridroid lists on Amazon.

BTW, there is a great Wikipedia page with a list of SoCs. I spent a ton of time looking at this list.

Router OS

I'm a big fan of NetBSD. Part of the reason I chose the Odroid-C1+ is because it's supported by NetBSD. It also gave me an opportunity to try out NPF. I ended up not needing to install any extra software. Just the base of NetBSD was enough.

Packet Filtering Rules

I gave each VLAN its own group with its own set of rules. For the most part the "trusted" VLANs run wide open. I do quite a bit of filtering on the untrusted ones though.

Below are snippets from my /etc/npf.conf. There are also some blurbs explaining them.

global

$int1_if = "vlan1"
$int2_if = "vlan2"
$int3_if = "vlan3"
$ext_if  = "vlan4"

$myserver = 10.169.1.11
$myserver_ssh_port = 54321

These are just variables for convenience. They're not necessary, but useful for semantic meaning. For example, "ext_if" is the external-facing interface.

# NAT: internal outgoing to external (source NAT)
#
map $ext_if dynamic 10.169.1.0/24 -> inet4($ext_if)
map $ext_if dynamic 10.169.2.0/24 -> inet4($ext_if)
map $ext_if dynamic 10.169.3.0/24 -> inet4($ext_if)

Basic IPv4 NAT. One line for each interface/subnet.

# NAT: external incoming to internal (destination NAT)
#
map $ext_if dynamic $myserver port 22 <- inet4($ext_if) port $myserver_ssh_port

NAT from external to internal IP. This opens a hole for SSH from the internet to myserver.

# Also map some things on internal VLANs for convenience
# Allows us to use the external IP/DNS from inside and still get to the right place
#
map $int1_if dynamic $myserver port 22 <- inet4($ext_if) port $myserver_ssh_port
map $int2_if dynamic $myserver port 22 <- inet4($ext_if) port $myserver_ssh_port

The comment explains this fairly well. The NAT logic appears to happen on ingress only, so I had to add lines for the internal VLANs.

vlan1

group "wired" on $int1_if {
        pass all
}

Nothing to see here. Wired is trusted. Pass all traffic in and out with no filtering.

vlan2

group "wireless" on $int2_if {
        # Lock down wireless by blocking everything except
        # for stuff we explicitly allow below
        #
        block in all
        pass out all

        # DNS
        #
        pass in to any port 53

        # ICMP/ping
        #
        pass in proto icmp to any

        # Allow some well known things
        #
        pass in proto tcp to any port http
        pass in proto tcp to any port https
        pass in proto tcp to any port ssh
        pass in proto tcp to any port ftp

        # IMAP/SMTP - secure ports only
        #
        pass in proto tcp to any port 993
        pass in proto tcp to any port 587
        pass in proto tcp to any port 465

        # SSH to our external IP
        #
        pass in proto tcp to inet4($ext_if) port $myserver_ssh_port

        # 8080 is often used for testing a webserver
        # since it's in the non-privileged range
        #
        pass in proto tcp to any port 8080

        # Allow access to fileserver
        #   samba, netbios
        #
        pass in proto tcp to $myserver port 445
        pass in proto tcp to $myserver port 139

        # UDP ports for vpnc
        pass in proto udp to any port 500
        pass in proto udp to any port 4500
        pass in proto udp to any port 10000
        pass in proto tcp to any port 10000

        # FIXME: Pass everything until I figure out how to deal with passive mode FTP.
        #
        pass in all
}

As you can see, I initially tried to filter most things for my "trusted" wireless. However, passive FTP is completely broken by this. That's because the client initially opens a control connection to the server on port 21, but then opens a second connection to a mostly arbitrary port on the server to handle the actual file transfer. So I'm currently passing everything on the wireless as well - at least until I figure out what to do about passive FTP.
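
To make the problem concrete: in passive mode the server answers the client's PASV command with a reply that encodes the data port as two bytes, and the client then connects to that port. Here is a small sketch that decodes such a reply. The function name pasv_port and the sample reply (which uses a documentation address) are made up for illustration.

```shell
#!/bin/sh
# Decode the data port from an FTP PASV reply of the form
#   227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)
# The port the client must connect to is p1 * 256 + p2.
pasv_port() {
    fields=${1#*\(}        # drop everything up to and including "("
    fields=${fields%%\)*}  # drop ")" and anything after it
    old_ifs=$IFS
    IFS=,
    set -- $fields         # split h1,h2,h3,h4,p1,p2 into $1..$6
    IFS=$old_ifs
    echo $(( $5 * 256 + $6 ))
}

pasv_port "227 Entering Passive Mode (192,0,2,10,195,80)"   # prints 50000
```

Since the server picks that port per transfer, a static "pass in ... port N" rule cannot anticipate it. The firewall would have to either follow the control connection or allow a wide range of high ports.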

vlan3

group "untrusted" on $int3_if {
        # Lock down by blocking everything except
        # for stuff we explicitly allow below
        #
        block in all
        pass out all

        # DNS
        #
        pass in to any port 53

        # ICMP/ping
        #
        pass in proto icmp to any

        # Allow some well known things
        #
        pass in proto tcp to any port http
        pass in proto tcp to any port https
        pass in proto tcp to any port ftp

        # IMAP/SMTP - secure ports only
        #
        pass in proto tcp to any port 993
        pass in proto tcp to any port 587
        pass in proto tcp to any port 465
}

This represents the most locked down part of the network. It only allows a few things through. This is the network that is used by guests, TVs, tablets, etc. Right now only wireless things are on this network, but there is no reason wired devices can't be part of it as well.

vlan4

group "external" on $ext_if {
        pass stateful out final all

        # Block everything from the internet,
        # with exceptions for things we expose
        #
        block in all
        pass stateful in proto tcp to inet4($ext_if) port $myserver_ssh_port
}

The first line here says we pass all traffic outgoing to the internet. The "stateful" keyword is very important. It tells NPF to track the TCP/UDP connections through the firewall so that return traffic is automagically passed through on the way back.

The second part blocks all incoming traffic from the internet, with an exception for SSH. Note we don't have to do anything to allow active TCP/UDP sessions. That was handled by the "stateful" keyword on the outbound line.

Results

It has been almost three months running with this setup. So far it has been pretty stable and has treated me well. The router especially has been surprisingly hands-off. I've not touched it much since I first configured it to my liking.

The other day I had a brief power outage. To my delight everything came back up and I was browsing the internet a minute or two after power returned.

Pain Points

There are however a couple remaining pain points. I touched on some of these above.

  1. Passive FTP

    I'd like to lock down all the wireless, but I fully opened up the "trusted" wireless due to passive FTP. It took a few weeks for me to notice FTP was an issue. I finally noticed when I was trying to install something from pkgsrc. I may just give up on passive FTP and let it be blocked. Most things, pkgsrc included, will fall back to an HTTP mirror.

    Update, August 6: I simply gave up on passive FTP. My trusted wireless is now fairly locked down.

  2. IPv6

    This is more of a TODO item. Allegedly my ISP supports IPv6, but the router is not getting an IPv6 address. I've not debugged it. This may be a knob I need to twiddle on NetBSD or my router may not be getting Router Advertisements from my ISP.

  3. Uplink Bounces

    Ugh. This one is annoying. Every 12 hours, to the minute, my internet connection will bounce. Looking at dmesg on the router shows ARP entries for my gateway being overwritten to one of two MAC addresses. I looked up the MACs and they belong to equipment from Casa Systems. They make cable network products. I assume this is an active/standby setup for my cable loop and for some reason it keeps bouncing between them. They could be failing/crashing, it could have something to do with DOCSIS, or it could be the NSA. Your guess is as good as mine. This is on my list to investigate soon.

    Update, July 17: I've found that the "reset" interval is exactly 8 hours and lasts for almost exactly 1 minute. I've also found that the trigger is an ARP REQUEST for my subnet (but not my IP address). My best guess is that my ISP has a CMTS that load balances customers on the loop to one of two hardware devices or line cards (each has its own MAC). Every 8 hours the ARP cache is refreshed. This amounts to both hardware devices sending an ARP REQUEST for each customer. However, the ARPs use the MAC of whichever device the customer is being load balanced to. So my router will see an ARP from the other hardware device and update its ARP cache for my gateway IP address. This temporarily redirects my traffic to the wrong device (which probably drops my traffic) until my router sees an ARP from the correct MAC, which corrects the situation. Why would this be happening? I have no idea. It seems like a bad idea. Maybe my ISP just has a broken configuration. My current workaround is to create a static ARP entry for my gateway. This causes the temporary ARP overwrite to be rejected, but it's not a real solution.

    Update, August 6: I'm an idiot. This turned out to be my own damn fault. I have a dhcpcd hook (see #4 on this list) that restarts NPF every time my uplink IP changes. This is to fix the DNAT rules. Unfortunately I wasn't correctly ignoring a DHCP RENEW event. So every 12 hours when my router tried to renew its uplink address NPF would be restarted. The problem had nothing to do with the ARP noise I mentioned in the previous update - which by the way is still concerning.

  4. Uplink IP changes

    When my uplink/external IP address changes I have to restart NPF on the router. This is because the NAT rules need to know what IP address to use for the mapping. It's not a big deal. It just took some magic and scripts for dhcpcd.
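
To illustrate pain points 3 and 4, here is a minimal sketch of a dhcpcd hook that restarts NPF only when the uplink address actually changes, so a plain RENEW of the same lease is ignored. This is not my exact script. The helper name uplink_ip_changed and the hook location (/etc/dhcpcd.exit-hook) are assumptions; the reason, interface, old_ip_address and new_ip_address variables are the ones dhcpcd sets for its hooks.

```shell
#!/bin/sh
# Sketch of a dhcpcd hook, e.g. /etc/dhcpcd.exit-hook (assumed location).
# dhcpcd runs it with $reason, $interface, $old_ip_address and
# $new_ip_address set in the environment.

EXT_IF="vlan4"   # external interface, matching $ext_if in npf.conf

# Succeeds (returns 0) only when the lease on EXT_IF carries a new address.
# $1 = interface, $2 = old address, $3 = new address
uplink_ip_changed() {
    [ "$1" = "$EXT_IF" ] || return 1   # ignore other interfaces
    [ -n "$3" ] || return 1            # no address, e.g. lease expired
    [ "$2" != "$3" ]                   # a RENEW of the same address fails here
}

# Only act when dhcpcd actually invoked us with a reason.
if [ -n "${reason:-}" ] &&
   uplink_ip_changed "${interface:-}" "${old_ip_address:-}" "${new_ip_address:-}"
then
    /etc/rc.d/npf restart
fi
```

Comparing the old and new addresses, rather than matching on the event name, means a routine renewal of the same lease never touches the firewall.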

Concerns

My biggest concern is hardware failure of the router. That would put my network out of commission for a while. It would be even worse if it happened years from now, as the Odroid may not be available anymore. The reality is that the current situation is no worse than it was before, when I was using a Linksys router. I can always buy another SoC. I only have a few config files that I need to preserve from my router. So it's just a matter of installing NetBSD on a new SoC and copying them over from a backup.

Future Tasks

There are some things that I could do now that I have a highly configurable network.

Things Not Included in This Post

For the sake of brevity I deliberately left some information out of this post. However I'll list them here.

Next Post