Archive for February, 2009

Link Distance Flexibility

Tuesday, February 24th, 2009

When deploying an outdoor wireless network, a choice is usually made between building a short-range mesh or a long-range PtMP/PtP system. A short-range mesh is normally used for downtowns, “hot zones” and campuses, and provides all the benefits normally attributed to meshing, such as fault tolerance due to re-routing and fast, easy installation with little need for link engineering due to the large number of peers available. But the problems include:

  • shorter links
  • the need for more wired or wireless backhaul
  • unpredictable service due to the large interference domain

To provide longer range communications, a PtMP or PtP system can be used. This is normally used for applications such as fixed wireless access to homes or businesses, and smart grid backhaul, especially in medium density to rural areas. But problems with this model include:

  • a lack of redundancy due to each client normally seeing only a single base station
  • many base stations are required due to single hop
  • the need to engineer each link
  • incomplete coverage (each client must have a direct path to a base station, so some installations may be completely obstructed, and the network must be built very densely to minimize this)

To address some of these issues with each type of system, an architecture started to emerge a few years ago which combined PtMP backhaul along with omni-directional mesh. However, not only does this require two different solutions and sets of equipment, but many of the issues are inherited from each type of system. For instance, while a subscriber connecting to the short-range mesh may benefit from the many mesh nodes available to choose from, there is still a need to engineer the backhaul links and there are still issues around interference with the short-range mesh. The PtMP system would still need to be built very densely to provide sufficient coverage and in order to compensate for the single hop links, and there still may be coverage holes due to obstructions. Also, the PtMP system lacks redundancy. And although it may be possible to use multiple meshes to heal around backhaul outages, this requires complex dynamic routing to be run between the backhaul network and the short-range mesh, and requires multiple adjacent short-range meshes which may not be present in situations where the meshes are islands within a larger sparse network.

One of the reasons that we chose to implement dynamic antenna pointing was to address both of these network architecture issues by providing a single system that can do both long-range backhaul and short-range meshing. In fact, while our first internal testbed ran over 7 hops that ranged from 10 yards to 300 yards, our first customer deployment connected mountain tops across 20 mile links. Over the shorter links the dynamically switched antennas have the isolation needed to avoid interference and to provide spectral reuse, while over longer links the antennas provide the gain needed to close the links at a decent modulation. These very different deployments use the same hardware, same protocol and exactly the same configuration – the only difference is the deployment locations.

Below are snapshots of two live deployments, one mostly PtMP (with one SkyExtender relay) and one dense mesh. The PtMP system has a mixture of links, from short to several miles, while the dense mesh has links of mostly under 100 yards. These systems are running equivalent hardware (although DualBands are used in the dense mesh), the same software, with the same basic configuration.

ptmp

sc

In some rural areas there is even a hybrid model that some customers use with the SkyPilot equipment where pockets of dense subscribers, connected to each other using short links, are interconnected using long distance links. For example, there are areas of rural Germany where a single SkyGateway connects over long links to SkyExtenders in different villages, which then mesh over shorter links with other SkyExtenders and SkyConnectors within the villages.

Ethernet Vs IP At The Edge

Sunday, February 22nd, 2009

In every realm of networking, from backbone transport, to enterprise LAN, to access networks, to even data centers, there are debates about the use of layer 2 (Ethernet) versus layer 3 (IP) transport. The proponents of layer 2 argue that it’s inexpensive, efficient, and supports non-IP protocols while the proponents of layer 3 argue that it’s more secure and scalable than layer 2. There are obviously different answers for different networks, but having personally developed both IP and Ethernet systems for military wireless mesh, fixed wireless access and WiFi clouds, I believe that in the case of last-mile wireless access the benefits of layer 2 (Ethernet) far outweigh the problems that need to be addressed. In order to compare the pros and cons of each transport technology, let’s look at the issues with each since the benefits of one technology are often the converse of the issues with the other.

Issues with layer 3 (IP):

  • Only IP is transported: no AppleTalk, PPPoE, broadcast device discovery, or legacy Ethernet devices such as serial/Ethernet telemetry…
  • No virtual LAN services: offering layer 2 pipes for virtual LAN services has become an extremely important offering for many service providers. IP networks do not inherently support this service, and additional equipment or protocols need to be layered on top in order to support it.
  • Many layer 3 systems only support IPv4: IPv6 is very different from IPv4, so support needs to be explicitly added, and IPv6 is critical for large networks such as smart grids.
  • IP demarcation issues: the interface from the wireless access equipment must support whatever dynamic routing protocol the operator has chosen (RIP, OSPF, BGP-4, …). And, the operator may need to run a dynamic IP routing protocol in order to support client mobility (while an Ethernet system would allow learning switches to interconnect gateways for fast, transparent roaming.)
  • IP multicast support (for some types of video streaming) needs to be explicitly supported. IP Multicast forwarding is very different than regular IP forwarding, and involves different protocols.
  • Slower re-routing: compared to Ethernet table learning, IP dynamic routing is slow.

Issues with layer 2 (Ethernet):

  • Scalability limitations due to a large broadcast domain and Ethernet learning table size restrictions of external switches.
  • Inter-subscriber security concerns due to layer 2 attacks directly between subscribers (ARP poisoning, rogue DHCP servers, …).
  • Subscriber-to-network security concerns from Ethernet MAC address spoofing and ARP poisoning.

Since layer 3 is a higher layer protocol than layer 2, it seems to become a question of limitations versus problems. Is it better to live with the limitations of an IP transport or with the problems of an Ethernet transport?

To deal with the issue of MAC address scalability, fortunately switch learning tables have greatly increased in size. And even if an Ethernet learning table overflows, the standard behavior is to replace the oldest entry, which is often from an inactive device. And data is still forwarded in any case, so the total number of devices supported on a network is much larger than the size of the switches’ Ethernet learning tables.
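The overflow behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea (evict the oldest entry, flood on a miss), not the implementation of any particular switch; the class name `LearningTable`, its methods, and the MAC strings are all hypothetical.

```python
from collections import OrderedDict


class LearningTable:
    """Toy Ethernet learning table: evicts the oldest entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # MAC -> port, oldest entry first

    def learn(self, mac, port):
        """Record which port a source MAC address was seen on."""
        if mac in self.entries:
            self.entries.move_to_end(mac)  # refresh the age of a known MAC
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # table full: drop the oldest
        self.entries[mac] = port

    def lookup(self, mac):
        """Return the learned port, or None meaning 'flood the frame'."""
        return self.entries.get(mac)


table = LearningTable(capacity=2)
table.learn("aa:aa", 1)
table.learn("bb:bb", 2)
table.learn("cc:cc", 3)  # overflows: the entry for aa:aa is replaced

for mac in ("aa:aa", "bb:bb", "cc:cc"):
    port = table.lookup(mac)
    print(mac, "flood" if port is None else f"port {port}")
```

The point of the sketch is the last loop: a frame for the evicted address is still delivered (by flooding), so the table size limits efficiency rather than the number of devices.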

And to deal with both the large broadcast domain issue and lack of security between subscribers due to potential layer 2 attacks, many switches have a feature called “protected ports” (which SkyPilot has implemented as “Peer to Peer Control”). This feature can selectively block layer 2 forwarding between ports and VLANs of an Ethernet switch, or between subscribers within a virtual LAN within the SkyPilot system, in cases where the users of those ports or VLANs are not from the same administrative domain (for example, not employees of the same company). And since this control can be done on a VLAN basis, an operator can use this control to provide some groups of subscribers direct layer 2 access while limiting the layer 2 access of other users, such as home Internet subscribers, to only the router that leads to the Internet.
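As a rough sketch of the forwarding rule behind protected ports: frames between two protected ports are dropped, while frames to or from an unprotected port (such as the uplink toward the router) pass. The port names and the `may_forward` helper below are hypothetical, and the per-VLAN scoping described above is left out to keep the example short.

```python
def may_forward(src_port, dst_port, protected_ports):
    """Allow a frame unless both its source and destination are protected."""
    return not (src_port in protected_ports and dst_port in protected_ports)


# Hypothetical layout: two subscriber ports and one uplink to the router.
protected = {"subscriber-1", "subscriber-2"}

print(may_forward("subscriber-1", "subscriber-2", protected))  # blocked
print(may_forward("subscriber-1", "uplink", protected))        # allowed
print(may_forward("uplink", "subscriber-2", protected))        # allowed
```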

And even if users of different protected ports or VLANs need to communicate at layer 3 (for some cases of VoIP, gaming, file sharing, …) then several simple methods are available to allow that communication at layer 3 or above, such as the “local proxy ARP” feature of most routers or /30 IP subnetting at the subscriber level.
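To make the /30 option concrete, here is a small sketch using Python's standard `ipaddress` module. The 10.20.0.0/28 pool is an arbitrary private range chosen for illustration, and assigning the first usable address to the router is only one possible convention. Each /30 holds exactly two usable addresses, so every subscriber ends up in its own subnet and must go through the router to reach anyone else.

```python
import ipaddress

# Hypothetical address pool to carve into per-subscriber /30 subnets.
pool = ipaddress.ip_network("10.20.0.0/28")

assignments = []
for subnet in pool.subnets(new_prefix=30):
    # A /30 has two usable hosts: one for the router, one for the subscriber.
    router_ip, subscriber_ip = subnet.hosts()
    assignments.append((str(subnet), str(router_ip), str(subscriber_ip)))

for subnet, router_ip, subscriber_ip in assignments:
    print(f"{subnet}: router {router_ip}, subscriber {subscriber_ip}")
```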

So with the control of protected ports and VLANs in both the wireless system and any external network switches, the potential of attacks between subscribers (such as ARP poisoning and rogue DHCP servers) can be completely avoided, and the only attacks left are attacks directly from subscribers to the network (such as to the first hop router). These layer 2 attacks fall into two specific cases: MAC spoofing and ARP poisoning. In both of these attacks one user intentionally mimics the Ethernet MAC address of another user, which causes a temporary Denial of Service (DoS). These cannot effectively be used as data intercept attacks, so data is not compromised. And the denials of service are extremely short, especially in the case of MAC address spoofing where the attack only lasts until the real user sends a single Ethernet frame. And since the attacker is easily identified and their access can simply be disabled, these attacks are not actually very common, and are ineffective.

And an alternative or supplementary tool that an operator can use to address many of these issues is filtering. SkyPilot devices support filters that range from the Ethernet MAC layer up to the IP port level. For example, instead of (or in addition to) disabling peer to peer communication using protected ports, an operator can simply configure UDP port filters to prevent rogue DHCP servers.
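The rogue DHCP server case works as a filter because DHCP servers reply from UDP source port 67 (clients use port 68). A filter that drops UDP traffic with source port 67 arriving from the subscriber side therefore silences any DHCP server a subscriber plugs in. The sketch below only shows the matching logic; it is not SkyPilot's filter syntax, and the function and parameter names are hypothetical.

```python
DHCP_SERVER_PORT = 67  # UDP source port used by DHCP servers


def drop_subscriber_frame(ip_protocol, udp_src_port):
    """Return True if a frame arriving from a subscriber should be dropped.

    Only DHCP server replies are matched; everything else is forwarded.
    """
    return ip_protocol == "udp" and udp_src_port == DHCP_SERVER_PORT


print(drop_subscriber_frame("udp", 67))  # rogue DHCP server reply
print(drop_subscriber_frame("udp", 68))  # ordinary DHCP client request
print(drop_subscriber_frame("tcp", 67))  # not UDP, so not matched
```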

So by having addressed these Ethernet scalability and security concerns, the edge network can take advantage of the benefits of an Ethernet transport, including:

  • Simple IP address management: IP addresses can be handed out in a number of ways (DHCP, static, PPPoE, …) and they can be assigned independently of the point of attachment.
  • Support for any Ethernet device or protocol, such as IPv4, IPv6, IP multicast, NetBIOS, AppleTalk, and legacy Ethernet devices.
  • Seamless, fast intra-network mobility.
  • Virtual LAN services (private LANs can be configured across the network).
  • Simple layer 2 demarcation at the base-station (no IP routing protocol requirements).

An important aspect of Ethernet is that using it as a transport method does not mean a lack of IP services. IPv4, IPv6 and virtually every layer 3 protocol has an Ethernet convergence function, so if a device talks Ethernet then it can run over an Ethernet transport system without any special support from the network devices. And, even if a device such as a wireless mesh node provides Ethernet transport, it can also include an IP stack for its own communication, such as remote management. And IP-aware filters can be added to devices that are providing only an Ethernet transport service. So, Ethernet transport does not mean “no IP”.

Point to MultiPoint Vs Dynamic Antenna Switching

Friday, February 20th, 2009

Point-to-MultiPoint (PtMP) systems typically require multiple frequencies in order to avoid self-interference (interference among base-stations within the same network, or among sectors of a single base-station). The degree that multiple frequencies are re-used within the network is called “frequency re-use”, and is quantified by a frequency re-use factor. The frequency re-use factor will vary based on the number of available frequencies, the deployed technology and the network architecture.  The network architecture generally falls into two categories: omni-directional systems and sectorized systems.

For omni-directional PtMP systems, the frequency re-use factor is basically how often a frequency gets reused within the overall network, and re-use factors of 3, 4, 7, 9 and 12 are common. A frequency re-use factor of 4 means that 4 different frequencies are used, with base-stations that have adjacent coverage each operating on a different single frequency, and each frequency is reused on each 4th base-station.

freq-reuse

A problem with this type of network is that obviously many frequencies are required, which may not be possible in many limited frequency bands such as 3.65 GHz or 4.9 GHz. And since only a fraction of the total available bandwidth is used at each base-station, the capacity of each base-station is reduced. Additionally, since each base-station is only providing a single frequency, there is no frequency or base-station redundancy at the subscriber level, so interference or a failure of a base-station will cause a complete outage for any affected subscribers.
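A quick way to see the capacity cost is to divide the available spectrum by the re-use factor. The 50 MHz total below is an assumed figure chosen only for illustration, and the calculation ignores guard bands and channelization details; the re-use factors are the common values listed earlier.

```python
def per_base_station_mhz(total_mhz, reuse_factor):
    """Spectrum each base-station gets when the band is split evenly."""
    return total_mhz / reuse_factor


TOTAL_MHZ = 50.0  # assumed total spectrum, for illustration only

for reuse_factor in (3, 4, 7, 9, 12):
    share = per_base_station_mhz(TOTAL_MHZ, reuse_factor)
    print(f"re-use factor {reuse_factor:2d}: {share:5.2f} MHz per base-station")
```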

For sectorized base-stations, a frequency re-use factor of 3 is commonly used, and adjacent sectors do not use the same frequency. For instance, if a base-station has 3 sectors, each sector would be 120 degrees wide for 360 degree coverage and each sector would use a different frequency from a total of 3 frequencies. Problems with this architecture include:

  • Lack of frequency diversity at the subscriber: a subscriber physically resides in one primary sector (and frequency), so if that frequency is being interfered with by a different base-station (in a licensed band) or a different network (in an unlicensed or “lightly” licensed band) then the subscriber could lose service. And if directional antennas are used at the subscriber, which is almost always the case in order to increase the link gain, then redundancy is not even available from other base-station locations.
  • Lower antenna gain: the frequency re-use factor dictates the sector beam-width (antenna gain is directly related to beam-width), and, in order to get 360 degree coverage, wide antenna beam-widths are needed. And even if a single frequency were used multiple times on a single base-station (which usually requires some sort of coordination), such as in a F1,F2,F3,F1,F2,F3 pattern, each sector would still be 60 degrees wide. In a dynamically switched antenna system, like SkyPilot’s, this constraint does not exist, and the antenna beam-width can be much smaller which results in higher antenna gain.
  • Multiple frequencies are needed: just like in the omni-directional case, in some bands there are a limited number of available frequencies (or frequencies are expensive in licensed bands).  And in unlicensed bands there may not be multiple clean channels. And if a single channel is sub-divided, which many systems do not even support, each sector would only have a fraction of the total bandwidth.

With SkyPilot’s dynamic antenna switching, 8 high-gain 45 degree sectors are shared using a single radio, so a single frequency can be provided with 360 degree coverage while still providing the benefits of a high-gain antenna. Even though SkyPilot provides the resiliency of a mesh networking architecture, this spectral reuse flexibility has allowed many service providers to deploy large PtMP deployments in which each base-station provides synchronous connectivity to  low-cost subscriber equipment.
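For a rough feel of what a narrower sector buys, the sketch below uses an idealized rule of thumb: with the vertical beam-width held constant and losses ignored, gain scales inversely with horizontal beam-width. Real antennas will deviate from this, so the numbers are indicative only, and the helper name is hypothetical.

```python
import math


def relative_gain_db(wide_deg, narrow_deg):
    """Idealized gain advantage of a narrower horizontal beam-width."""
    return 10 * math.log10(wide_deg / narrow_deg)


SWITCHED_SECTOR_DEG = 360 / 8  # eight sectors covering 360 degrees

for fixed_sector_deg in (120, 60):
    gain = relative_gain_db(fixed_sector_deg, SWITCHED_SECTOR_DEG)
    print(f"{fixed_sector_deg} degree sector vs 45 degree sector: +{gain:.1f} dB")
```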

freq-reuse-sect

In situations where multiple channels are available, an omni-directional PtMP system loses any extra capacity that could have been gained, due to the required frequency re-use. By simply using multiple base-stations with the SkyPilot system, all of the additional channel capacity can be provided at each base-station location. And, each channel is provided over 360 degrees, compared to sectorized PtMP architectures which only provide each frequency on particular sectors, so with the SkyPilot equipment, there is frequency and base-station equipment redundancy to each subscriber (even if the subscriber uses a high-gain directional antenna).

And, of course, there is the additional benefit of meshing for additional range, routing around obstructions, and increasing system capacity by relaying through shorter high-modulation links (instead of wasting base-station bandwidth by communicating to a long range subscriber at low modulation, a high-modulation relay can be used).  But, these benefits are all extra, since even in a pure PtMP environment there is significant benefit from dynamic antenna switching.

Mesh Capacity (Part 2): The Multi-Radio Myth

Wednesday, February 18th, 2009

When we were designing the SkyPilot multi-hop scheduling protocol, our task would have been much easier if we simply used one radio to talk to the parent node and another radio to talk to the child nodes. However, there are several reasons why we chose to tackle the much more difficult problem of single-radio multi-hop scheduling. Obvious reasons to use only a single radio include cost (radios might be cheap, but high power, industrial grade radios and the additional interconnect are not), power, size and the inability to find many clean channels…, but the main reason is that using multiple radios simply doesn’t work!

Focusing on that last claim, simultaneous transmissions and receptions over long-distance links simply do not work in the real world. This is based on physics. If a high-power radio is transmitting at +30dBm while another co-located radio is trying to receive a signal at -90dBm, then the +30dBm transmission will completely swamp the -90dBm reception. That’s a 120dB difference in signal levels, and to put that in perspective, the transmitted signal level is 1 trillion times stronger than the received signal level.
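The arithmetic is easy to check: a difference in dB converts to a linear power ratio as 10^(dB/10). The short sketch below uses only the two power levels quoted above.

```python
def db_to_power_ratio(db):
    """Convert a decibel difference into a linear power ratio."""
    return 10 ** (db / 10)


tx_dbm = 30.0   # co-located transmitter power
rx_dbm = -90.0  # level of the signal the other radio is trying to receive

difference_db = tx_dbm - rx_dbm
ratio = db_to_power_ratio(difference_db)

print(f"difference: {difference_db:.0f} dB")
print(f"power ratio: {ratio:.0e}")
```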

To combat this problem, multi-radio systems have traditionally tried using combinations of:

  • filtering (but filtering is expensive and can not provide anywhere near the needed 100dB+ of isolation)
  • physical separation of antennas (but using extremely long cables to externally mounted antennas is lossy, expensive, and doesn’t fit onto a pole – so it’s impractical to get the level of isolation required)
  • increasing received signal strength by only allowing very short links (but this is still insufficient, and would only allow for very short links)
  • lab demos (it’s common to see cabled multi-radio setups showing simultaneous active radios in the lab, but this is just smoke and mirrors where the cables are providing artificial path isolation and are allowing unrealistically high received signal levels)
  • lowering modulation (where the radios are allowed to interfere and simply drop modulation and rely on CSMA – but then there is no capacity benefit to using multiple radios)
  • requiring channel separation (but even skipping an entire channel is not enough, which you can see from radio vendors’ published “alternate channel rejection” values)
  • and (in theory), trying to schedule around tx/rx situations (but, this requires symmetric upstream and downstream traffic, and is impractical)

Even if different channels are used, a typical wireless specification like 802.11a only requires adjacent channel rejection of up to 16 dB and alternate channel rejection up to 32 dB. So even different channels don’t help, and different frequency bands are sometimes recommended (for instance, one channel at 5.2 GHz and the other at 5.8 GHz) which is impractical due to power restrictions and availability.
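To see how far short channel rejection falls, here is a simplified estimate. It treats the rejection figure as how much stronger than the desired signal an off-channel interferer may be at the receiver, and asks how much additional isolation (filtering, antenna separation, and so on) would still be needed. It reuses the +30 dBm and -90 dBm levels from earlier in this post and ignores real-world effects such as transmitter noise floor and receiver front-end overload, so treat the results as optimistic.

```python
def required_isolation_db(tx_dbm, rx_dbm, rejection_db):
    """Extra isolation needed beyond the receiver's channel rejection."""
    tolerable_interferer_dbm = rx_dbm + rejection_db
    return tx_dbm - tolerable_interferer_dbm


TX_DBM = 30.0   # co-located transmitter
RX_DBM = -90.0  # desired received signal

for label, rejection_db in (("adjacent", 16.0), ("alternate", 32.0)):
    needed = required_isolation_db(TX_DBM, RX_DBM, rejection_db)
    print(f"{label} channel: {needed:.0f} dB of extra isolation still needed")
```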

Even the combination of many of these techniques is nowhere near sufficient to allow for simultaneous transmissions and receptions in a single device over reasonable link distances. So, while a multi-radio story might sound extremely compelling, and has somehow even found its way into some RFP requirements, there are many hidden technical challenges that make it not feasible in the real world.

And there is another important factor in the single versus multiple radio debate – in an “access network”, where the majority of traffic is flowing to and from a gateway node, the bottleneck is almost always the gateway. So, in order to increase capacity of the overall mesh, multiple radios can simply be used at the gateway (I’ll cover how to do this most effectively with SkyPilot devices in another post). So, if you happen to have access to a live mesh “access network”, try monitoring the utilization of all of your non-gateway radios versus your gateway radios, and this will show you how much money you’ve stranded on your poles.

Mesh Capacity (Part 1)

Thursday, February 5th, 2009

There has been an ongoing discussion in the mesh community about how much capacity is lost due to the relaying of data within a wireless mesh network. Proponents of multi-radio architectures have argued that they can deliver close to 1/n (where n is the number of hops) of the capacity of a radio simultaneously to each mesh device, while single radio architectures are closer to 1/2^n. For instance, a 4-hop path in a multi-radio system (assuming several clean channels are available) could deliver on the order of 1/4 the capacity of a radio simultaneously to all mesh devices, while a single-radio system may only be able to deliver 1/2^4, or 1/16, the capacity of a radio, due to multi-hop interference.
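The two scaling rules are easy to tabulate. The sketch below simply evaluates 1/n and 1/2^n for a few hop counts; it models nothing beyond those two formulas, and the function names are hypothetical.

```python
from fractions import Fraction


def multi_radio_share(hops):
    """Claimed per-device share of radio capacity: 1/n."""
    return Fraction(1, hops)


def single_radio_share(hops):
    """Traditional single-radio share of radio capacity: 1/2^n."""
    return Fraction(1, 2 ** hops)


for hops in range(1, 6):
    print(f"{hops} hops: 1/n = {multi_radio_share(hops)}, "
          f"1/2^n = {single_radio_share(hops)}")
```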

This diagram shows how a traditional single radio mesh system has its bandwidth reduced due to a large interference domain allowing only a single device to transmit at a time (note: the circles show the communication range, while the interference range will usually have a radius many times larger).

Single Radio Mesh

A multi-radio system could use several frequencies to allow multiple transmissions to take place at the same time, reducing some of these interference conditions (however, not only does this require multiple clean channels, but there are some pitfalls that will be analyzed in a future post).

So an obvious question is, “How does SkyPilot’s dynamic antenna switching affect system capacity?” The answer is that even though the SkyPilot system uses a single backhaul radio, it can still provide 1/n the channel capacity simultaneously to each device due to the dynamic antenna switching.

In addition to all of the previously discussed benefits of dynamic antenna switching, such as higher link budget, interference avoidance and point-to-point power levels, the largest benefit is probably from something called “spectral re-use”. Basically, spectral re-use is a benefit of using dynamically switched high-gain antennas where multiple transmissions can take place simultaneously, on the same frequency, in very close proximity.

For example, the dynamic point-to-point link formed by the high-gain antennas allows a first-hop transmission to not interfere with a third-hop reception, even on the same channel. And while one first-hop device is relaying, spectral re-use allows many other devices to simultaneously communicate, such as allowing the gateway to transmit to another first-hop device. That is why we always recommend at least 2 first-hop devices. This allows the gateway, and most other devices within the mesh, to be continuously active, so the capacity of the overall system is equal to the capacity of the gateway radio.  This allows at least 1/n to be delivered to each device simultaneously, equivalent to the multi-radio mesh system and much higher than traditional single radio systems.

Dynamically Switched Directional Antennas

And by only consuming a single channel, additional channels can be employed in order to multiply overall system capacity (plus, it is often difficult to find the multiple clean channels that multi-radio architectures require). But, the use of multiple radios in context of traditional mesh networks and the SkyPilot system will be explored in a future post.