The Open Fabric: 2015

Friday, May 15, 2015

Better than best effort — Rise of the IXP

This is the second installment in my Better than Best Effort series. In my last post I talked about how the incentive models on the Internet keep ISPs from managing peering and backbone capacity in a way that supports reliable communication in the face of the ever-growing volume of rich media content. If you haven't done so, please read that post before you read this one.

It's clear that using an ISP for business communication comes with the perils of the "noisy neighbor" ebb and flow of consumer-related, high-volume data movement. Riding pipes that are intentionally run hot to keep costs down is a business model that works for ISPs, but not for business users of the Internet. Even with business Internet service, customers may get a better grade of service within a portion of an ISP's network, but not when their data needs to traverse another ISP of which they are not a customer. There is no consistent experience, for anyone.

However, there is an evolving solution that avoids getting caught in the never-ending battle between ISPs and large consumer content providers. As the title of this post gives away, the solution is called an Internet eXchange Point (IXP).

IXPs are where the different networks that make up the Internet most often come together. Without IXPs, the Internet would be separate islands of networks. IXPs are, in a sense, the glue of the Internet. From the early days of the Internet, IXPs have been used to simplify connectivity between ISPs, resulting in significant cost savings for them. Without an IXP, an ISP would need to run separate lines and dedicate network ports for each peer ISP with which it connects.
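To put rough numbers on that saving, here is a minimal sketch in Python. It assumes every ISP wants to peer with every other ISP, which is a simplification, and the function names are made up for illustration.

```python
def full_mesh_links(n: int) -> int:
    """Private lines needed if each of n networks peers directly with every other."""
    return n * (n - 1) // 2


def ixp_links(n: int) -> int:
    """Access links needed if each of n networks connects once to a shared IXP fabric."""
    return n


# Compare the two models for a few hypothetical community sizes.
for n in (5, 20, 100):
    print(f"{n} networks: {full_mesh_links(n)} private lines vs {ixp_links(n)} IXP links")
```

Per ISP, the difference is n - 1 dedicated ports without an IXP versus a single port into the IXP fabric.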

However, IXPs and ISPs are not distant relatives. They are in fact close cousins. Here's why.

ISPs and IXPs share two fundamental properties. The first is that they both have a fabric, and the second is that they both have "access" links used to connect customers so they can intercommunicate over this fabric. The distinction between the two is in the nature of the access interfaces and the fabrics. ISP fabrics are intended to reach customers that are spread out over a wide geographic area. An IXP fabric, on the other hand, is fully housed within a single data center facility. In some cases an IXP fabric is simply a single Ethernet switch. ISP access links use the technology needed to span neighborhoods, while IXP access links are basically ordinary Ethernet cables that run several dozen meters on average. So essentially the distinction between the two is that an ISP is a WAN and an IXP is a LAN.

The bulk of the cost in a WAN is in the laying and maintenance of the wires over geographically long distances. Correspondingly, the technology used at the ends of those wires is chosen based on its ability to wring as much value out of those wires as possible. The cost of a WAN is significantly higher than that of a LAN with a comparable number of access links. ISPs need to carefully manage costs, which are much higher per byte. It is on account of the tradeoffs that ISPs make in order to manage these costs that the Internet is often unpredictably unreliable.

So how can IXPs help?

Let's assume that most businesses begin to use IXPs as meet-me points. Remember that the cost dynamics of operating an IXP are different from those of an ISP. At each IXP these business customers can peer with one another and their favorite ISPs for the cost of a LAN access link.

There are at least a few advantages of this over connecting to an ISP. Firstly, two business entities communicating via an IXP's [non-blocking] LAN are effectively not sharing the capacity between them with entities that are not communicating with either of them, which makes their experience far more predictable. This opens the door for them to save capital and operational costs by eliminating the private lines that they may currently have with these other business entities. Secondly, a business that is experiencing congestion to remote endpoints via an ISP can choose not to use it by [effectively] disabling BGP routing with that ISP. This is different from the standard access model used by businesses, in which, if there are problems downstream of one of their ISPs, their choices are limited to the far fewer ISPs from which they have purchased access capacity.

The following illustrates a scenario where congestion at a peering point downstream of one of the ISPs used by a business is affecting its ability to reach other offices, partners or customers that are hosted on other ISPs.

In the access model, since BGP cannot signal path quality, traffic is blindly steered over the path with the fewest intermediate networks rather than the path with the best performance. Buying extra access circuits alone to avoid Internet congestion is not a winnable game (more on this in the next part of this series).
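A toy sketch in Python shows the gap. It reduces BGP's decision process to the single "shortest AS path" step (real BGP weighs other attributes, such as local preference, before it gets to path length), and the AS numbers, loss figures and function names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    as_path: tuple    # AS numbers traversed toward the destination
    loss_pct: float   # measured packet loss on this path; BGP never sees this


def bgp_like_choice(paths):
    """Simplified BGP tie-break: prefer the path with the fewest intermediate networks."""
    return min(paths, key=lambda p: len(p.as_path))


def quality_aware_choice(paths):
    """What a business would actually want: prefer the path with the least loss."""
    return min(paths, key=lambda p: p.loss_pct)


# Hypothetical paths: the shorter one crosses a congested peering point.
congested = Path(as_path=(64500, 64510), loss_pct=4.0)
clean = Path(as_path=(64501, 64505, 64510), loss_pct=0.1)

print("BGP-like pick:     ", bgp_like_choice([congested, clean]))
print("Quality-aware pick:", quality_aware_choice([congested, clean]))
```

The BGP-like choice lands on the congested path simply because it is one network shorter, while the quality-aware choice picks the longer but clean path.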

The alternative approach using an IXP would look something like the following.

This illustration shows how being at an IXP creates more direct access to more endpoints at a better price point than buying access lines to numerous ISPs. You can also see how peering with other business entities locally at an IXP can improve reliability, reduce costs and simplify business-to-business connectivity by combining it with Internet connectivity.

There is an interesting trend occurring within the growing number of managed co-location data centers. Hosted within many of these co-location data centers are IXPs. Some managed data center operators like Equinix even operate their own IXPs at their data centers. These data centers are an ideal place for businesses to connect with one another through IXPs without the downsides that come with using consumer-focused ISPs.

This is not to say that the operational capabilities at all IXPs are at the level needed to support large numbers of businesses. There is work to be done to scale peering in a manner that will give customers minimal configuration burden and maximal control.

There will even be a need for business-focused ISPs that connect business customers at one IXP to business customers connected to the Internet at another IXP. Although net neutrality prohibits the differentiated treatment of data over the Internet, it does not forbid an ISP or IXP from selecting the class of customer it chooses to serve. This is much like the difference between a freeway and a parkway. Parkways do not serve commercial traffic and so in a way they offer a differentiated service to non-commercial traffic.

As the Internet enables new SaaS and IaaS providers to find success by avoiding the high cost of entry of building a private service delivery network, more businesses are turning to the Internet to access their solution providers of choice. The old Internet connectivity model cannot reliably support the growing use of the Internet for business, and so a better connectivity model is needed for a reliable Internet. New opportunities await.

In upcoming posts I will discuss additional thoughts on further improving the reliability of communicating over the Internet.

Wednesday, May 13, 2015

Better than best effort — Reliability and the Internet.

Metcalfe’s law states that the value of a telecommunications network is proportional to the square of the number of connected users of the system.

Networks prior to the Internet were largely closed systems, and the cost of communicating was extraordinarily high. In those days, the free exchange of ideas at all levels was held back by cost. On the Internet, for a cost proportional to a desired amount of access bandwidth, one can communicate with a billion others. This has propelled human achievement forward over the last 20 years. By way of Metcalfe’s law, the Internet’s value is immeasurably larger than that of any private network.
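As a back-of-the-envelope illustration of that gap, here is a small Python sketch. The billion users comes from the paragraph above; the 10,000-user private network and the proportionality constant k are assumptions picked purely for illustration.

```python
def metcalfe_value(users: int, k: float = 1.0) -> float:
    """Metcalfe's law: value is proportional to the square of the number of users."""
    return k * users ** 2


internet = metcalfe_value(1_000_000_000)  # "a billion others"
private = metcalfe_value(10_000)          # hypothetical private network

# The constant k cancels out, so only the ratio of user counts matters.
print(f"value ratio: {internet / private:.0e}")
```

A network with 100,000 times the users comes out 10 billion times as valuable under this model, whatever k happens to be.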

So why do large private service delivery networks still exist?

The answer lies primarily in one word: reliability. What Metcalfe’s law doesn’t cover is the reliability of communication between connected users, and the implications of a lack of reliability for the value of services delivered. Although Internet reliability is improving, much like the highway system, it still faces certain challenges inherent in open and unbiased systems.

On a well-run private network, bandwidth and communications are regulated to deliver an optimal experience, and network issues are addressed more rapidly as all components are managed by a single operator. The Internet, on the other hand, is a network of networks wherein network operators do not have sufficient incentive to transport data for which they are not adequately compensated.

Growing high-volume content services such as video streaming place unrelenting strain on the backbones and peering interfaces of Internet service providers. With network neutrality and the corresponding lack of QoS on the Internet, ISPs have to maintain significant backbone and peering capacity to ensure other communications continue to function in the presence of this high-volume traffic. However, ISP operators have demonstrated that they are much more inclined to provide capacity to their direct customers than they are to those who are not on their network.

Some large ISPs refuse to better manage their peering capacity yet they host a large number of end users. These end users are, in a qualitative sense, locked inside their ISP. This seems to be forcing large web and video content providers to buy capacity directly from the ISP of their mutual users in order to deliver content to them. Despite this latter trend, peering (and many backbone) links continue to be challenged.

(Note: With some ISPs there is a qualitative difference between business Internet service and consumer Internet service when it comes to backbone and peering capacity.)

For private enterprises that want to engage in business-to-business communications over the Internet, these “noisy neighbor” dynamics do not lend themselves well to reliable service delivery or to cost management. It is cost prohibitive to buy Internet access from many ISPs for the purpose of B2B communications and, conversely, buying access from only a few ISPs puts the communication between two business entities on different ISPs at risk of being routed via a congested peering link. Unfortunately, BGP does not take path quality into account when choosing paths.

On the surface, it seems straightforward enough for businesses to continue to peer directly over private leased lines. However, there is a trend that is putting pressure on this model. An increasing number of private businesses are leveraging an emerging landscape of SaaS and IaaS services. This is driving a general acceptance of the Internet as a primary medium for B2B communication. Private connectivity also comes with the baggage of added capital and operational costs.

For many businesses, Internet-based B2B communication “as is” may be fine for SaaS services such as HR and billing where a temporary loss of service is survivable. But there is a class of services and communications that are too-important-to-fail for many businesses and even for larger ecosystems such as capital markets. Reliable infrastructure is a prerequisite for engaging in these services.

The prevalent Internet access models fail to bring the Internet to the consistent level of reliability needed by many businesses. The support of B2B communication over the Internet needs to improve as more businesses adopt the Internet for their core business communication needs. Needless to say, I have my thoughts on how this should happen, which I share in my next post.

(Note: I have intentionally avoided DDoS and security related problems that also come with being on the Internet. I believe these can be better handled once the more fundamental problem with the plumbing is dealt with.)

Thursday, March 5, 2015

EVPN. The Essential Parts.

In a blog post back in October 2013 I said I would write about the essential parts of EVPN that make it a powerful foundation for data center network virtualization. Well, just when you thought I'd fallen off the map, I'm back. :)

After several years as an Internet draft, EVPN has finally emerged as RFC7432. To celebrate this occasion I created a presentation, EVPN - The Essential Parts. I hope that shedding more light on EVPN's internals will help make the decision to use (or to not use) EVPN easier for operators. If you are familiar with IEEE 802.1D (Bridging), IEEE 802.1Q (VLAN), IEEE 802.3ad (LAG), IETF RFC4364 (IPVPN) and, to some degree, IETF RFC6513 (NGMVPN) then understanding EVPN will come naturally to you.

Use cases are intentionally left out of this presentation as I prefer the reader to creatively consider whether their own use cases can be supported with the basic features that I describe. The presentation also assumes that the reader has a decent understanding of overlay tunneling (MPLS, VXLAN, etc.) since the use of tunneling for overlay network virtualization is not unique to EVPN.

Let me know your thoughts below and I will try to expand/improve/correct this presentation or create other presentations to address them. You can also find me on Twitter at @aldrin_isaac.

Here is the presentation again => EVPN - The Essential Parts

Update: I found this excellent presentation on EVPN by Alastair Johnson that is a must read. I now have powerpoint envy :)