Tuesday, October 22, 2013

The real Slim Shady

Historically, when an application team needed compute and storage resources, they would kick off a workflow that pulled in several teams to design, procure and deploy the required infrastructure (compute, storage & network).  The whole process generally took a few months from request to delivery of the infrastructure.

The reason for this onerous approach was really that application groups generally dictated their choice of compute technology.  Since most applications scaled vertically, the systems and storage scaled likewise.  When the application needed more horsepower, it was addressed with bigger, more powerful computers and faster storage technology.  The hardware for the request was then staged, followed by a less-than-optimal migration to the new hardware.

The subtlety that gets lost regarding server virtualization is that a virtualization cluster is based on [near] identical hardware.  The first machines to be virtualized were the ones whose compute and storage requirements could be met by the hardware the cluster was based on.  These tended to be the applications that were not vertically scaled.  The business-critical vertically scaled applications continued to demand special treatment, driving the overall infrastructure deployment model used by the enterprise.

The data center of the past is littered with technology of varying kinds.  In such an environment, technology idiosyncrasies change faster than the ability to automate them -- hence the need for operators at consoles with a library of manuals.  Yesterday's data center network was correspondingly built to cater to this technology consumption model.  A significant part of the cost of procuring and operating the infrastructure was on account of this diversity.  Obviously, meaningful change would not be possible without addressing this fundamental problem.

Large web operators had to solve the issue of horizontal scale out several years ahead of the enterprise and essentially paved the way for the horizontal approach to application scaling.  HPC had been using the scale-out model before the web, but the platform technology was not consumable by the common enterprise.  As enterprises began to leverage web-driven technology as the platform for their applications, they gained its side benefits, one of which is horizontal scale out.

With the ability to scale horizontally it was now possible to break an application into smaller pieces that could run across smaller “commodity” compute hardware.  Along with this came the ability to build homogeneous, easily scaled compute pools that could meet the growth needs of horizontally scaling applications simply by adding more nodes to the pool.  The infrastructure delivery model shifted from reactive, application-driven, custom dedicated infrastructure to a proactive, capacity-driven infrastructure-pool model.  In the latter model, capacity is added to the pool when it runs low.  Applications are entitled to pool resources based on a “purchased” quota.
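A minimal sketch of that pool-and-quota model may help make it concrete.  Everything here is illustrative -- the class name, the units and the headroom threshold are assumptions, not a description of any particular scheduler or cloud product.

    # Illustrative sketch of a capacity-driven compute pool with per-application
    # quotas.  Names, units and thresholds are hypothetical.

    class ComputePool:
        def __init__(self, nodes, cores_per_node):
            self.total_cores = nodes * cores_per_node
            self.allocated = {}   # application name -> cores currently granted
            self.quotas = {}      # application name -> "purchased" quota in cores

        def set_quota(self, app, cores):
            self.quotas[app] = cores

        def request(self, app, cores):
            """Grant cores only if the app is within quota and the pool has room."""
            used = self.allocated.get(app, 0)
            if used + cores > self.quotas.get(app, 0):
                return False      # application has exhausted its purchased quota
            if sum(self.allocated.values()) + cores > self.total_cores:
                return False      # the pool itself is out of capacity
            self.allocated[app] = used + cores
            return True

        def headroom(self):
            """Fraction of the pool still free; when low, add identical nodes."""
            return 1 - sum(self.allocated.values()) / self.total_cores

    pool = ComputePool(nodes=40, cores_per_node=32)
    pool.set_quota("web-frontend", 256)
    pool.request("web-frontend", 64)          # horizontal growth: more of the same
    if pool.headroom() < 0.2:
        print("add another rack of identical nodes to the pool")

The point is the shape of the model: capacity is watched at the pool, and applications interact with a quota rather than a procurement workflow.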

When homogeneity is driven into the infrastructure, it becomes possible to build out the physical infrastructure in groups of units.  Many companies are now consuming prefabricated racks with computers that are prewired to a top-of-rack switch, and even prefabricated containers.  When the prefabricated rack arrives, it is taken to its designated spot on the computer room floor and power and network uplinks are connected.  In some cases the rack brings itself online within minutes with the help of a provisioning station.
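Conceptually, the provisioning station is little more than a lookup from the identity a device announces at boot to the image and configuration it should pull.  The serial numbers, file names and table below are invented purely for illustration.

    # Toy model of a provisioning station: a top-of-rack switch boots, announces
    # its serial number, and is told which image and config to download.
    # Serial numbers and file names are made up for illustration.

    RACK_PLAN = {
        "TOR-SN-0001": {"image": "switch-os-13.2.bin", "config": "rack17-tor.conf"},
        "TOR-SN-0002": {"image": "switch-os-13.2.bin", "config": "rack18-tor.conf"},
    }

    def on_boot_request(serial):
        """Return the image/config the booting device should fetch, or None."""
        plan = RACK_PLAN.get(serial)
        if plan is None:
            return None               # unknown device: leave it quarantined
        return plan

    print(on_boot_request("TOR-SN-0001"))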

As applications transitioned to horizontal scaling models and physical infrastructure could be built out in large homogeneous pools, some problems remained.  In a perfect world, applications would be inherently secure and would be deployed to compute nodes based on the availability of CPU and memory, without the need for virtualization of any kind.  In this world, the network and server would be very simple.  The reality is, on the one side, that application dependencies on shared libraries do not allow them to co-exist with an application that needs a different version of those libraries.  This, among other things, forces the need for server virtualization.  On the other side, since today's applications are not inherently secure, they depend on the network to create virtual sandboxes and enforce rules within and between these sandboxes.  Hence the need for network virtualization.

Although server and network virtualization have the spotlight, the real revolution in the data center is simple, homogeneous, easily scalable physical resource pools and applications that can use them effectively.  Let's not lose sight of that.


[Improvements in platform software will secure applications and allow them to co-exist on the same logical machine within logical containers, significantly reducing the need for virtualization technologies in many environments.  This is already happening.]

Sunday, October 13, 2013

The bumpy road to EVPN

In 2004 we were in the planning phase of building a new data center to replace one we had outgrown.   The challenge was to build a network that continued to cater to a diverse range of data center applications and yet deliver significantly improved value.

Each operational domain tends to have one or more optimization problems whose solutions are less than optimal for another domain.  In an environment where compute and storage equipment come in varying shapes and capabilities and with varying power and cooling demands, the data center space optimization problem does not line up with the power distribution and cooling problem, the switch and storage utilization problem, or the need to minimize shared risk for an application, to name a few.

The reality of the time was that the application, backed by its business counterparts, generally had the last word -- good or bad.  If an application group felt they needed a server that was as large as a commercial refrigerator and emitted enough heat to keep a small town warm, that's what they got, if they could produce the dollars for it.  Application software and hardware infrastructure as a whole was the bastard of a hundred independent self-proclaimed project managers, and in the end someone else paid the piper.

When it came to moving applications into the new data center, the first ask of the network was to allow application servers to retain their IP addresses.  Eventually most applications moved away from a dependence on static IP addresses, but many continued to depend on middle-boxes to manage all aspects of access control and security (among other things).  The growing need for security-related middle-boxes, combined with their operational model and costs, continued to put pressure on the data center network to provide complex bandaids.

A solid software infrastructure layer (aka PaaS) addresses most of the problems that firewalls, load-balancers and stretchy VLANs are used for, but this was unrealistic for most shops in 2005.  Stretchy VLANs were needed to make it easier on adjacent operational domains -- space, power, cooling, security, and a thousand storage and software idiosyncrasies.  And there was little alternative but to deliver stretchy VLANs using a fragile toolkit.  With much of structured cabling and chassis switches giving way to data center grade pizza-box switches, STP was coming undone.  [Funnily, the conversation about making better software infrastructure continues to be overshadowed by a continued conversation about stretchy VLANs.]

Around 2007 the network merchants who gave us spanning tree came around again peddling various flavors of TRILL and lossless Ethernet.  We ended up on this evolutionary dead end for mainly misguided reasons.  In my opinion, it was an unfortunate misstep that set the clock back on real progress.  I have much to say on this topic, but I might go off the deep end if I start.

Prior to taking on the additional responsibility to develop our DC core networks, I was responsible for the development of our global WAN where we had had a great deal of success building scalable multi-service multi-tenant networks.  The toolkit to build amazingly scalable, interoperable multi-vendor multi-tenancy already existed -- it did not need to be reinvented.  So between 2005 and 2007 I sought out technology leaders from our primary network vendors, Juniper and Cisco, to see if they would be open to supporting an effort to create a routed Ethernet solution suitable for the data center based on the same framework as BGP/MPLS-based IPVPNs.  I made no progress.

It was around 2007 when Pradeep stopped in to share his vision for what became Juniper's QFabric.  I shared with him my own vision for the data center -- to make the DC network a more natural extension of the WAN, based on a common toolkit.  Pradeep connected me to Quaizar Vohra and ultimately to Rahul Agrawal.  Rahul and I discussed the requirements for a routed Ethernet for the data center based on MP-BGP, and out of these conversations MAC-VPN was born.  At about the same time, Ali Sajassi at Cisco was exploring routed VPLS to address hard-to-solve problems with flood-and-learn VPLS, such as multi-active multi-homing.  With pressure from yours truly to make MAC-VPN a collaborative industry effort, Juniper reached out to Cisco in 2010, and the union of MAC-VPN and R-VPLS produced EVPN, a truly flexible and scalable foundation for Ethernet-based network virtualization for both data center and WAN.  EVPN evolved further with contributions from great folks at Alcatel, Nuage, Ericsson, Verizon, Huawei, AT&T and others.

EVPN and a few key enhancement drafts (such as draft-sd-l2vpn-evpn-overlay, draft-sajassi-l2vpn-evpn-inter-subnet-forwarding and draft-rabadan-l2vpn-evpn-prefix-advertisement) combine to form a powerful, open and simple solution for network virtualization in the data center.  With support for VXLAN encapsulation, EVPN builds on the momentum of VXLAN.  IP-based transport tunnels solve a number of usability issues for data center operators, including the ability to operate transparently over the top of a service provider network and optimizations such as multi-homing with "local bias" forwarding.  The other enhancement drafts describe how EVPN can be used to natively and efficiently support distributed inter-subnet routing, service chaining, and more.
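For readers who have not looked inside EVPN, here is a rough sketch of the information carried by the two route types that matter most for data center virtualization: the Type 2 MAC/IP advertisement and the Type 5 IP prefix route.  The field names are abbreviated and the values invented; see the drafts referenced above for the actual encodings.

    # Simplified view of two EVPN route types used for data center network
    # virtualization.  Field names are abbreviated and values are invented.
    from collections import namedtuple

    # EVPN route type 2: advertises a host MAC (and optionally its IP) along with
    # the segment and virtual network it belongs to.
    MacIpAdvertisement = namedtuple(
        "MacIpAdvertisement", ["rd", "esi", "eth_tag", "mac", "ip", "vni"])

    # EVPN route type 5: advertises an entire IP prefix reachable behind a device.
    IpPrefixRoute = namedtuple("IpPrefixRoute", ["rd", "prefix", "gw_ip", "vni"])

    host = MacIpAdvertisement(
        rd="10.0.0.1:100",        # route distinguisher scoping the tenant's table
        esi="0",                  # Ethernet segment ID; non-zero enables multi-homing
        eth_tag=0,                # Ethernet tag identifying the broadcast domain
        mac="00:11:22:33:44:55",  # host MAC learned locally, advertised over MP-BGP
        ip="192.0.2.10",          # optional host IP; enables distributed inter-subnet routing
        vni=5010)                 # VXLAN network identifier (a label over MPLS transport)

    external = IpPrefixRoute(
        rd="10.0.0.2:100",
        prefix="198.51.100.0/24", # a whole subnet reachable behind the advertiser
        gw_ip="192.0.2.254",      # overlay gateway / next hop for the prefix
        vni=5020)

    print(host)
    print(external)

The takeaway is that host reachability and subnet reachability ride the same MP-BGP machinery, which is what lets EVPN interoperate across vendors and sit alongside the other BGP-based VPN families.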

In the context of SDN we speak of "network applications" that work on top of the controller to implement a network service.  EVPN is a distributed network application that works on top of MP-BGP.  EVPN procedures are open and can be implemented by anyone, with the benefit of interoperation with other compliant EVPN implementations (think federation).  EVPN can also co-exist synergistically with other MP-BGP based network applications such as IPVPN, NG-MVPN and others.  A few major vendors already have data center virtualization solutions that leverage EVPN.

I hope to produce a blog entry or so to describe the essential parts of EVPN that make it a powerful foundation for data center network virtualization.  Stay tuned.