Sunday, October 13, 2013

The bumpy road to E-VPN

In 2004 we were in the planning phase of building a new data center to replace one we had outgrown.   The challenge was to build a network that continued to cater to a diverse range of data center applications and yet deliver significantly improved value.

Each operational domain tends to have one or more optimization problems whose solutions are less than optimal for another domain.  In an environment where compute and storage equipment come in varying shapes and capabilities and with varying power and cooling demands, the data center space optimization problem does not line up with the power distribution and cooling problem, the switch and storage utilization problem, or the need to minimize shared risk for an application, to name a few.

The reality of the time was that the application, backed by its business counterparts, generally had the last word -- good or bad.  If an application group felt they needed a server that was as large as a commercial refrigerator and emitted enough heat to keep a small town warm, that's what they got, if they could produce the dollars for it.  Application software and hardware infrastructure as a whole was the bastard of a hundred independent self-proclaimed project managers and in the end someone else paid the piper.

When it came to moving applications into the new data center, the first ask of the network was to allow application servers to retain their IP address.  Eventually most applications moved away from a dependence on static IP addresses, but many continued to depend on middle boxes to manage all aspects of access control and security (among other things).  The growing need for security-related middle-boxes combined with their operational model and costs continued to put pressure on the data center network to provide complex bandaids.

A solid software infrastructure layer (aka PaaS) addresses most of the problems that firewalls, load-balancers and stretchy VLANs are used for, but this was unrealistic for most shops in 2005.  Stretchy VLANs were needed to make it easier on adjacent operational domains -- space, power, cooling, security, and a thousand storage and software idiosyncrasies.  And there was little alternative but to deliver stretchy VLANs using a fragile toolkit.  With much of structured cabling and chassis switches giving way to data center grade pizza box switches, STP was coming undone.  [Funnily, the conversation about making better software infrastructure continues to be overshadowed by a continued conversation about stretchy VLANs.]

Around 2007 the network merchants who gave us spanning tree came around again peddling various flavors of TRILL and lossless Ethernet.  We ended up on this evolutionary dead end for mainly misguided reasons.  In my opinion, it was an unfortunate misstep that set the clock back on real progress.  I have much to say on this topic but I might go off the deep end if I start.

Prior to taking on the additional responsibility to develop our DC core networks, I was responsible for the development of our global WAN where we had had a great deal of success building scalable multi-service multi-tenant networks.  The toolkit to build amazingly scalable, interoperable multi-vendor multi-tenancy already existed -- it did not need to be reinvented.  So between 2005 and 2007 I sought out technology leaders from our primary network vendors, Juniper and Cisco, to see if they would be open to supporting an effort to create a routed Ethernet solution suitable for the data center based on the same framework as BGP/MPLS-based IPVPNs.  I made no progress.

It was around 2007 when Pradeep stopped in to share his vision for what became Juniper's QFabric.  I shared with him my own vision for the data center -- to make the DC network a more natural extension of the WAN, based on a common toolkit.  Pradeep connected me to Quaizar Vohra and ultimately to Rahul Agrawal.  Rahul and I discussed the requirements for a routed Ethernet for the data center based on MP-BGP, and out of these conversations MAC-VPN was born.  At about the same time Ali Sajassi at Cisco was exploring routed VPLS to address hard-to-solve problems with flood-and-learn VPLS, such as multi-active multi-homing.  With pressure from yours truly to make MAC-VPN a collaborative industry effort, Juniper reached out to Cisco in 2010 and the union of MAC-VPN and R-VPLS produced E-VPN, a truly flexible and scalable foundation for Ethernet-based network virtualization for both data center and WAN.  E-VPN evolved further with contributions from great folks at Alcatel, Nuage, Verizon, AT&T, Huawei and others.

E-VPN and a few key enhancement drafts (such as draft-sd-l2vpn-evpn-overlay, draft-sajassi-l2vpn-evpn-inter-subnet-forwarding, draft-rabadan-l2vpn-evpn-prefix-advertisement) combine to form a powerful, open and simple solution for network virtualization in the data center.  With the support added for VXLAN, E-VPN builds on the momentum of VXLAN.  IP-based transport tunnels solve a number of usability issues for data center operators including the ability to operate transparently over the top of a service provider network and optimizations such as multi-homing with "local bias" forwarding.  The other enhancement drafts describe how E-VPN can be used to natively and efficiently support distributed inter-subnet routing and service chaining, etc.

In the context of SDN we speak of "network applications" that work on top of the controller to implement a network service.  E-VPN is a distributed network application that works on top of MP-BGP.  E-VPN procedures are open and can be implemented by anyone with the benefit of interoperation with other compliant E-VPN implementations (think federation).  E-VPN can also co-exist synergistically with other MP-BGP based network applications such as IPVPN, NG-MVPN and others.  A few major vendors already have data center virtualization solutions that leverage E-VPN.
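To make the "distributed network application on top of MP-BGP" idea a little more concrete, here is a toy sketch in Python.  It is not an E-VPN implementation and it does not follow the wire format of any draft: the names `MacRoute`, `ToyPE` and `distribute`, and the route-target strings, are all made up for illustration.  It only shows the general shape of the idea, assuming each edge switch advertises the MAC addresses it has learned locally and every other edge switch installs only the routes tagged for a VPN it has been configured to import.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MacRoute:
    """Toy stand-in for a MAC advertisement (field names are illustrative)."""
    route_target: str  # identifies the VPN the route belongs to
    mac: str           # MAC address learned on the advertising edge switch
    next_hop: str      # name of the edge switch that advertised the MAC


@dataclass
class ToyPE:
    """Hypothetical edge switch (ToR or virtual switch) acting as a PE."""
    name: str
    import_targets: set
    mac_table: dict = field(default_factory=dict)  # (route_target, mac) -> next hop

    def advertise(self, route_target, mac):
        # A locally learned MAC is announced to the other PEs as a route.
        return MacRoute(route_target, mac, self.name)

    def receive(self, route):
        # Install only routes for VPNs this PE imports, and never its own.
        if route.route_target in self.import_targets and route.next_hop != self.name:
            self.mac_table[(route.route_target, route.mac)] = route.next_hop


def distribute(route, pes):
    """Stand-in for BGP propagation: hand the route to every PE."""
    for pe in pes:
        pe.receive(route)


tor1 = ToyPE("tor1", {"target:65000:10"})
tor2 = ToyPE("tor2", {"target:65000:10", "target:65000:20"})
tor3 = ToyPE("tor3", {"target:65000:20"})
pes = [tor1, tor2, tor3]

distribute(tor1.advertise("target:65000:10", "00:11:22:33:44:55"), pes)
distribute(tor3.advertise("target:65000:20", "66:77:88:99:aa:bb"), pes)
```

In this toy run only `tor2`, which imports both VPNs, ends up with entries for both remote MACs.  The point of the sketch is that reachability is learned from advertisements rather than by flooding and watching what comes back.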

I hope to produce a blog entry or so to describe the essential parts of E-VPN that make it a powerful foundation for data center network virtualization.  Stay tuned.


  1. Awesome post Aldrin! I landed late on EVPN... so I was missing the story behind.

    1. Thanks Jorge. E-VPN is seen by many as an Ethernet VPN application for service providers. Most commonly you see it described as a DCI application. What very few people know is that it was actually conceived for data center network virtualization. The IETF had no working group for scalable DC network virtualization (it did have a WG specific to TRILL), so because E-VPN was also applicable to the goals of the L2VPN WG, that was the path chosen for E-VPN standardization. Interestingly, it is the DC network virtualization requirements that E-VPN was designed to satisfy that happen to be quite good for service providers, not so much the other way around.

  2. Aldrin - when can we get EVPN in TOR hardware, who do I have to harass for this? :)

    1. Hi Kris, the short answer is yes, but it will be a bit longer for full-featured and interoperating implementations. Hopefully more will be known over the next few months. Just so you know, in the first draft, an E-VPN PE was called an MES (MPLS edge switch) to avoid tying an E-VPN PE to the general PE mold (big expensive SP box). I expected from the start that the E-VPN PE inside the DC would be a ToR or virtual switch. This is where the DC was headed and where STP/flood-and-learn were giving up. Doing this was not an issue because RT-constrain would be used to send only relevant routes to a PE. A ToR only needs the routes belonging to VPNs that the connected endpoints are members of. Additionally, even back then ToRs were trending towards more CPU, memory, FIB capacity, buffering, etc., and becoming more DC grade.

  3. In 2011, I built a Juniper VPLS network with VRRP to set up vMotion between a couple of data centers that were connected with some dark fiber. The distance was less than 2 miles and there was almost no latency. The solution is still in place and works fairly well for what it was intended. It's not ideal for a lot of reasons - but it works. I would love to see how you'd mock up an EVPN solution with JUNOS!