The State of the Service Mesh, Part 2: Availability

In the world of microservices, the pace of change is unrelenting, but excitement is building as a critical mass of industry thought leaders and practitioners moves beyond theory and talk. Early‑adopter organizations whose workloads require service mesh functionality now want to prove its viability as a production‑ready architecture by implementing real solutions for some compelling “low‑hanging fruit” use cases.

This post is the second installment in our series on service mesh:

In Part 1, we summarize key developments that have taken place in the service mesh space in the past year or so, and enumerate a number of the key application requirements for a service mesh.
In a related blog, my colleague Owen Garrett provides guidance on determining whether you really need a service mesh and on how to use proven technologies that are available today until service mesh implementations are more mature.

In this post, we return to highlight one of the service mesh capabilities that we identified as core to the service mesh value proposition: availability. In Part 1, we put a spotlight on how I&O and DevOps leaders are responsible for deploying mission‑critical apps with a delivery infrastructure, including service meshes, that delivers fault tolerance. The control plane is where most of the innovation is happening, as the data‑plane infrastructure available with tools like NGINX and Envoy is already enterprise‑grade. The exciting news is that vendors have been developing their control planes quickly and have largely addressed some of the early concerns about the control plane as a potential single point of failure.

Following is a roundup of some recent innovations and developments toward a highly available control plane at the heart of a maturing, commercially supported service mesh:

The Consul service mesh solution from HashiCorp offers a good model for high availability: it distributes data‑plane and control‑plane functionality across all member nodes, using the Consul agent to enforce policies on every node in the mesh.
Linkerd 2.3 is the latest version of the incubating project from the Cloud Native Computing Foundation. Its control plane runs on Kubernetes as multiple load‑balanced pods and comes with an easy-to-understand dashboard. Linkerd has now reached a point of maturity where it combines a decent UI with the benefits of a very lightweight Rust‑based data plane, which sets it apart from solutions that typically rely on NGINX or Envoy.
At Google Cloud Next in April, Google announced and highlighted the beta release of Istio on GKE as one of three core components of Anthos (formerly Google Cloud Services Platform). Google continues to develop and promote Istio, which now offers “connect” features and a control‑plane component called Mixer. Google highlights availability as a primary control‑plane feature, stating “Mixer is designed to deliver high availability for each individual Mixer instance. Its local caches and buffers reduce latency but also help mask infrastructure backend failures[,] operating even when a backend has become unresponsive”.
New startup Tetrate has emerged from stealth mode with US $12.5M in funding. It is demonstrating how Istio can be simplified and packaged for easier enterprise deployment and operations, both on premises and in cloud infrastructures that span multiple vendors and regions. Tetrate CEO Varun Talwar was previously a product manager on Google’s Cloud Platform team responsible for developing the Istio project. From that experience he knows that “Istio works with Kubernetes today, but enterprise customers have a lot of legacy workloads. Our first offering will help provide secure seamless connectivity between workloads and help companies move towards the transition to containers and public cloud”.
F5’s recent acquisition of NGINX will accelerate the availability of an easy-to-use, flexible offering based on the NGINX Application Platform and NGINX Controller. We’re planning to release a service mesh control plane later this year as the NGINX Controller Service Mesh Module. Not only will it enable high availability, it will also offer users a common GUI based on the existing Advanced Load Balancing Module and API Management Module.
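
The Mixer quote in the roundup above describes a general pattern: a local cache that cuts latency in normal operation and masks backend failures when they occur. Here is a minimal Python sketch of that pattern. It is not Istio or Mixer code; the class name `CachedPolicyClient`, the `backend` callable, and the `ttl_seconds` parameter are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CachedPolicyClient:
    """Hypothetical client: caches backend answers, serves stale ones on failure."""

    def __init__(
        self,
        backend: Callable[[str], bool],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._backend = backend      # assumed: raises an exception when unreachable
        self._ttl = ttl_seconds      # how long a cached answer counts as fresh
        self._clock = clock          # injectable so the behavior can be tested
        self._cache: Dict[str, Tuple[bool, float]] = {}

    def check(self, key: str) -> bool:
        now = self._clock()
        cached = self._cache.get(key)

        # Fresh cache hit: answer locally and skip the backend round trip.
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]

        try:
            allowed = self._backend(key)
        except Exception:
            # Backend unresponsive: fall back to the last known answer, if any.
            if cached is not None:
                return cached[0]
            raise

        self._cache[key] = (allowed, now)
        return allowed
```

Serving a stale answer trades strict freshness for availability. Whether that trade is acceptable depends on the policy being checked, so a real control plane would likely make the fallback behavior configurable.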

One thing is certain: the speed of change in the service mesh space is not slowing down. Vendors are racing to develop highly available service mesh control planes, and although several models are now available, no single solution has yet emerged as dominant. We live in a dynamic and exciting world of microservices. Watch for the next post, on how security is addressed in the service mesh space. Until then, enjoy the ride.

The post The State of the Service Mesh, Part 2: Availability appeared first on NGINX.
