Site icon 지락문화예술공작단

Using NGINX Plus for Backend Upgrades with Zero Downtime, Part 1 – Overview

Using NGINX Plus for Backend Upgrades with Zero Downtime, Part 1 – Overview

Upgrading backend servers in a production environment can be a challenge for your operations or DevOps team, whether they are dealing with an individual server or upgrading an application by moving to a new set of servers. Putting upstream servers behind NGINX Plus can make the upgrade process much more manageable while also eliminating or greatly lessening downtime.

In a three-part series of articles, we’ll focus on NGINX Plus – with a number of features above and beyond those in the open source NGINX software, it’s a more comprehensive and controllable solution for upgrades with zero downtime. This first article describes the two NGINX Plus features you can use for backend upgrades – the on-the-fly reconfiguration API and health checks – in detail and compares them to upgrading with the open source NGINX software.

The related articles explain how to use the methods for two classes of upgrades:

Choosing an Upgrade Method in NGINX Plus

NGINX Plus provides two methods for dynamically upgrading production servers and application version:

The two methods differ with respect to several factors, so the choice between them depends on your priorities:

In general, we recommend the NGINX Plus on-the-fly reconfiguration API for most use cases because changes take effect immediately and the API is fully scriptable and automatable.

Upgrading with Open Source NGINX

First, let’s review how upgrades work with the open source NGINX software, and explore some possible issues. Here you change upstream server groups by editing the upstream configuration block and reloading the configuration file. The configuration reload is seamless because a new set of worker processes are started to utilize the new configuration, while the existing worker processes continue to run and handle connections that were open when the reload occurred. Each old worker process terminates as soon as all its connections have completed. This design guarantees that no connections or requests are lost during the reload, and makes the reload method suitable even when upgrading NGINX itself from one version to another.

Depending on the nature of the outstanding connections, the time it takes to complete them all can range from just seconds to several minutes. If the configuration doesn’t change often, running two sets of workers for a short time usually has no bad effects. However, if changes (and consequently reloads) are very frequent, old workers might not finish processing requests and terminate before the next reload takes place, leaving multiple sets of workers running at once. With enough workers, you might eventually end up exhausting memory and hitting 100% CPU, particularly if you’re already optimizing use of resources by running your servers at close to capacity.

When you’re load balancing application servers, upstream groups are the part of the configuration that changes most frequently, whether it’s to scale capacity up and down, upgrade to a new version, or take servers offline for maintenance. Customers running hundreds of virtual servers load balancing traffic across thousands of backend servers might need to modify upstream groups very frequently. Using the reconfiguration API or health checks in NGINX Plus, you avoid the problem of frequent configuration reloads.

Overview of the NGINX Plus Upgrade Methods

The use cases discussed in the two related articles use one of the following methods, sometimes in combination with auxiliary actions.

Upgrading with the On-the-Fly Reconfiguration API

To use the on-the-fly reconfiguration API to manage the servers in an upstream group, you issue HTTP commands which all start with the following URL string. We’re using the conventional location name for the API, /upstream_conf, but you can configure a different name (see the section about the base configuration in the second or third article).

http://NGINX-server[:port]/upstream_conf?upstream=upstream-group-name

When you issue this command with no additional parameters, a list of the servers and their ID numbers is returned, as in this example for the use cases we’ll cover in the other two articles:

http://localhost:8080/upstream_conf?upstream=demoapp
server 172.16.210.81:80; # id=0
server 172.16.211.82:80; # id=1

To make changes to the servers in the upstream group, append other strings to the base URL as indicated:

Exit mobile version