More Fun with NGINX Plus Health Checks and Docker Containers

At nginx.conf 2017, I gave a presentation on this topic; it is available as a YouTube video and as a blog post that includes the PowerPoint slides and a transcript of my talk. In this blog post, I’ll describe an improved version of the basic approach, then give specific, working configuration code you can use to implement it yourself.

Introduction

When running containers in a microservices environment, your service instances can become overloaded when they hit resource limits such as memory or CPU. A number of strategies can be employed to address this issue; this blog post covers one of them, NGINX Plus active health checks.

We’ll focus on three different use cases: limiting load based on request count, on CPU usage, and on memory usage.

All three of these methods work in the same fundamental way. A program is written to act as the active health check, and NGINX Plus calls it periodically. Depending on the method, the program reports the service instance as healthy or unhealthy: NGINX Plus removes an unhealthy instance from the load-balancing rotation and returns it to the rotation once it reports healthy again.

Health Check Approaches

Let’s get into the details of each method. Code for the examples is available here.

For all of the examples, we’re using NGINX Plus as the load balancer and NGINX Unit as the application server, with two examples written in PHP and one written in Python. These are all running in Docker containers.

Request-Count-Based

For this method, a semaphore file, /tmp/busy, is created by the application as soon as a request is received, then removed when the request processing is completed. When you run the health check, it checks to see if the file exists. If the file is found, the health check returns a status of unhealthy, causing NGINX Plus to stop sending requests to the service instance. Once the request has been completed, the file is removed and the health check will show as healthy again.
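The semaphore-file idea can be sketched in a few lines of Python. This is only an illustration of the mechanism described above, not the code from testcnt.py; the function names and the temporary-directory path are assumptions made for the sketch (the post's application uses /tmp/busy).

```python
import os
import tempfile

# Hypothetical path for this sketch; the post's application uses /tmp/busy.
BUSY_FILE = os.path.join(tempfile.gettempdir(), "busy_sketch")


def mark_busy(path=BUSY_FILE):
    """Create the semaphore file when request processing starts."""
    with open(path, "w"):
        pass


def mark_idle(path=BUSY_FILE):
    """Remove the semaphore file when request processing ends."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass  # already idle


def is_healthy(path=BUSY_FILE):
    """Health check: healthy only while no semaphore file exists."""
    return not os.path.exists(path)
```

An application would call mark_busy() on entry and mark_idle() in a finally clause, so the file is removed even if request processing raises an exception.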

For the example, a single Python program, testcnt.py, serves as both the application and the health check; the URI determines which function is executed.

The shortest interval between health checks is one second, so it may take up to one second for NGINX Plus to see that the service instance is busy. During that one second, NGINX Plus may send another request to the service instance. To handle this case, the application will return an HTTP status code of 503 when it receives a request while already busy processing another request. If this happens, NGINX Plus tries another upstream server.
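One way to picture the "already busy" case is a handler that tries to claim the semaphore file and answers 503 if it cannot. The sketch below is an assumption about how this could be done, not the post's implementation: it uses an exclusive file create (O_CREAT | O_EXCL) so that two concurrent requests cannot both claim the file, and the names try_claim and handle_request are hypothetical.

```python
import os
import tempfile

# Hypothetical path for this sketch; the post's application uses /tmp/busy.
BUSY_FILE = os.path.join(tempfile.gettempdir(), "busy_503_sketch")


def try_claim(path=BUSY_FILE):
    """Atomically create the semaphore file; return False if it already exists."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def handle_request(work, path=BUSY_FILE):
    """Return (status_code, body); 503 if another request holds the semaphore."""
    if not try_claim(path):
        return 503, "busy"
    try:
        return 200, work()
    finally:
        os.remove(path)  # release the semaphore even if work() raises
```

With proxy_next_upstream http_503 in the NGINX Plus configuration, a 503 from one instance causes the request to be retried on another upstream server.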

CPU-Based

The Docker API can be used to get CPU usage metrics for a container, but the metrics returned are relative to the Docker host. In other words, if the Docker API reports that a container’s CPU usage is 25%, that is 25% of the Docker host’s CPU.

For this example, we set a threshold of 70% for all the containers for this application combined, and divide that by the number of containers to get the threshold per container. For example, a single container can use up to 70% of the Docker host’s CPU. If there are two containers, each can use up to 35% of the Docker host’s CPU.
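The per-container arithmetic is simple enough to show directly. The function name below is hypothetical; only the 70% figure and the division by the container count come from the description above.

```python
def per_container_threshold(total_threshold_pct, container_count):
    """Split an application-wide CPU threshold evenly across its containers."""
    if container_count < 1:
        raise ValueError("container_count must be at least 1")
    return total_threshold_pct / container_count


print(per_container_threshold(70, 1))  # one container may use the full 70%
print(per_container_threshold(70, 2))  # two containers get 35% each
```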

The NGINX Plus Status API is used to get the number of containers for the application.
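As a rough sketch of that lookup, the snippet below counts the servers listed for one upstream group. It assumes the upstream object exposes a "peers" array with one entry per server, and the sample payload is invented for illustration (real responses carry many more fields). Whether the post's health check counts unhealthy peers is not stated, so this sketch simply counts all of them.

```python
import json

# Hypothetical, trimmed-down sample of an upstream object; addresses are made up.
SAMPLE_UPSTREAM = """
{"zone": "unitcpu",
 "peers": [
   {"id": 0, "server": "10.0.0.2:8080", "state": "up"},
   {"id": 1, "server": "10.0.0.3:8080", "state": "unhealthy"}
 ]}
"""


def count_containers(upstream_json):
    """Return the number of peers (containers) listed for the upstream group."""
    upstream = json.loads(upstream_json)
    return len(upstream.get("peers", []))


print(count_containers(SAMPLE_UPSTREAM))
```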

There are two PHP programs: testcpu.php, which is the application that generates CPU load, and hcheck.php, which does the health check.

To get statistics for a container, the health check page makes the following call to the Docker API on the Docker host:

http://<Docker Host IP Address>:<Docker API Port>/containers/<Container ID>/stats?stream=0
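In Python, the same call could be sketched as follows. The helper names are hypothetical, and the host, port, and container ID used in the example are placeholder values, not values from the post.

```python
import json
from urllib.request import urlopen


def stats_url(host, port, container_id):
    """Build the one-shot (non-streaming) stats URL for a container."""
    return f"http://{host}:{port}/containers/{container_id}/stats?stream=0"


def fetch_stats(host, port, container_id, timeout=5):
    """Fetch and decode one stats sample; requires a reachable Docker API."""
    with urlopen(stats_url(host, port, container_id), timeout=timeout) as resp:
        return json.load(resp)


# Placeholder values for illustration only.
print(stats_url("192.0.2.10", 2375, "abc123"))
```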

To calculate the CPU usage, two calls must be made to the API, one second apart in this case. The difference between the cpu_stats.cpu_usage.total_usage values from the two calls is used to calculate the CPU usage.
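The post does not spell out the exact formula, so the following is a sketch under stated assumptions: total_usage is treated as cumulative CPU time in nanoseconds, and the result is normalized by a host_cpus parameter (an assumption of this sketch) so that 100% means the whole Docker host.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def cpu_percent(first, second, interval_s=1.0, host_cpus=1):
    """Estimate CPU usage (% of the host) from two stats samples.

    first/second are decoded stats responses taken interval_s seconds apart.
    host_cpus is an assumption of this sketch, not a field from the post.
    """
    used_ns = (second["cpu_stats"]["cpu_usage"]["total_usage"]
               - first["cpu_stats"]["cpu_usage"]["total_usage"])
    available_ns = interval_s * NANOSECONDS_PER_SECOND * host_cpus
    return 100.0 * used_ns / available_ns


# Two made-up samples, one second apart: 0.25 s of CPU time was consumed.
sample_a = {"cpu_stats": {"cpu_usage": {"total_usage": 1_000_000_000}}}
sample_b = {"cpu_stats": {"cpu_usage": {"total_usage": 1_250_000_000}}}
print(cpu_percent(sample_a, sample_b))
```

The health check would then compare this value with the per-container threshold and report unhealthy when it is exceeded.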

Memory-Usage-Based

As in the CPU-based example, the Docker API is used to retrieve the memory usage metrics. Each container is limited to 128 megabytes of memory, and the memory usage metrics are relative to this limit.

There are two PHP programs: testmem.php, which consumes memory, and hcheck.php, which does the health check. If the memory usage is above 70%, the health check returns a status of unhealthy.

The health check makes the same Docker API call as shown for the CPU usage method, but to get the memory usage it uses the fields memory_stats.usage and memory_stats.stats.hierarchical_memory_limit. It calculates the memory utilization percentage as memory_stats.usage/memory_stats.stats.hierarchical_memory_limit.
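A minimal Python sketch of this calculation follows (the post's actual health check is written in PHP). The "FAIL" value and the MemUsedPct field are hypothetical; only the {"HealthCheck":"OK" prefix is taken from the NGINX Plus match block. The JSON is serialized without spaces because that match is a regular expression on the literal text.

```python
import json


def memory_percent(stats):
    """Memory utilization as a percentage of the container's limit."""
    mem = stats["memory_stats"]
    return 100.0 * mem["usage"] / mem["stats"]["hierarchical_memory_limit"]


def health_body(stats, threshold_pct=70.0):
    """Build the health check body; unhealthy when usage is above the threshold."""
    pct = memory_percent(stats)
    status = "OK" if pct <= threshold_pct else "FAIL"  # "FAIL" is hypothetical
    # Compact separators keep the body matching '{"HealthCheck":"OK"' exactly.
    return json.dumps({"HealthCheck": status, "MemUsedPct": round(pct, 1)},
                      separators=(",", ":"))


# Made-up sample: 64 MiB used out of a 128 MiB limit.
sample = {"memory_stats": {"usage": 64 * 1024 * 1024,
                           "stats": {"hierarchical_memory_limit": 128 * 1024 * 1024}}}
print(health_body(sample))
```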

NGINX Configuration

There are no changes required in the main NGINX configuration file (/etc/nginx/nginx.conf). If you want to see detailed messages in the error log for health checks, you should set the log level to info. For example:

error_log  /var/log/nginx/error.log info;

The following is the specific NGINX Plus configuration for the example applications. Please consider the following in reading and, potentially, reusing this configuration:

The application configuration (/etc/nginx/conf.d/backend.conf):

# Configure DNS.  Point to Consul
resolver consul:53 valid=2s;
resolver_timeout 2s;

# The upstreams will be populated via DNS
upstream unitcnt {
    zone unitcnt 64k;
    server service.consul service=unitcnt resolve;
}

upstream unitcpu {
    zone unitcpu 64k;
    server service.consul service=unitcpu resolve;
}

upstream unitmem {
    zone unitmem 64k;
    server service.consul service=unitmem resolve;
}

# All successful health checks will have a string starting with {"HealthCheck":"OK"
match server_ok {
    status 200;
    body ~ '{"HealthCheck":"OK"';
}

server {
    # Allows calling upstream health checks directly
    listen 80;
    location /healthcheck {
        proxy_pass http://$arg_server/hcheck.php;
    }
    location /healthcheckpy {
        proxy_pass http://$arg_server/testcnt.py?healthcheck;
    }
}

server {
    listen 8001;
    status_zone unitcnt;
    root /usr/share/nginx/html;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    location ~ \.py$ {
        proxy_set_header Host $http_host;
        proxy_pass http://unitcnt;
        proxy_intercept_errors on;
        # If a server is busy (503), try the next upstream server
        proxy_next_upstream http_503;
        # If all the servers are busy return apibusy.html
        error_page 502 503 =503 /apibusy.html;
        health_check uri=/testcnt.py?healthcheck match=server_ok interval=1s;
    }
}

server {
    listen 8002;
    status_zone unitcpu;
    root /usr/share/nginx/html;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    location ~ \.php$ {
        proxy_set_header Host $http_host;
        proxy_pass http://unitcpu;
        error_page 502 =503 /apibusy.html;
        health_check uri=/hcheck.php match=server_ok interval=5s;
    }
}

server {
    listen 8003;
    status_zone unitmem;
    root /usr/share/nginx/html;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    location ~ \.php$ {
        proxy_set_header Host $http_host;
        proxy_pass http://unitmem;
        error_page 502 =503 /apibusy.html;
        health_check uri=/hcheck.php match=server_ok interval=3s;
    }
}

# Configure the status API and dashboard
server {
    listen 8082;

    root /usr/share/nginx/html;

    location = /dashboard.html {
    }

    location = / {
        return 302 /dashboard.html;
    }

    location /api {
        access_log off;
        api;
    }
}

Conclusion

NGINX Plus active health checks are an easy way of dealing with capacity limitations of services running in Docker, helping to make sure that service instances aren’t overloaded.

Get an NGINX Plus free trial and download the Unit beta and give it a try! All the code for the examples is available here.

The post More Fun with NGINX Plus Health Checks and Docker Containers appeared first on NGINX.

Source: More Fun with NGINX Plus Health Checks and Docker Containers
