How to solve `nginx: worker process is shutting down`?

1. Purpose

In this post, I will demonstrate how to deal with the following message when running nginx on a Linux server:

nginx: worker process is shutting down

2. The problem and solution

2.1 What is nginx?

NGINX (pronounced "engine x") is open source software for web serving, reverse proxying, caching, load balancing, media streaming, and email proxying. It was originally created by Igor Sysoev as an answer to the C10k problem, the challenge of handling 10,000 concurrent user connections, and it started out as a web server designed for maximum performance and stability.

Its architecture is asynchronous and event-driven, which enables it to process many requests at the same time. NGINX is highly scalable as well, meaning that it keeps up as its clients' traffic grows.

2.2 What is process architecture in nginx?

NGINX has one master process and one or more worker processes. If caching is enabled, the cache loader and cache manager processes also run at startup.

The main purpose of the master process is to read and evaluate configuration files, as well as maintain the worker processes.

The worker processes do the actual processing of requests. NGINX relies on OS-dependent mechanisms to efficiently distribute requests among worker processes. The number of worker processes is defined by the worker_processes directive in the nginx.conf configuration file and can either be set to a fixed number or configured to adjust automatically to the number of available CPU cores.

A worker process is a single-threaded process. If Nginx is doing CPU-intensive work such as SSL or gzipping and you have 2 or more CPUs/cores, then you may set worker_processes to be equal to the number of CPUs or cores.

NGINX can run multiple worker processes, each capable of processing a large number of simultaneous connections. The worker_processes directive controls how many worker processes are started (the default is 1).

2.3 The worker process in nginx

NGINX uses a predictable process model that is tuned to the available hardware resources:

  • The master process performs the privileged operations such as reading configuration and binding to ports, and then creates a small number of child processes (the next three types).
  • The cache loader process runs at startup to load the disk‑based cache into memory, and then exits. It is scheduled conservatively, so its resource demands are low.
  • The cache manager process runs periodically and prunes entries from the disk caches to keep them within the configured sizes.
  • The worker processes do all of the work! They handle network connections, read and write content to disk, and communicate with upstream servers.

The NGINX configuration recommended in most cases – running one worker process per CPU core – makes the most efficient use of hardware resources. You configure it by setting the auto parameter on the worker_processes directive:

worker_processes auto;
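To check that the number of running workers matches what you expect, you can count them in a process listing. The following is only a minimal sketch: the function name count_active_workers is made up for this post, and the sample listing is illustrative.

```shell
#!/bin/sh
# Count active nginx workers in `ps`-style text read from stdin.
# Workers that are draining ("is shutting down") are not counted.
# The [n] bracket keeps this awk command from matching its own
# command line when it is fed live `ps` output.
count_active_workers() {
    awk '/[n]ginx: worker process/ && !/is shutting down/ { n++ } END { print n + 0 }'
}

# Illustrative sample: one master, one draining worker, two active workers.
count_active_workers <<'EOF'
root     10966     1  0 02:10 ?  00:00:00 nginx: master process /usr/sbin/nginx
nginx    10968 10966  0 02:10 ?  00:00:19 nginx: worker process is shutting down
nginx    12026 10966  1 02:35 ?  00:05:50 nginx: worker process
nginx    12027 10966  1 02:35 ?  00:05:48 nginx: worker process
EOF
# prints 2
```

On a live Linux host you could run `ps -ef | count_active_workers` and compare the result with `getconf _NPROCESSORS_ONLN`. With worker_processes auto the two numbers would normally be equal, although this is an assumption that may not hold in containers or on hosts with restricted CPU sets.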

When an NGINX server is active, only the worker processes are busy. Each worker process handles multiple connections in a nonblocking fashion, reducing the number of context switches.

Each worker process is single‑threaded and runs independently, grabbing new connections and processing them. The processes can communicate using shared memory for shared cache data, session persistence data, and other shared resources.

2.4 The problem

One day, when I checked the nginx processes on my server, I saw this:

# ps -ef | grep nginx
root     10966     1  0 02:10 ?        00:00:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx    10968 10966  0 02:10 ?        00:00:19 nginx: worker process is shutting down
nginx    12026 10966  1 02:35 ?        00:05:50 nginx: worker process

I was startled when I saw this:

worker process is shutting down

Is my nginx server running fine? Does it have problems?

2.5 The solution

Let’s debug the problem. First, we use pstack to print the stack trace of the worker process that is shutting down (10968 is the PID of that worker in the ps output above):

# pstack 10968
#0  0x00000030398e8ed3 in __epoll_wait_nocancel () from /lib64/libc.so.6
#1  0x00000000004376f5 in ?? ()
#2  0x000000000042d885 in ngx_process_events_and_timers ()
#3  0x0000000000435235 in ?? ()
#4  0x0000000000433684 in ngx_spawn_process ()
#5  0x00000000004353cc in ?? ()
#6  0x0000000000435c03 in ngx_master_process_cycle ()
#7  0x0000000000414086 in main ()

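This is not a problem at all. The stack trace shows the worker waiting in epoll_wait inside ngx_process_events_and_timers, which means it is sitting in its normal event loop, waiting for activity on the connections it still holds. The worker is in this state because I used service nginx reload rather than a full restart. A reload is a graceful operation that does not break existing connections: the old worker processes that still serve those connections enter the nginx: worker process is shutting down state and exit once those connections are closed. Just wait a while.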
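To see which workers are in this state, you can filter the process listing for the status text. This is a small sketch, not an official tool: the function name draining_worker_pids is made up here, and it assumes `ps -ef` style input where the PID is the second column.

```shell
#!/bin/sh
# Print the PID of every nginx worker that is draining, given
# `ps -ef` style text on stdin (PID is the second column there).
# The [n] bracket keeps this awk command from matching its own
# command line when it is fed live `ps` output.
draining_worker_pids() {
    awk '/[n]ginx: worker process is shutting down/ { print $2 }'
}

# Sample lines in the format of the listing shown earlier in this post.
draining_worker_pids <<'EOF'
root     10966     1  0 02:10 ?        00:00:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx    10968 10966  0 02:10 ?        00:00:19 nginx: worker process is shutting down
nginx    12026 10966  1 02:35 ?        00:05:50 nginx: worker process
EOF
# prints 10968
```

On a live server the equivalent would be `ps -ef | draining_worker_pids`. An empty result means every old worker has already exited.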

In order for nginx to re-read the configuration file, a HUP signal should be sent to the master process. The master process first checks the syntax validity, then tries to apply new configuration, that is, to open log files and new listen sockets. If this fails, it rolls back changes and continues to work with old configuration. If this succeeds, it starts new worker processes, and sends messages to old worker processes requesting them to shut down gracefully. Old worker processes close listen sockets and continue to service old clients. After all clients are serviced, old worker processes are shut down.
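The reload described above can be triggered by hand roughly as follows. This is a hedged sketch: reload_nginx is a name made up for this post, and the default pid file path /run/nginx.pid is an assumption that varies by distribution and build (check the pid directive in your nginx.conf). The nginx -t step is an extra pre-check. The master validates the configuration on HUP anyway.

```shell
#!/bin/sh
# Validate the configuration, then ask the master process to reload it.
# $1: path to the master's pid file (assumed default: /run/nginx.pid).
reload_nginx() {
    pidfile=${1:-/run/nginx.pid}
    # Stop here if the new configuration does not pass the syntax test.
    nginx -t || return 1
    # Read the master PID; fail if the pid file is missing or unreadable.
    master_pid=$(cat "$pidfile") || return 1
    # HUP asks the master to start new workers and gracefully
    # shut down the old ones.
    kill -HUP "$master_pid"
}
```

Calling `reload_nginx` (as root) is roughly what `nginx -s reload` does, since that command also sends the reload signal to the master process. After it returns, old workers may show up as shutting down for a while.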

Let’s illustrate this by example. Imagine that nginx is run on FreeBSD and the command

ps axw -o pid,ppid,user,%cpu,vsz,wchan,command | egrep '(nginx|PID)'

produces the following output:

  PID  PPID USER    %CPU   VSZ WCHAN  COMMAND
33126     1 root     0.0  1148 pause  nginx: master process /usr/local/nginx/sbin/nginx
33127 33126 nobody   0.0  1380 kqread nginx: worker process (nginx)
33128 33126 nobody   0.0  1364 kqread nginx: worker process (nginx)
33129 33126 nobody   0.0  1364 kqread nginx: worker process (nginx)

If HUP is sent to the master process, the output becomes:

  PID  PPID USER    %CPU   VSZ WCHAN  COMMAND
33126     1 root     0.0  1164 pause  nginx: master process /usr/local/nginx/sbin/nginx
33129 33126 nobody   0.0  1380 kqread nginx: worker process is shutting down (nginx)
33134 33126 nobody   0.0  1368 kqread nginx: worker process (nginx)
33135 33126 nobody   0.0  1368 kqread nginx: worker process (nginx)
33136 33126 nobody   0.0  1368 kqread nginx: worker process (nginx)

One of the old worker processes with PID 33129 still continues to work. After some time it exits:

  PID  PPID USER    %CPU   VSZ WCHAN  COMMAND
33126     1 root     0.0  1164 pause  nginx: master process /usr/local/nginx/sbin/nginx
33134 33126 nobody   0.0  1368 kqread nginx: worker process (nginx)
33135 33126 nobody   0.0  1368 kqread nginx: worker process (nginx)
33136 33126 nobody   0.0  1368 kqread nginx: worker process (nginx)
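If you want a script to wait until the old workers are gone, for example before taking further maintenance steps, you can poll the process list. The sketch below is only illustrative: ps_snapshot, count_draining_workers and wait_for_drain are names made up for this post, and `ps -ef` assumes a Linux-style ps (on FreeBSD something like `ps axw` would be needed instead).

```shell
#!/bin/sh
# Hook that prints the current process list. It is kept separate so it
# can be replaced, for example on another OS or in a test.
ps_snapshot() { ps -ef; }

# Count workers whose title says they are shutting down.
# The [n] bracket keeps awk from matching its own command line.
count_draining_workers() {
    awk '/[n]ginx: worker process is shutting down/ { n++ } END { print n + 0 }'
}

# Poll until no worker is draining.
# $1 = maximum number of checks, $2 = seconds to sleep between checks.
# Returns 0 once all old workers have exited, 1 if some are still there.
wait_for_drain() {
    max_checks=${1:-60}
    interval=${2:-5}
    checks=0
    while [ "$checks" -lt "$max_checks" ]; do
        if [ "$(ps_snapshot | count_draining_workers)" -eq 0 ]; then
            return 0
        fi
        sleep "$interval"
        checks=$((checks + 1))
    done
    return 1
}
```

For example, `wait_for_drain 12 5 || echo "old workers still draining"` checks up to 12 times, sleeping 5 seconds after each unsuccessful check. How long draining takes depends on how long the existing client connections stay open.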

3. Summary

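In this post, I showed that the nginx: worker process is shutting down message is not an error: it appears after a reload while old worker processes finish serving their existing connections, and it disappears once they exit. That’s it, thanks for reading.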
