php-fpm: graceful restart without blocking/losing requests #3758

m6w6 · 2019-01-22T13:01:00Z

Issue:

When php-fpm gets "graceful" reloaded it will stop accepting requests
while finishing reqests it already received. If those requests to finish
are of a long running kind like big uploads over a slow connection, this
will block new requests to be handled for an unacceptable long time.

Current behavior

|
v  FPM (OLD) running
|
v  $ sudo kill -USR2 $(cat /run/php-fpm/php-fpm.pid)
|
v  OLD sends SIGQUIT to children
v  OLD stops handling requests
|
v  OLD keeps running until all children complete or time out
|
v  OLD exec()s
v  FPM (NEW) starts & re-uses listening socket
|
v  NEW handles incoming requests from now on

Proposed Solution:

In the case of a reloading request, we immediately fork and exec a new
fpm instance to seamlessly handle new requests, while the old instance
keeps finishing those requests already accepted.

New behavior

|
v  FPM (OLD) running
|
v  $ sudo kill -USR2 $(cat /run/php-fpm/php-fpm.pid)
|
v  OLD sends SIGQUIT to children
v  OLD fork()s & exec()s
|\
| \
|  v  FPM (NEW) starts & re-uses listening socket
|  v  NEW handles incoming requests from now on
|  |
v  |  OLD keeps running until all children complete or time out
v  |  OLD gone
-  |
  /
 /
v  NEW still going
|

We've been running this patch since PHP 5.6 (early 2017).

m6w6 · 2019-01-23T12:44:22Z

I also noticed we're shipping an unused php-fpm.service template.

Maybe we should add && wait $MAINPID to the reload command?

/cc @remicollet

bukka

I think it makes sense for the mentioned case when there is a long running process that blocks the restart. I guess it might make the actual init scripts a bit more complex. The first thing that came to my mind is the collecting of logs. It might require some extra logic to collect it from both main processes. I'm not really sure how it's usually setup in distros so it would be really good to get some feedback from FPM distro maintainers.

krakjoe · 2019-01-31T05:43:50Z

This is really nice ...

More noise @remicollet, input please ?

frederikbosch · 2019-09-30T20:58:16Z

Isn't this basically a fix for #60691?

aboks · 2019-10-14T08:59:34Z

@bukka @krakjoe Is there anything I could do to help moving this fix forward?

bukka · 2020-06-09T21:59:13Z

I think this would be a nice addition for PHP 8 and we should try to get it in. I'm not sure we will hear from anyone in terms of init scripts so we shouldn't block it forever IMO.

@m6w6 if it's not too much hassle for you and would you be able to rebase it, that would be great! Also question if you have been using this with systemd?

cmb69 · 2020-07-15T15:12:16Z

The test failures appear to be legit.

m6w6 · 2020-07-16T06:21:10Z

Doh! I'll have a look, thanks.

## Issue: When php-fpm gets "graceful" reloaded it will stop accepting requests while finishing reqests it already received. If those requests to finish are of a long running kind like big uploads over a slow connection, this will block new requests to be handled for an unacceptable long time. ### Current behavior ``` | v FPM (OLD) running | v $ sudo kill -USR2 $(cat /run/php-fpm/php-fpm.pid) | v OLD sends SIGQUIT to children v OLD stops handling requests | v OLD keeps running until all children complete or time out | v OLD exec()s v FPM (NEW) starts & re-uses listening socket | v NEW handles incoming requests from now on ``` ## Proposed Solution: In the case of a reloading request, we immediately fork and exec a new fpm instance to seamlessly handle new requests, while the old instance keeps finishing those requests already accepted. ### New behavior ``` | v FPM (OLD) running | v $ sudo kill -USR2 $(cat /run/php-fpm/php-fpm.pid) | v OLD sends SIGQUIT to children v OLD fork()s & exec()s |\ | \ | v FPM (NEW) starts & re-uses listening socket | v NEW handles incoming requests from now on | | v | OLD keeps running until all children complete or time out v | OLD gone - | / / v NEW still going | ``` We've been running this patch since PHP 5.6 (early 2017).

m6w6 · 2020-07-16T09:38:14Z

I tried to fix the tests, but the output of FPM reloading is non-deterministic, because it depends on the brightness of the moon whether the old parent process first says bye-bye or the new one logs that it reuses the old socket.

On another note, FPM is run by proc_open which is of no use when testing reloading because of fork&exec. Observing output through the inherited file descriptors works, though, but process control won't.

cmb69 · 2020-07-16T09:45:39Z

If solid tests are not possible, I think it's better to remove them (or to skip them).

bukka · 2020-07-19T20:18:15Z

We could modify the tester to assert between messages from a set. It means we would pass 2 sets of messages and check input line matches one of them and then move a pointer in the set to the next expected message. Or maybe just check between all messages in the set so the order doesn't matter. That should be doable. I'd prefer not to delete the reloading tests as there has been quite an effort to make them right.

This issue actually reminded me the problem with logging when catch_workers_output = yes and error_log = /dev/stdout are set. Currently we have got the master process making sure that the messages are not interleaved. With this change, we won't be able to assure that because new and old master can be running in parallel and both write to stdout which means that the message can be interleaved. I think that fixing it would require some kind of extra locking which might add up to complexity quite a bit. Also I guess that the above setup is mainly for FPM running in docker where the reload is not usually done so might not be that big issue in reality.

However considering this and that we are also not sure about what different init systems will do and if it actually works fine and that there might be possibly some other edge cases I'd prefer to make this optional and non default. We could introduce a global config option that would set a new global variable which would be then use to decide what kind of reloading should be done. I think this can be possibly less work than trying to fix the logging issue and making all the reloading tests work (those could be done later possibly). It would be still good to have at least one test for the new reloading but that could be just something that it actually reloads.

m6w6 · 2020-08-06T12:30:27Z

Okay. I'll close this PR, let me explain why.

I think it is a hack. A hack based on another bad hack, which makes it just a worse hack.

The first hack is selling a graceful shutdown+reexec as a graceful reload - as it is now - potentially causing downtime instead of cut-off requests. But just cramming a fork in before the reexec causes confusion and init to loose track or even kill the new instance at last when the previous FPM exits.

After updating this PR the last time, I tried to make the graceful reloading FPM play nice with f.e. systemd, but after days trying to produce something simpler than the 300+ LOC shell shim of our ops team sitting between init and the graceful php-fpm, I ended up with something maybe even worse than that.

A few tests suggested that it works as supposed, but this is unacceptable in a general manner.
To make this PR work in a reliable way, FPM would have to change its current architecture where the master process is supervised by another process, a FPMM (haha). :lolsob:

langemeijer · 2021-07-02T08:13:01Z

I think I follow the reasoning why this didn't make it. So the current state is that graceful reload is not going to happen.

I think https://bugs.php.net/bug.php?id=60961 should be closed.
https://www.php.net/manual/en/install.fpm.php now states

These features include:

advanced process management with graceful stop/start;

But that is clearly not the case, I think this should be changed.

But then I have the following idea for a solution, maybe someone here can say if this is a feasible solution:

I would like to see php-fpm changed that on receiving a SIGQUIT on the main process we do the normal shutdown sequence, except that we close the listening sockets first. This way, systemd (or any other init system) can start a new php-fpm instance, not related to the php-fpm instance before.

The current situation is that php-fpm still has the listening sockets connected, but is never answering to new requests anyway. Holding the listening ports hostage is not functional, as soon as the the sockets can be closed the better.

With this implemented then at least graceful stop is working. In my opinion SIGQUIT should do graceful and SIGKILL should be used when all threads need to be closed, but obviously the graceful stop could also be implemented on any other signal.

Obviously, this still is not ideal. On sending a signal and starting a new php-fpm process, the listening ports could still be blocked, because the signal is not processed fully yet. starting php-fpm would fail and systemd should retry. Also this would leave a small 'gap' where for some time no socket is listening.
Still I think this would be better that what we have now.

petk added the Feature label Jan 22, 2019

nikic requested a review from bukka January 23, 2019 09:45

bukka self-assigned this Jan 23, 2019

bukka reviewed Jan 23, 2019

View reviewed changes

remicollet mentioned this pull request Jun 27, 2019

SIGSEGV on Fedora 30 / PHP7.3.6 with Phalcon and Opcache enabled remicollet/remirepo#121

Closed

bukka added the SAPI: fpm label Jun 9, 2020

m6w6 force-pushed the fpm-graceful branch from 4d32bf4 to 64e8759 Compare June 10, 2020 11:15

cmb69 added the Waiting on Author label Jul 15, 2020

m6w6 force-pushed the fpm-graceful branch from 5c68824 to 7c78375 Compare July 16, 2020 08:30

m6w6 force-pushed the fpm-graceful branch from 7c78375 to 372703c Compare July 16, 2020 08:38

m6w6 closed this Aug 6, 2020

bukka mentioned this pull request Jul 17, 2023

FPM pool manager #11723

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

php-fpm: graceful restart without blocking/losing requests #3758

php-fpm: graceful restart without blocking/losing requests #3758

m6w6 commented Jan 22, 2019

m6w6 commented Jan 23, 2019

bukka left a comment •

edited

Loading

krakjoe commented Jan 31, 2019

frederikbosch commented Sep 30, 2019

aboks commented Oct 14, 2019

bukka commented Jun 9, 2020 •

edited

Loading

cmb69 commented Jul 15, 2020

m6w6 commented Jul 16, 2020

m6w6 commented Jul 16, 2020

cmb69 commented Jul 16, 2020

bukka commented Jul 19, 2020

m6w6 commented Aug 6, 2020

langemeijer commented Jul 2, 2021

php-fpm: graceful restart without blocking/losing requests #3758

php-fpm: graceful restart without blocking/losing requests #3758

Conversation

m6w6 commented Jan 22, 2019

Issue:

Current behavior

Proposed Solution:

New behavior

m6w6 commented Jan 23, 2019

bukka left a comment • edited Loading

Choose a reason for hiding this comment

krakjoe commented Jan 31, 2019

frederikbosch commented Sep 30, 2019

aboks commented Oct 14, 2019

bukka commented Jun 9, 2020 • edited Loading

cmb69 commented Jul 15, 2020

m6w6 commented Jul 16, 2020

m6w6 commented Jul 16, 2020

cmb69 commented Jul 16, 2020

bukka commented Jul 19, 2020

m6w6 commented Aug 6, 2020

langemeijer commented Jul 2, 2021

bukka left a comment •

edited

Loading

bukka commented Jun 9, 2020 •

edited

Loading