-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Fix ZTS crashes with persistent resources in modules #13381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
On shutdown in ZTS the following happens: - https://github.com/php/php-src/blob/master/Zend/zend.c#L1124-L1125 gets executed. This destroys global persistent resources and destroys the modules. Furthermore, the modules are unloaded too. - Further down, `ts_free_id(executor_globals_id)` gets executed, which calls `executor_globals_dtor`. This function destroys persistent resources for each thread. Notice that in the last step, the modules that the persistent resource belong to may already have been destroyed. This means that accessing globals will cause a crash (I previously fixed this with ifdef magic), or when the module is dynamically loaded we'll try jumping to a destructor that is no longer loaded in memory. These scenarios cause crashes. It's not possible to move the `ts_free_id` call upwards, because that may break assumptions of callers, and furthermore this would deallocate the executor globals structure, which means that any access to those will cause a segfault. This patch adds a new API to the TSRM that allows running a callback on a certain resource type. We use this API to destroy the persistent resources in all threads prior to the module destruction, and keep the rest of the resource dtor intact. I verified this fix on Apache with postgres, both dynamically and statically. Fixes phpGH-12974.
Zend/zend.c
Outdated
zend_destroy_rsrc_list(&EG(persistent_list)); | ||
#ifdef ZTS | ||
ts_callback_id(executor_globals_id, executor_globals_persistent_list_dtor); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably may be changed into (I'm not completely sure)
#ifndef ZTS
zend_destroy_rsrc_list(&EG(persistent_list));
#else
ts_callback_id(executor_globals_id, executor_globals_persistent_list_dtor);
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just quickly tested this by printing out all the lists that get destroyed, and I see that it now misses the global_persistent_list
whereas it previously was destroyed by zend_destroy_rsrc_list(&EG(persistent_list));
. So this seems wrong.
TSRM/TSRM.h
Outdated
/* Runs a callback on all resources of the given id. | ||
* The caller is responsible for ensuring the underlying resources don't data-race. */ | ||
TSRM_API void ts_callback_id(ts_rsrc_id id, void (*cb)(void *)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would propose to think about a better name. e.g. ts_apply_for_id()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general this looks right.
I thought, that extensions might be responsible to destroy their resources in all threads before unloading (through GSHUTDOWN, but this may be to later). Anyway, this would only "fix" extensions loaded by dl()
, and this is already not compatible with ZTS.
Please, think in this direction a bit.
In any case the patch looks good enough.
I had thought of destroying persistent resources via module shutdown code, but it seems modules currently do not do that. |
For master (8.4-dev) I merged phpGH-13381. But that PR changes public API of TSRM, so cannot be used on lower branches. This patch is a safe workaround for the issue, in combination with a pre-existing fix using `ifdef ZTS + if (module_started)` inside pgsql and odbc. The idea is to delay unloading modules until the persistent resources are destroyed. This will keep the destructor code accessible in memory. This is not a proper fix on its own, because we still need the workaround of not accessing globals after module destruction. The proper fix is in master.
For master (8.4-dev) I merged GH-13381. But that PR changes public API of TSRM, so cannot be used on lower branches. This patch is a safe workaround for the issue, in combination with a pre-existing fix using `ifdef ZTS + if (module_started)` inside pgsql and odbc. The idea is to delay unloading modules until the persistent resources are destroyed. This will keep the destructor code accessible in memory. This is not a proper fix on its own, because we still need the workaround of not accessing globals after module destruction. The proper fix is in master. Closes GH-13388.
On shutdown in ZTS the following happens:
ts_free_id(executor_globals_id)
gets executed, which callsexecutor_globals_dtor
. This function destroys persistent resources for each thread.Notice that in the last step, the modules that the persistent resource belong to may already have been destroyed. This means that accessing globals will cause a crash (I previously fixed that for statically loaded modules with ifdef magic), or when the module is dynamically loaded we'll try jumping to a destructor that is no longer loaded in memory. These scenarios cause crashes.
It's not possible to move the
ts_free_id
call upwards, because that may break assumptions of callers, and furthermore this would deallocate the executor globals structure, which means that any access to those will cause a segfault.This patch adds a new API to the TSRM that allows running a callback on a certain resource type. We use this API to destroy the persistent resources in all threads prior to the module destruction, and keep the rest of the executor dtor intact.
I verified this fix on Apache with postgres, tested both with dynamically and statically loaded modules.
I tried to think of other ideas, but I couldn't find other simple ones that won't break BC.
Important: This patch is only for master because it changes public API and might be slightly dangerous.
For stable branches I propose https://gist.github.com/nielsdos/338774e915a1b528eaa8bd68ba7cc54c which is safer and does not change public API.
Fixes GH-12974. Although the issue only mentions postgres, other modules are affected too.