Skip to content

Fix GH-11498: SIGCHLD is not always returned from proc_open #11509

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

nielsdos
Copy link
Member

Linux, and maybe other unixes, may merge multiple standard signals into a single one. This causes issues when keeping track of process IDs. Solve this by manually checking which children are dead using waitpid().

Test case is based on taka-oyama's test code.

@nielsdos nielsdos linked an issue Jun 22, 2023 that may be closed by this pull request
Linux, and maybe other unixes, may merge multiple standard signals into
a single one. This causes issues when keeping track of process IDs.
Solve this by manually checking which children are dead using waitpid().

Test case is based on taka-oyama's test code.
Copy link
Member

@iluuu1994 iluuu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks correct to me, nice work!

while (true) {
do {
errno = 0;
pid = waitpid(WAIT_ANY, &status, WNOHANG);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linux docs say that waitpid with WNOHANG can't EINTR, but then again https://www.gnu.org/software/libc/manual/html_node/Merged-Signals.html looks like the code this is proposing, so it's probably better to keep it. Also not sure about other Unix systems.

Btw, might be good to link the above somewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About Linux docs: yes, but POSIX doesn't specify that so we should indeed keep it :) https://pubs.opengroup.org/onlinepubs/9699919799/functions/waitpid.html
And yes, forgot to link that, my mistake.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll actually add that as a note too; otherwise we risk someone in the future going like "well this cannot happen, yeet"

@nielsdos nielsdos closed this in f39b513 Jun 23, 2023
nielsdos added a commit to nielsdos/php-src that referenced this pull request Jun 24, 2023
I forgot to add this in phpGH-11509.
@nielsdos nielsdos mentioned this pull request Jun 24, 2023
nielsdos added a commit that referenced this pull request Jun 25, 2023
I forgot to add this in GH-11509.

Closes GH-11526.
@andypost
Copy link
Contributor

andypost commented Jul 4, 2023

FYI building on musl the WAIT_ANY is missing in musl so I patched it with -1

@iluuu1994
Copy link
Member

@andypost Oh, thanks! We should have a musl build, but AFAIK Nikita tried and it was a pain to set up. @nielsdos Maybe we should just -1? The Linux docs also make no mention of WAIT_ANY, so I'm not sure how reliable this macro/constant is.

@nielsdos
Copy link
Member Author

nielsdos commented Jul 4, 2023

@andypost Oh, thanks! We should have a musl build, but AFAIK Nikita tried and it was a pain to set up. @nielsdos Maybe we should just -1? The Linux docs also make no mention of WAIT_ANY, so I'm not sure how reliable this macro/constant is.

Yes, let's go with -1

@iluuu1994
Copy link
Member

Thanks @nielsdos! I'll adjust.

@nielsdos
Copy link
Member Author

nielsdos commented Jul 4, 2023

Thanks @nielsdos! I'll adjust.

Thx

@andypost
Copy link
Contributor

andypost commented Jul 4, 2023

@iluuu1994 the issue with musl is that Mostly all distros based on it has rolling release cycle, so it will need to adjust test few times a year. Please point me what I can use as example, so I willing to give a try to add it using Alpinelinux stable release

@iluuu1994
Copy link
Member

@andypost That would be appreciated!
Nikitas first attempt: #5933
More attempts: #8621

Currently, the only Linux image provided by GitHub actions is Ubuntu, so we'll need to use Docker.

@andypost
Copy link
Contributor

andypost commented Jul 4, 2023

Thank you! All Alpinelinux infra using docker, except builders so there's preexisting images for sure but devil in nitpicks

@dicode-nl
Copy link

dicode-nl commented Aug 2, 2023

@iluuu1994 this commit changes the way how SIGCHLD is handled in PHP. Previously, upon receiving a SIGCHLD you could do a pcntl_wait loop to check for all terminated childs. This commit makes this impossible. Is this intended behavior as this is not BC.

For example, this no longer works:

<?php

pcntl_async_signals(true);
$childs = [];

pcntl_signal(SIGCHLD, function( $signo, $siginfo ) use (&$childs) {
	if ($signo == SIGCHLD ) {
		$pid = pcntl_wait($status);
		while( $pid > 0 ) {
			unset($childs[$pid]);
			$pid = pcntl_wait($status, WNOHANG);
		}
	}
});

if ( $pid = pcntl_fork() ) {
	$childs[$pid] = 1;
}
else {
	print "CHILD:SLEEP\n";
	sleep(rand(1,5));
	print "CHILD:EXIT\n";
	exit;
}

while(count($childs)) {
	print "MAIN:WAITING:" . count($childs) . "\n";
	sleep(300);
}

print "MAIN:FINISHED\n";

@iluuu1994
Copy link
Member

@dicode-nl Hmm. I did not consider this. The BC break was definitely not intended. I think PHP automatically splitting SIGCHLD is nice and results in less code, but if there's a BC break we might want to revert and postpone the commit to master. @nielsdos WDYT?

@nielsdos
Copy link
Member Author

nielsdos commented Aug 2, 2023

@dicode-nl Hmm. I did not consider this. The BC break was definitely not intended. I think PHP automatically splitting SIGCHLD is nice and results in less code, but if there's a BC break we might want to revert and postpone the commit to master. @nielsdos WDYT?

I didn't consider this either.
It's nice cleaning up the messy details of signals. But it also depends on what PCNTL is supposed to model: is it supposed to closely model whatever the OS gives us, and nothing more than that? In that case, we should revert the behaviour to what it was.
Also something I'm not sure about is how other OS's handle SIGCHILD merging. Some other OS's might not do this but Linux does, so if PHP doesn't account for the merging, and places the burden on its users, then handling SIGCHILD becomes more surprising & less portable for usercode...
I'm not sure what's best tbh.

@dicode-nl
Copy link

@iluuu1994 I'm not sure either which way to go. At least it should be documented. For me it feels more natural to call waitpid and others to get the child status, but this commit makes that virtually impossible.

@iluuu1994
Copy link
Member

iluuu1994 commented Aug 2, 2023

@nielsdos Good points. It seems like most functions in pcntl are just thin wrappers. Since the issue can be solved in PHP as demonstrated by @dicode-nl, it might be better to revert the change.

@nielsdos
Copy link
Member Author

nielsdos commented Aug 2, 2023

@nielsdos Good points. It seems like most functions in pcntl are just thin wrappers. Since the issue can be solved in PHP as demonstrated by @dicode-nl, it might be better to revert the change.

Ok for me

@nielsdos
Copy link
Member Author

nielsdos commented Aug 2, 2023

I've file the revert PR here: #11863

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SIGCHLD is not always returned from proc_open
4 participants