-
Notifications
You must be signed in to change notification settings - Fork 7.9k
GH-16889: stream_select() timeout useless for pipes on Windows #16917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pipes are blocking on Windows, but `php_select()` always returns them as ready for read/write. This renders the `stream_select()` timeout useless, what can cause a following read to block for a very long time. While there is no general fix (and least not within reach for a stable version), we can at least cater to the important case of read pipes by peeking the pipe to check whether data is available. If there is none, we do not add the handle to the read set.
The test assumes that at least the stdin and stdout pipes are always selected as readable, and that the select call will not change their order. We're being more defensive now.
This now waits indefinitely on Windows, what it is supposed to do according to man 2 select, since we're passing PS: ah, because reading files won't block there. PPS: it's not supposed to block indefinitely; we must not call |
* bug60602.phpt has the same issue as bug64770.phpt (copy&paste) * bug49936_win32.phpt should suppress the warnings[1], or be dropped altogether[2] [1] <php@c884d37> [2] <php@2c6b85f>
`FILE_TYPE_PIPE` will also be returned for sockets, but these have been filtered out already.
Given that I had to patch a couple of test cases, it might make sense to only apply this to master. However, in https://bugs.php.net/bug.php?id=51800#1561307300, @nicolas-grekas wrote:
|
@cmb69 I thought about this a bit and I'm not sure if this is an ideal solution. The problem is that for cases when the blocking is just very short (way below the timeout), the read readiness will just disappear which might break some cases that would be otherwise valid. If I understand it correctly, currently it would return immediately withand read would block for very short time. But after this change it will return also immediately but without the handle in the set so users might think it's not ready and just not wait any longer. Am I right? Could it be maybe done is some way of looped checking of the handle. Basically some sort of polling and checking the readiness until the timeout expires? |
Well, that would be a bug in the userland code, but I get your point. However, waiting till the timeout expires appears to be a bad idea, since the timeout is supposed to cater to some worst case scenarios. For instance, it is 60sec by default for run-tests.php. So maybe the best solution to keep compatibility with old code, but still support read pipes handled quickly, would be to change the behavior only if there are nothing but read handles in the set (plus possibly sockets, which should work well on Windows). I.e.: win32/select.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/win32/select.c b/win32/select.c
index 76d5ef1ee5..376aaf6a61 100644
--- a/win32/select.c
+++ b/win32/select.c
@@ -34,6 +34,7 @@ PHPAPI int php_select(php_socket_t max_fd, fd_set *rfds, fd_set *wfds, fd_set *e
HANDLE handles[MAXIMUM_WAIT_OBJECTS];
int handle_slot_to_fd[MAXIMUM_WAIT_OBJECTS];
int n_handles = 0, i;
+ int num_read_pipes = 0;
fd_set sock_read, sock_write, sock_except;
fd_set aread, awrite, aexcept;
int sock_max_fd = -1;
@@ -78,6 +79,9 @@ PHPAPI int php_select(php_socket_t max_fd, fd_set *rfds, fd_set *wfds, fd_set *e
sock_max_fd = i;
}
} else {
+ if (SAFE_FD_ISSET(i, rfds) && GetFileType(handles[n_handles]) == FILE_TYPE_PIPE) {
+ num_read_pipes++;
+ }
handle_slot_to_fd[n_handles] = i;
n_handles++;
}
@@ -136,7 +140,7 @@ PHPAPI int php_select(php_socket_t max_fd, fd_set *rfds, fd_set *wfds, fd_set *e
if (WAIT_OBJECT_0 == WaitForSingleObject(handles[i], 0)) {
if (SAFE_FD_ISSET(handle_slot_to_fd[i], rfds)) {
DWORD avail_read = 0;
- if (GetFileType(handles[i]) != FILE_TYPE_PIPE
+ if (num_read_pipes < n_handles
|| !PeekNamedPipe(handles[i], NULL, 0, NULL, &avail_read, NULL)
|| avail_read > 0
) {
@@ -156,6 +160,9 @@ PHPAPI int php_select(php_socket_t max_fd, fd_set *rfds, fd_set *wfds, fd_set *e
}
}
}
+ if (retcode == 0 && num_read_pipes == n_handles && sock_max_fd < 0) {
+ usleep(100);
+ }
} while (retcode == 0 && (ms_total == INFINITE || GetTickCount64() < limit));
if (rfds) { |
@cmb69 I was looking through the code and I'm not exactly sure what's the actual point to even call WaitForSingleObject on pipe if it returns as ready. Or can it actually return anything else? What I was thinking was to do those short interval waits and peek the pipe until there is at least one ready. This should happen only if pipe handlers is being check. Pretty much what is suggested in your diff except I don't understand why you removed |
The calls to That additional patch first checks whether all handles are pipes ( Note that with or without the additional patch this is merely a work-around. The clean solution would be using asynchronous IO, but that is not even supported for anonymous pipes on Windows … |
Shouldn't the pipes always peek so the rset contains the ones that are ready? Why would it now peek only if all handles are pipe? It makes sense to wait only if all handles are pipes but peek should be done nevertheless IMHO. Unless I'm misunderstanding something. |
Before this PR had been merged, passing anything but sockets (and waitable objects) to
This is a valid concern. A pipe in the rset might not be ready, but pipes in the wset (or eset) would reported as being ready. The additional patch caters to that: it is fully backwards compatible in case pipes (or files etc.) are passed to rset and wset (or eset): Now, we could sleep for some time (how long?) and check whether a pipe in rset is ready; but we cannot know whether a pipe in wset (or eset) is ready, so would need to report these as ready anyway. What would that improve? Nothing. Users might have to wait for a pipe in wset being reported as ready, but writing to this pipe could still block. As I said, there is no general solution regarding portability; but the patch at least fixes an apparently not uncommon case, where only pipes are passed to rset, e.g. used in run-tests.php on stdout and stderr, and I figure that there may be code which spawns several children, and now can properly select on their output. |
So my main worry is about this scenario: $rset = [$rpipe];
$wset = null;
$eset = null;
if (stream_select($rset, $wset, $eset, 2) > 0) {
echo $fread($rpipe, 100);
} And consider that $rpipe is not ready for 100ms. As I understand it, currently it returns immediately and read from $rpipe (fread blocks for 100ms). After your change, it also returns immediately but no fread because it won't get anything from peek. Is that right? If so, the problem is that users expect waiting for max 2s and they might consider no point to call fread because it suggests it already waited 2s so suddenly they will consider this like a breaking change because it will start timing out (even though there is no wait and the time out is immediate but it won't read anything). |
No. Since peek wouldn't tell that data is available, no fd would be selected, so To wit: <?php
$desc = [
["null"],
["pipe", "w"],
["null"]
];
$proc = proc_open("php -r \"sleep(1); echo 'hello';\"", $desc, $pipes);
$rpipe = $pipes[1];
$rset = [$rpipe];
$wset = null;
$eset = null;
if (stream_select($rset, $wset, $eset, 2) > 0) {
echo fread($rpipe, 100);
} outputs after ~1sec:
multiplexing read pipes example<?php
$desc = [
["null"],
["pipe", "w"],
["null"]
];
$read_pipes = [];
for ($i = 0; $i < 10; $i++) {
$procs[] = proc_open(["php", "-r", "sleep(1 * (10 - $i)); echo 'hello$i';"], $desc, $pipes);
$read_pipes[] = $pipes[1];
}
$rset = [...$read_pipes];
$wset = null;
$eset = null;
while (!empty($read_pipes) && stream_select($rset, $wset, $eset, 2) > 0) {
foreach ($rset as $pipe) {
echo fread($pipe, 6), "\n";
unset($read_pipes[array_search($pipe, $read_pipes)]);
}
$rset = [...$read_pipes];
} This has now the same behavior as on Linux: test takes about 10 secs, and prints hello9, hello8, …, hello0 (one line every second). Prior to this patch, it would "do nothing" for 10 secs, and then print hello0, hello1, …, hello9 in one go. |
Ah I didn't realise that there is a busy loop . There should definitely be some delay as suggested in the follow so if you could create a PR for that, it would be great! |
The original patch[1] cared only about pipe handles in the rset, but would be problematic if there are other handles (e.g. files in the rset, or pipes/files in the other sets), because `php_select()` would return immediately, reporting all non read-pipe handles as ready, but possibly never reporting read-pipe handles. We fix this by applying different logic for the case where only pipe handles are supplied in the rset, but no handles in the wset or eset. In this case `php_select()` only returns when actually one of the handles is ready, or when the timeout expires. To avoid busy looping in this case, we sleep for a short amount of time. This matches POSIX behavior. In all other cases, `php_select()` behaves as before (i.e. prior to the original fix), that is it returns immediately, reporting all handles as ready. We also add a test case that demonstrates multiplexing the output of a couple of child processes. See also the discussion on <php#16917>. [1] <php@b614b4a>
See #17174. |
The original patch[1] cared only about pipe handles in the rset, but would be problematic if there are other handles (e.g. files in the rset, or pipes/files in the other sets), because `php_select()` would return immediately, reporting all non read-pipe handles as ready, but possibly never reporting read-pipe handles. We fix this by applying different logic for the case where only pipe handles are supplied in the rset, but no handles in the wset or eset. In this case `php_select()` only returns when actually one of the handles is ready, or when the timeout expires. To avoid busy looping in this case, we sleep for a short amount of time. This matches POSIX behavior. In all other cases, `php_select()` behaves as before (i.e. prior to the original fix), that is it returns immediately, reporting all handles as ready. We also add a test case that demonstrates multiplexing the output of a couple of child processes. See also the discussion on <#16917>. [1] <b614b4a> Closes GH-17174.
Pipes are blocking on Windows, but
php_select()
always returns them as ready for read/write. This renders thestream_select()
timeout useless, what can cause a following read to block for a very long time.While there is no general fix (and least not within reach for a stable version), we can at least cater to the important case of read pipes by peeking the pipe to check whether data is available. If there is none, we do not add the handle to the read set.