Skip to content

Measure startup, each bench run and shutdown separately #13162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
Open
9 changes: 5 additions & 4 deletions .github/workflows/push.yml
Original file line number Diff line number Diff line change
Expand Up @@ -231,12 +231,12 @@ jobs:
sudo apt-get update
sudo apt-get install \
bison \
libgmp-dev \
libonig-dev \
libsqlite3-dev \
openssl \
re2c \
valgrind
valgrind \
debuginfod
- name: ccache
uses: hendrikmuhs/[email protected]
with:
Expand All @@ -255,7 +255,6 @@ jobs:
--enable-werror \
--prefix=/usr \
--with-config-file-scan-dir=/etc/php.d \
--with-gmp \
--with-mysqli=mysqlnd \
--with-openssl \
--with-pdo-sqlite \
Expand Down Expand Up @@ -287,7 +286,9 @@ jobs:
ssh-key: ${{ secrets.BENCHMARKING_DATA_DEPLOY_KEY }}
path: benchmark/repos/data
- name: Benchmark
run: php benchmark/benchmark.php true
run: |
export DEBUGINFOD_URLS="https://debuginfod.ubuntu.com"
php benchmark/benchmark.php true
- name: Store result
if: github.event_name == 'push'
run: |
Expand Down
81 changes: 68 additions & 13 deletions benchmark/benchmark.php
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,8 @@ function runSymfonyDemo(bool $jit): array {
cloneRepo($dir, 'https://github.com/php/benchmarking-symfony-demo-2.2.3.git');
runPhpCommand([$dir . '/bin/console', 'cache:clear']);
runPhpCommand([$dir . '/bin/console', 'cache:warmup']);
return runValgrindPhpCgiCommand('symfony-demo', [$dir . '/public/index.php'], cwd: $dir, jit: $jit, warmup: 50, repeat: 50);

return runValgrindPhpCgiCommand('symfony-demo', [$dir . '/public/index.php'], cwd: $dir, jit: $jit, repeat: 100);
}

function runWordpress(bool $jit): array {
Expand All @@ -86,7 +87,8 @@ function runWordpress(bool $jit): array {

// Warmup
runPhpCommand([$dir . '/index.php'], $dir);
return runValgrindPhpCgiCommand('wordpress', [$dir . '/index.php'], cwd: $dir, jit: $jit, warmup: 50, repeat: 50);

return runValgrindPhpCgiCommand('wordpress', [$dir . '/index.php'], cwd: $dir, jit: $jit, repeat: 100);
}

function runPhpCommand(array $args, ?string $cwd = null): ProcessResult {
Expand All @@ -98,41 +100,94 @@ function runValgrindPhpCgiCommand(
array $args,
?string $cwd = null,
bool $jit = false,
int $warmup = 0,
int $repeat = 1,
): array {
global $phpCgi;

$profileOut = __DIR__ . "/profiles/callgrind.out.$name";
$profileOut = __DIR__ . "/profiles/callgrind.$name";
if ($jit) {
$profileOut .= '.jit';
$profileOut .= '-jit';
}

$process = runCommand([
'valgrind',
'--tool=callgrind',
'--dump-instr=yes',
'--collect-jumps=yes',
'--trace-children=yes',
"--callgrind-out-file=$profileOut",
'--verbose',
'--',
$phpCgi,
'-T' . ($warmup ? $warmup . ',' : '') . $repeat,
'-T' . $repeat,
'-d max_execution_time=0',
'-d opcache.enable=1',
'-d opcache.jit=' . ($jit ? 'tracing' : 'disable'),
'-d opcache.jit_buffer_size=128M',
'-d opcache.validate_timestamps=0',
...$args,
]);
$instructions = extractInstructionsFromValgrindOutput($process->stderr);
if ($repeat > 1) {
$instructions = gmp_strval(gmp_div_q($instructions, $repeat));
], null, array_merge(getenv(), ['BENCHMARK_DUMP_SEPARATE_PROFILES' => '1']));

// collect metrics for startup, each benchmark run and shutdown
$metricsArr = [];
foreach (['startup' => 1, ...range(2, $repeat + 1), 'shutdown' => ''] as $k => $kCallgrindOut) {
$profileOutSpecific = $profileOut . '.' . $k;
rename($profileOut . ($kCallgrindOut === '' ? '' : '.' . $kCallgrindOut), $profileOutSpecific);

$metricsArr[$k] = extractMetricsFromCallgrindFile($profileOutSpecific);
}

// print all collected metrics
print_r(array_map(fn ($metrics) => array_map(fn ($v) => number_format($v, 0, '.', '_'), $metrics), $metricsArr));

// find median benchmark run
$metricsForMedianArr = $metricsArr;
unset($metricsForMedianArr['startup']);
unset($metricsForMedianArr['shutdown']);
uasort($metricsForMedianArr, fn ($a, $b) => $a['Ir'] <=> $b['Ir']);
$medianRunIndex = array_keys($metricsForMedianArr)[max(0, floor((count($metricsForMedianArr) - 3 /* -1 for count to index, -1 for first slow run due compliation, -1 for second run which is a little slower too */) / 2.0))];

// remove non-first-non-median profiles from artifacts
foreach (range(0, $repeat - 1) as $k) {
$profileOutSpecific = $profileOut . '.' . $k;

if ($k !== 0 && $k !== $medianRunIndex) {
unlink($profileOutSpecific);
}
}

// annotate profiles for artifacts
foreach (['startup', 0, $medianRunIndex, 'shutdown'] as $k) {
$profileOutSpecific = $profileOut . '.' . $k;

runCommand([
'callgrind_annotate',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea how to convince callgrind_annotate to resolve all symbols, especially all the onces defined in Sqlite lib? libsqlite3-dev is installed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would need to compile with CFLAGS="-g -O2" to get debug symbols. But with multiple repetitions we don't generally care about startup/shutdown. It's not very meaningful for applications ran with opcache. If you want to provide startup/shutdown you can do so without repetitions.

Copy link
Contributor Author

@mvorisek mvorisek Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

in the example you can see some examples of unresolved functions (from callgrind.symfony-demo.0 profile). Will CFLAGS="-g -O2" really help if other functions are resolved well?

All unresolved functions are listed in the end of each annotated profile file (like callgrind.symfony-demo.0.txt).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for SQLite, since we're not compiling its source files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug symbols for Debian packages are commonly available via the corresponding -dbgsym package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Ubuntu docker, after apt update, I get only these:

# apt-cache search libsqlite
libsqlite3-0 - SQLite 3 shared library
libsqlite3-dev - SQLite 3 development files
libsqlite3-mod-ceph - SQLite3 VFS for Ceph
libsqlite3-mod-ceph-dev - SQLite3 VFS for Ceph (development files)
libsqlite-tcl - SQLite 2 Tcl bindings
libsqlite0 - SQLite 2 shared library
libsqlite0-dev - SQLite 2 development files
libsqlite3-gst - SQLite bindings for GNU Smalltalk
libsqlite3-mod-blobtoxy - SQLite3 extension module for read-only BLOB to X/Y mapping
libsqlite3-mod-csvtable - SQLite3 extension module for read-only access to CSV files
libsqlite3-mod-impexp - SQLite3 extension module for SQL script, XML, JSON and CSV import/export
libsqlite3-mod-rasterlite2 - SQLite 3 module for huge raster coverages
libsqlite3-mod-spatialite - Geospatial extension for SQLite - loadable module
libsqlite3-mod-virtualpg - Loadable dynamic extension to both SQLite and SpatiaLite
libsqlite3-mod-xpath - SQLite3 extension module for querying XML data with XPath
libsqlite3-mod-zipfile - SQLite3 extension module for read-only access to ZIP files
libsqlite3-ocaml - Embeddable SQL Database for OCaml Programs (runtime)
libsqlite3-ocaml-dev - Embeddable SQL Database for OCaml Programs (development)
libsqlite3-tcl - SQLite 3 Tcl bindings
libsqliteodbc - ODBC driver for SQLite embedded database

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!! That link brought me to https://valgrind.org/docs/manual/manual-core.html#manual-core.debuginfod . Commited in d4fe3ef.

Now I still see some missing source for the core linux libs:


--------------------------------------------------------------------------------
The following files chosen for auto-annotation could not be found:
--------------------------------------------------------------------------------
  ./csu/../csu/libc-start.c
  ./csu/../sysdeps/nptl/libc_start_call_main.h
  ./csu/./csu/errno-loc.c
  ./ctype/../include/ctype.h
  ./debug/./debug/explicit_bzero_chk.c
  ./elf/../sysdeps/generic/dl-protected.h
  ./elf/../sysdeps/generic/ldsodefs.h
  ./elf/../sysdeps/x86_64/dl-machine.h
  ./elf/../sysdeps/x86_64/dl-runtime.h
  ./elf/../sysdeps/x86_64/dl-trampoline.h
  ./elf/./elf/dl-lookup.c
  ./elf/./elf/dl-runtime.c
  ./io/../sysdeps/posix/getcwd.c
  ./io/../sysdeps/unix/syscall-template.S
...
  ./io/../sysdeps/unix/sysv/linux/write.c
  ./libio/./libio/fileops.c
  ./libio/./libio/iofflush.c
  ./libio/./libio/libioP.h
  ./malloc/./malloc/arena.c
  ./malloc/./malloc/malloc.c
  ./malloc/./malloc/morecore.c
  ./misc/../sysdeps/unix/syscall-template.S
  ./misc/../sysdeps/unix/sysv/linux/brk.c
  ./misc/../sysdeps/unix/sysv/linux/x86_64/syscall.S
  ./misc/./misc/sbrk.c
  ./misc/./misc/syslog.c
...
  ./stdlib/../stdlib/strtol.c
  ./stdlib/../stdlib/strtol_l.c
  ./stdlib/./stdlib/getenv.c
  ./string/../sysdeps/x86_64/multiarch/memchr-avx2.S
...
  /build/sqlite3-sDRO1D/sqlite3-3.37.2/./src/main.c
  /build/sqlite3-sDRO1D/sqlite3-3.37.2/./src/malloc.c
  /build/sqlite3-sDRO1D/sqlite3-3.37.2/./src/mem1.c
  /build/sqlite3-sDRO1D/sqlite3-3.37.2/./src/memjournal.c
  /build/sqlite3-sDRO1D/sqlite3-3.37.2/./src/mutex.c
  /build/sqlite3-sDRO1D/sqlite3-3.37.2/./src/mutex_unix.c
  /build/sqlite3-sDRO1D/sqlite3-3.37.2/./src/notify.c
  /build/sqlite3-sDRO1D/sqlite3-3.37.2/./src/os.c
...

any idea?

'--threshold=100',
'--auto=yes',
'--show-percs=no',
$profileOutSpecific,
new UnescapedArg('>'),
"$profileOutSpecific.txt",
]);
}

$instructions = $metricsArr[$medianRunIndex]['Ir'];
if ($repeat === 1) { // return the same data as before https://github.com/php/php-src/pull/13162
$instructions += $metricsArr['startup']['Ir'];
}

return ['instructions' => $instructions];
}

function extractInstructionsFromValgrindOutput(string $output): string {
preg_match("(==[0-9]+== Events : Ir\n==[0-9]+== Collected : (?<instructions>[0-9]+))", $output, $matches);
return $matches['instructions'] ?? throw new \Exception('Unexpected valgrind output');
/**
* @return array<non-empty-string, numeric-string>
*/
function extractMetricsFromCallgrindFile(string $path): array {
if (!preg_match('/\nevents:((?: +\w+)+)\nsummary:((?: +\d+)+)\n/', file_get_contents($path, length: 10_000), $matches)) {
throw new \Exception('Unexpected callgrind data');
}

return array_combine(explode(' ', ltrim($matches[1], ' ')), explode(' ', ltrim($matches[2], ' ')));
}

main();
33 changes: 24 additions & 9 deletions benchmark/shared.php
Original file line number Diff line number Diff line change
@@ -1,17 +1,28 @@
<?php

class ProcessResult {
public $stdout;
public $stderr;
readonly class ProcessResult {
public function __construct(
public string $stdout,
public string $stderr,
) {}
}

function runCommand(array $args, ?string $cwd = null): ProcessResult {
$cmd = implode(' ', array_map('escapeshellarg', $args));
readonly class UnescapedArg {
public function __construct(
public string $arg,
) {}
}

function runCommand(array $args, ?string $cwd = null, ?array $envVars = null): ProcessResult {
$cmd = implode(' ', array_map(function (string|UnescapedArg $v): string {
return $v instanceof UnescapedArg
? $v->arg
: escapeshellarg($v);
}, $args));
$pipes = null;
$result = new ProcessResult();
$descriptorSpec = [0 => ['pipe', 'r'], 1 => ['pipe', 'w'], 2 => ['pipe', 'w']];
fwrite(STDOUT, "> $cmd\n");
$processHandle = proc_open($cmd, $descriptorSpec, $pipes, $cwd ?? getcwd(), null);
$processHandle = proc_open($cmd, $descriptorSpec, $pipes, $cwd ?? getcwd(), $envVars);

$stdin = $pipes[0];
$stdout = $pipes[1];
Expand All @@ -22,6 +33,8 @@ function runCommand(array $args, ?string $cwd = null): ProcessResult {
stream_set_blocking($stdout, false);
stream_set_blocking($stderr, false);

$stdoutStr = '';
$stderrStr = '';
$stdoutEof = false;
$stderrEof = false;

Expand All @@ -35,9 +48,9 @@ function runCommand(array $args, ?string $cwd = null): ProcessResult {
foreach ($read as $stream) {
$chunk = fgets($stream);
if ($stream === $stdout) {
$result->stdout .= $chunk;
$stdoutStr .= $chunk;
} elseif ($stream === $stderr) {
$result->stderr .= $chunk;
$stderrStr .= $chunk;
}
}

Expand All @@ -48,6 +61,8 @@ function runCommand(array $args, ?string $cwd = null): ProcessResult {
fclose($stdout);
fclose($stderr);

$result = new ProcessResult($stdoutStr, $stderrStr);

$statusCode = proc_close($processHandle);
if ($statusCode !== 0) {
fwrite(STDOUT, $result->stdout);
Expand Down
41 changes: 23 additions & 18 deletions sapi/cgi/cgi_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -1696,6 +1696,16 @@ PHP_FUNCTION(apache_response_headers) /* {{{ */
}
/* }}} */

#ifdef HAVE_VALGRIND
static inline void callgrind_dump_stats(void)
{
char *tmp = getenv("BENCHMARK_DUMP_SEPARATE_PROFILES");
if (tmp && ZEND_ATOL(tmp)) {
CALLGRIND_DUMP_STATS;
}
}
#endif

static zend_module_entry cgi_module_entry = {
STANDARD_MODULE_HEADER,
"cgi-fcgi",
Expand Down Expand Up @@ -2216,13 +2226,6 @@ consult the installation file that came with this distribution, or visit \n\
if (comma) {
warmup_repeats = atoi(php_optarg);
repeats = atoi(comma + 1);
#ifdef HAVE_VALGRIND
if (warmup_repeats > 0) {
CALLGRIND_STOP_INSTRUMENTATION;
/* We're not interested in measuring startup */
CALLGRIND_ZERO_STATS;
}
#endif
} else {
repeats = atoi(php_optarg);
}
Expand Down Expand Up @@ -2264,6 +2267,13 @@ consult the installation file that came with this distribution, or visit \n\
}
#endif
while (!fastcgi || fcgi_accept_request(request) >= 0) {
#ifdef HAVE_VALGRIND
if (benchmark) {
/* measure startup and each benchmark run separately */
callgrind_dump_stats();
}
#endif

SG(server_context) = fastcgi ? (void *)request : (void *) 1;
init_request_info(request);

Expand Down Expand Up @@ -2430,12 +2440,6 @@ consult the installation file that came with this distribution, or visit \n\
}
} /* end !cgi && !fastcgi */

#ifdef HAVE_VALGRIND
if (warmup_repeats == 0) {
CALLGRIND_START_INSTRUMENTATION;
}
#endif

/* request startup only after we've done all we can to
* get path_translated */
if (php_request_startup() == FAILURE) {
Expand Down Expand Up @@ -2554,11 +2558,6 @@ consult the installation file that came with this distribution, or visit \n\
SG(request_info).query_string = NULL;
}

#ifdef HAVE_VALGRIND
/* We're not interested in measuring shutdown */
CALLGRIND_STOP_INSTRUMENTATION;
#endif

if (!fastcgi) {
if (benchmark) {
if (warmup_repeats) {
Expand Down Expand Up @@ -2586,6 +2585,12 @@ consult the installation file that came with this distribution, or visit \n\
script_file = NULL;
goto do_repeat;
}

#ifdef HAVE_VALGRIND
/* measure shutdown separately */
callgrind_dump_stats();
#endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned, I don't think dumping stats to stdout/stderr repeatedly is a good solution, because it makes Valgrind useless for all other use-cases where this script is not used. A better solution would be if we could calculate the mean or whatever metric we find most useful, from C, and output that when the script terminates.

Valgrind allows specifying the output file using --log-file="filename". This would allow accessing the output from C. Unfortunately, there seems to be no way to set this programmatically, so it would have to be set manually when invoking PHP with Valgrind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, --callgrind-out-file= cannot be accessed/detected as it is not exported.

What do you want to achive, does CALLGRIND_DUMP_STATS dump stats even if callgrind is not enabled? Also notice, all CALLGRIND_DUMP_STATS are called only if CGI benchmark is (-T) is enabled.

A better solution would be if we could calculate the mean or whatever metric we find most useful, from C, and output that when the script terminates.

This is not possible with Valgrind and also probably not wanted, as having multiple Callgrind files is helpful to investigate time differences.

Copy link
Member

@iluuu1994 iluuu1994 Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that Dmitry and others use valgrind php-cgi -Tn ... for things other than Symfony/Wordpress/etc. However, in this case the script will now dump the stats in every iteration, requiring manually extracting the values and calculating some mean. So the output becomes essentially useless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 9d77734 I have put the CALLGRIND_DUMP_STATS call in an env. var condition. If the env. var is not present, nothing is dumped during the run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iluuu1994 can I ask you to review this PR

break;
}

Expand Down