Commit 11ee898

Support user-defined benchmark suites. (#109)
Currently pyperformance is two things coupled together: a tool to run a Python benchmark suite and a curated suite of Python benchmarks. This PR splits those apart, with the existing suite used as the default. This allows users to run their own set of benchmarks, perhaps specific to their Python implementation or their PyPI library, e.g. https://github.com/ericsnowcurrently/pyston-macrobenchmarks/tree/pyperformance.

Key changes:

* introduce a new filesystem structure for suites and individual benchmarks
* add the --manifest CLI option to specify the custom suite to use
* the default suite has been changed to the new format and moved to pyperformance/_benchmarks (only a data dir now)
* sometimes run some benchmarks in separate venvs
* do not fail all benchmarks if the dependencies for one cannot be installed

Most notably, this change should not affect benchmark results.
1 parent cb9463d commit 11ee898

176 files changed: +3350 −793 lines

BENCHMARKS_FORMAT.md

Lines changed: 335 additions & 0 deletions

# The pyperformance File Formats

`pyperformance` uses two file formats to identify benchmarks:

* manifest - a set of benchmarks
* metadata - a single benchmark

For each benchmark, there are two required files and several optional
ones. Those files are expected to be in a specific directory structure
(unless customized in the metadata).

The structure (see below) is such that it's easy to maintain
a benchmark (or set of benchmarks) on GitHub and distribute it on PyPI.
It also simplifies publishing a Python project's benchmarks.
The alternative is pointing people at a repo.

Benchmarks can inherit metadata from other metadata files.
This is useful for keeping common metadata for a set of benchmarks
(e.g. "version") in one file. Likewise, benchmarks for a Python
project can inherit metadata from the project's pyproject.toml.

Sometimes a benchmark will have one or more variants that run using
the same script. Variants like this are supported by `pyperformance`
without requiring much extra effort.


## Benchmark Directory Structure

Normally a benchmark is structured like this:

```
bm_NAME/
    data/             # if needed
    requirements.txt  # lock file, if any
    pyproject.toml
    run_benchmark.py
```

(Note the "bm\_" prefix on the directory name.)

"pyproject.toml" holds the metadata. "run_benchmark.py" holds
the actual benchmark code. Both are necessary.

`pyperformance` treats the metadata file as the fundamental source of
information about a benchmark. A manifest for a set of benchmarks is
effectively a mapping of names to metadata files. So a metadata file
is essential. It can be located anywhere on disk. However, if it
isn't located in the structure described above, then the metadata must
identify where to find the other files.

Other than that, only a benchmark script (e.g. "run_benchmark.py" above)
is required. All other files are optional.

When a benchmark has variants, each has its own metadata file next to
the normal "pyproject.toml", named "bm_NAME.toml". (Note the "bm\_"
prefix.) The format of variant metadata files is exactly the same.
`pyperformance` treats them the same, except that the sibling
"pyproject.toml" is inherited by default.
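
For example, a benchmark with two variants might be laid out like this
(the names are illustrative, not taken from this commit):

```
bm_mybench/
    pyproject.toml     # metadata for the main "mybench" benchmark
    bm_variant1.toml   # metadata for the "variant1" variant
    bm_variant2.toml   # metadata for the "variant2" variant
    run_benchmark.py   # the script shared by all three
```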


## Manifest Files

A manifest file identifies a set of benchmarks, as well as (optionally)
how they should be grouped. `pyperformance` uses the manifest to
determine which benchmarks are available to run (and thus which to run
by default).

A manifest normally looks like this:

```
[benchmarks]

name	metafile
bench1	somedir/bm_bench1/pyproject.toml
bench2	somedir/pyproject.toml
bench3	../anotherdir
```

The "benchmarks" section is a table with rows of tab-separated values.
The "name" value is how `pyperformance` will identify the benchmark.
The "metafile" value is where `pyperformance` will look for the
benchmark's metadata. If a metafile is a directory, then `pyperformance`
looks for "pyproject.toml" in that directory.


### Benchmark Groups

The other sections in the manifest file relate to grouping:

```
[benchmarks]

name	metafile
bench1	somedir/bm_bench1
bench2	somedir/bm_bench2
bench3	anotherdir/mybench.toml

[groups]
tag1
tag2

[group default]
bench2
bench3

[group tricky]
bench2
```

The "groups" section specifies available groups that may be identified
by benchmark tags (see about tags in the metadata section below). Any
other group sections in the manifest are automatically added to the list
of available groups.

If no "default" group is specified, then one is automatically added with
all benchmarks from the "benchmarks" section in it. If there is no
"groups" section and no individual group sections (other than "default"),
then the set of all tags of the known benchmarks is treated as "groups".
A group named "all" is also automatically added, which has all known
benchmarks in it.

Benchmarks can be excluded from a group by using a `-` (minus) prefix.
Any benchmark already in the list (at that point) that matches will be
dropped from the list. If the first entry in the section is an
exclusion, then all known benchmarks are first added to the list
before the exclusion is applied.

For example:

```
[benchmarks]

name	metafile
bench1	somedir/bm_bench1
bench2	somedir/bm_bench2
bench3	anotherdir/mybench.toml

[group default]
-bench1
```

This means that, by default, only "bench2" and "bench3" are run.


### Merging Manifests

To combine manifests, use the `[includes]` section in the manifest:

```
[includes]
project1/benchmarks/MANIFEST
project2/benchmarks/MANIFEST
<default>
```

Note that `<default>` is the same as including the manifest file
for the default pyperformance benchmarks.


### A Local Benchmark Suite

Often a project will have more than one benchmark that it will treat
as a suite. `pyperformance` handles this without any extra work.

Put all the benchmarks in the directory holding the manifest file.
Then put `<local>` in the "metafile" column, like this:

```
[benchmarks]

name	metafile
bench1	<local>
bench2	<local>
bench3	<local>
bench4	<local>
bench5	<local>
```

For each benchmark, `pyperformance` will look for
`DIR/bm_NAME/pyproject.toml`, where "DIR" is the directory containing
the manifest file.

If there are also variants, identify the main benchmark
in the "metafile" value, like this:

```
[benchmarks]

name	metafile
bench1	<local>
bench2	<local>
bench3	<local>
variant1	<local:bench3>
variant2	<local:bench3>
```

`pyperformance` will look for `DIR/bm_BASE/bm_NAME.toml`, where "BASE"
is the part after "local:".


### A Project's Benchmark Suite

A Python project can identify its benchmark suite by putting the path
to the manifest file in the project's top-level pyproject.toml.
Additional manifests can be identified as well:

```
[tool.pyperformance]
manifest = "..."
manifests = ["...", "..."]
```

(Reminder: that is the project's pyproject.toml, not the manifest file.)


## Benchmark Metadata Files

A benchmark's metadata file (usually pyproject.toml) follows the format
specified in [PEP 621](https://www.python.org/dev/peps/pep-0621) and
[PEP 518](https://www.python.org/dev/peps/pep-0518). So there are two
supported sections in the file: "project" and "tool.pyperformance".

A typical metadata file will look something like this:

```
[project]
version = "0.9.1"
dependencies = ["pyperf"]
dynamic = ["name"]

[tool.pyperformance]
name = "my_benchmark"
```

A highly detailed one might look like this:

```
[project]
name = "pyperformance_bm_json_dumps"
version = "0.9.1"
description = "A benchmark for json.dumps()"
requires-python = ">=3.8"
dependencies = ["pyperf"]
urls = {repository = "https://github.com/python/pyperformance"}
dynamic = ["version"]

[tool.pyperformance]
name = "json_dumps"
tags = "serialize"
runscript = "bench.py"
datadir = ".data-files/extras"
extra_opts = ["--special"]
```


### Inheritance

For one benchmark to inherit from another (or from common metadata),
the "inherits" field is available:

```
[project]
dependencies = ["pyperf"]
dynamic = ["name", "version"]

[tool.pyperformance]
name = "my_benchmark"
inherits = "../common.toml"
```

All values in either section of the inherited metadata are treated
as defaults, on top of which the current metadata is applied. In the
above example, for instance, a value for "version" in common.toml would
be used here.

If the "inherits" value is a directory (even for ".."), then
"base.toml" in that directory will be inherited.

For variants, the base pyproject.toml is the default value for "inherits".
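
As an illustration (not part of this commit), the "../common.toml"
referenced above might hold the shared defaults:

```
[project]
version = "0.9.1"
dependencies = ["pyperf"]
```

Each benchmark that inherits from it then only needs to supply its own
name and any overrides.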


### Inferred Values

In some situations, omitted values will be inferred from other available
data (even for required fields):

* `project.name` <= `tool.pyperformance.name`
* `project.*` <= inherited metadata (except for "name" and "dynamic")
* `tool.pyperformance.name` <= metadata filename
* `tool.pyperformance.*` <= inherited metadata (except for "name" and "inherits")

When the name is inferred from the filename for a regularly structured
benchmark, the "bm\_" prefix is removed from the benchmark's directory
name. If it is a variant, that prefix is removed from the metadata
filename, along with the ".toml" suffix.
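
For example (paths invented for illustration), the inference yields:

```
somedir/bm_json_dumps/pyproject.toml      ->  name: json_dumps
somedir/bm_json_dumps/bm_json_loads.toml  ->  name: json_loads  (variant)
```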


### The `[project]` Section

| field                | type  | R | T | B | D |
|----------------------|-------|---|---|---|---|
| project.name         | str   | X | X |   |   |
| project.version      | ver   | X |   | X | X |
| project.dependencies | [str] |   |   | X |   |
| project.dynamic      | [str] |   |   |   |   |

"R": required
"T": inferred from the tool section
"B": inferred from the inherited metadata
"D": for default benchmarks, inferred from pyperformance

"dynamic" is required by PEP 621 for when a field will be filled in
dynamically by the tool. This is especially important for required
fields.

All other PEP 621 fields are optional (e.g. `requires-python = ">=3.8"`,
`{repository = "https://github.com/..."}`).
316+
317+
318+
### The `[tool.pyperformance]` Section
319+
320+
| field | type | R | B | F |
321+
|-----------------|-------|---|---|---|
322+
| tool.name | str | X | | X |
323+
| tool.tags | [str] | | X | |
324+
| tool.extra_opts | [str] | | X | |
325+
| tool.inherits | file | | | |
326+
| tool.runscript | file | | X | |
327+
| tool.datadir | file | | X | |
328+
329+
"R": required
330+
"B": inferred from the inherited metadata
331+
"F": inferred from filename
332+
333+
* tags: optional list of names to group benchmarks
334+
* extra_opts: optional list of args to pass to `tool.runscript`
335+
* runscript: the benchmark script to use instead of run_benchmark.py.
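
Putting the inference and inheritance rules together, a minimal metadata
file for a regularly structured benchmark could plausibly be as small as
this (a sketch, not an example from the commit):

```
[project]
dependencies = ["pyperf"]
dynamic = ["name", "version"]

[tool.pyperformance]
inherits = ".."
```

Here the benchmark name would be inferred from the bm_NAME directory and
the version from the inherited base.toml.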

MANIFEST.in

Lines changed: 6 additions & 0 deletions

```
@@ -3,10 +3,16 @@ include COPYING
 include MANIFEST.in
 include README.rst
 include TODO.rst
+include requirements.in
 include requirements.txt
 include runtests.py
 include pyperformance
 include tox.ini

 include doc/*.rst doc/images/*.png doc/images/*.jpg
 include doc/conf.py doc/Makefile doc/make.bat
+
+include pyperformance/data-files/requirements.txt
+include pyperformance/data-files/benchmarks/MANIFEST
+include pyperformance/data-files/benchmarks/base.toml
+recursive-include pyperformance/data-files/benchmarks/bm_*/* *
```

benchmarks

Lines changed: 1 addition & 0 deletions

```
@@ -0,0 +1 @@
+pyperformance/data-files/benchmarks
```

doc/benchmark.conf.sample

Lines changed: 3 additions & 0 deletions

```
@@ -64,6 +64,9 @@ install = True
 # Run "sudo python3 -m pyperf system tune" before running benchmarks?
 system_tune = True

+# --manifest option for 'pyperformance run'
+manifest =
+
 # --benchmarks option for 'pyperformance run'
 benchmarks =
```
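
Based on the commit message's description of the new option, a custom
suite would be run with something like this (the path is illustrative):

```
pyperformance run --manifest path/to/MANIFEST -o results.json
```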

pyperformance/__init__.py

Lines changed: 12 additions & 0 deletions

```
@@ -1,2 +1,14 @@
+import os.path
+
+
 VERSION = (1, 0, 3)
 __version__ = '.'.join(map(str, VERSION))
+
+
+PKG_ROOT = os.path.dirname(__file__)
+DATA_DIR = os.path.join(PKG_ROOT, 'data-files')
+
+
+def is_installed():
+    parent = os.path.dirname(PKG_ROOT)
+    return os.path.exists(os.path.join(parent, 'setup.py'))
```
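
A small sketch of how the new attributes can be used (my illustration,
not code from the commit), locating the default suite's manifest that
MANIFEST.in now ships under the package's data directory:

```
import os.path

import pyperformance

# MANIFEST.in (above) ships pyperformance/data-files/benchmarks/MANIFEST,
# so the default suite's manifest can be located via DATA_DIR.
default_manifest = os.path.join(
    pyperformance.DATA_DIR, 'benchmarks', 'MANIFEST')
print(default_manifest)
```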
