Improve partition #677
Comments
Am I missing the comparison of the results? Thanks for the analysis.
What do you mean?
I was missing it; I understand now. It does look noticeably faster, but I'm not sure the difference is enough to justify swapping one implementation for a generally similar one. Are there people using …?
I don't know, but why wouldn't they? Or do you mean something like … instead of this: …?
I believe my version also saves memory, since my additional …
ISTM the premise of the current implementation is that the user pred function is slow relative to the overhead added by the two generators. For fast predicates like …

I like Stefan's proposed improvement because it avoids the creation and destruction of one tuple per element, and because the C-speed itertools will give consistent performance relative to using pure Python generators. FWIW, there is another way to avoid a repeated call to …
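The snippet that followed was lost in this page capture. One way to evaluate the predicate only once per element and route elements by the cached results is to use `itertools.compress`; this is my own reconstruction of that idea, not necessarily the exact code from the comment:

```python
from itertools import compress, tee
from operator import not_

def partition(pred, iterable):
    # Evaluate pred exactly once per element: tee the input into two
    # element streams plus one stream that feeds map(pred, ...), then
    # tee the boolean results and use compress() to select elements,
    # instead of calling pred a second time.
    t1, t2, s = tee(iterable, 3)
    p1, p2 = tee(map(pred, s))
    return compress(t1, map(not_, p1)), compress(t2, p2)

def is_odd(x):
    return x % 2 == 1

evens, odds = partition(is_odd, range(10))
```

Everything here runs at C speed except the predicate itself, and `pred` is never invoked twice for the same element.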
Note 1: If the returned iterators get consumed in parallel with one another (like we recommend for any use of …), …

Note 2: The performance of the filterfalse/filter tools is better when …
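As an illustration of Note 1, here is a small sketch using the filterfalse/filter recipe from the itertools documentation (the variable names are my own):

```python
from itertools import filterfalse, tee

def partition(pred, iterable):
    # The itertools docs recipe: tee the input, filter each copy.
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

falses, trues = partition(lambda x: x % 2, range(10))

# Consuming the two iterators in lockstep keeps tee's internal buffer
# small; draining one of them completely first would force tee to
# buffer the entire remaining input.
pairs = list(zip(falses, trues))  # [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
```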
@rhettinger Good idea to compare with Python's official recipe as well. I've included it in the first post's benchmark. Already for the docstring's … And here's a benchmark with …
I'd say the bigger issue with … Are filterfalse/filter really faster with …?

```c
lz->func == Py_None || lz->func == (PyObject *)&PyBool_Type
```

Ostensibly that checks whether the predicate is `None` or the `bool` type.
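That C expression is CPython's fast-path check for the two trivial predicates, which is why passing `None` and passing `bool` behave identically. A quick Python-level check:

```python
from itertools import filterfalse

data = [0, 1, '', 'x', [], [1], None, True, False]

# filter(None, ...) and filter(bool, ...) keep exactly the truthy elements.
assert list(filter(None, data)) == list(filter(bool, data)) == [1, 'x', [1], True]

# filterfalse mirrors this, keeping exactly the falsy elements.
assert list(filterfalse(None, data)) == list(filterfalse(bool, data)) \
       == [0, '', [], None, False]
```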
That's time per element (using elements …). I don't know why …

Benchmark code:

```python
from timeit import timeit
from statistics import mean, stdev
from itertools import repeat, filterfalse
import sys

def filter_None():
    return filter(None, repeat(False, n))

def filter_bool():
    return filter(bool, repeat(False, n))

def filterfalse_None():
    return filterfalse(None, repeat(True, n))

def filterfalse_bool():
    return filterfalse(bool, repeat(True, n))

funcs = filter_None, filter_bool, filterfalse_None, filterfalse_bool
n = 10**6
times = {f: [] for f in funcs}

def stats(f):
    ts = [t * 1e9 for t in sorted(times[f])[:10]]
    return f'{mean(ts):6.2f} ± {stdev(ts):4.2f} ns '

for _ in range(1000):
    for f in funcs:
        t = timeit(lambda: next(f(), None), number=1) / n
        times[f].append(t)

for f in sorted(funcs, key=stats):
    print(stats(f), f.__name__)

print(sys.version)
```
I marked this as …
I propose a new implementation. Benchmark results with the documentation's example predicate `is_odd` on the example's `range(10)` and on `range(10000)`:

Benchmark script
My new proposal:
The current more-itertools implementation:
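The code block was not captured in this page scrape. From the surrounding discussion (one tuple created per element, pure-Python generator expressions), the implementation under discussion works roughly like this sketch, which is a reconstruction and not necessarily the library's verbatim source:

```python
from itertools import tee

def partition(pred, iterable):
    # Evaluate pred once per element, pairing each result with its
    # element; tee the (result, element) stream and filter the two
    # copies with pure-Python generator expressions.
    if pred is None:
        pred = bool
    evaluations = ((pred(x), x) for x in iterable)
    t1, t2 = tee(evaluations)
    return (
        (x for (cond, x) in t1 if not cond),
        (x for (cond, x) in t2 if cond),
    )
```

This calls `pred` only once per element, but pays for a tuple allocation per element and for two Python-level generator frames.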
The author @stevecj of the current implementation said that the use of generator expressions isn't intentional:
Python's official recipe:
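For reference, that recipe as given in the itertools documentation's recipes section (modulo comment wording):

```python
from itertools import filterfalse, tee

def partition(pred, iterable):
    # Use a predicate to partition entries into false and true entries:
    # partition(is_odd, range(10)) --> 0 2 4 6 8  and  1 3 5 7 9
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)
```

Note that, unlike the more-itertools version, this calls `pred` twice per element (once in `filterfalse`, once in `filter`).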