
_PyPegen_is_memoized() has a complexity of O(n) #93289


Closed
vstinner opened this issue May 27, 2022 · 6 comments
Labels: performance (Performance or resource usage), type-bug (An unexpected behavior, bug, or error)

Comments

@vstinner
Member

Using Linux perf, I noticed that the Python parser spends a significant amount of time in the _PyPegen_is_memoized() function, which iterates over a linked list to find a cached value:

int  // bool
_PyPegen_is_memoized(Parser *p, int type, void *pres)
{
    if (p->mark == p->fill) {
        if (_PyPegen_fill_token(p) < 0) {
            p->error_indicator = 1;
            return -1;
        }
    }

    Token *t = p->tokens[p->mark];

    for (Memo *m = t->memo; m != NULL; m = m->next) {
        if (m->type == type) {
#if defined(PY_DEBUG)
            if (0 <= type && type < NSTATISTICS) {
                long count = m->mark - p->mark;
                // A memoized negative result counts for one.
                if (count <= 0) {
                    count = 1;
                }
                memo_statistics[type] += count;
            }
#endif
            p->mark = m->mark;
            *(void **)(pres) = m->node;
            return 1;
        }
    }
    return 0;
}
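
For context, the Memo nodes walked by this loop are small singly linked list entries attached to each token; their layout in Parser/pegen.h is roughly:

typedef struct _memo {
    int type;               // rule or token type this entry caches
    void *node;             // cached parse result (NULL for a memoized failure)
    int mark;               // token index to resume from on a hit
    struct _memo *next;     // next entry in this token's list
} Memo;

Every lookup is therefore a linear scan over the entries recorded for the current token.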

script.py:

import tokenize

# wc -l $(find -name "*.py")|sort -n
large_files = """
    5259 ./Tools/clinic/clinic.py
    5553 ./Lib/test/test_email/test_email.py
    5577 ./Lib/test/test_argparse.py
    5659 ./Lib/test/test_logging.py
    5806 ./Lib/test/test_descr.py
    5878 ./Lib/test/test_decimal.py
    5993 ./Lib/test/_test_multiprocessing.py
    6425 ./Lib/_pydecimal.py
    6626 ./Lib/test/datetimetester.py
    6664 ./Lib/test/test_socket.py
    7325 ./Lib/test/test_typing.py
   15534 ./Lib/pydoc_data/topics.py
"""
large_files = [line.split()[-1] for line in large_files.splitlines() if line.strip()]

files = []
for filename in large_files:
    with tokenize.open(filename) as fp:
        content = fp.read()
    files.append((filename, content))

for loops in range(5):
    for filename, content in files:
        print(filename)
        compile(content, filename, "exec")

Linux perf says that overall, Python spent 7% of its runtime in this function:

$ perf record ./python script.py
$ perf report
Samples: 10K of event 'cycles:u', Event count (approx.): 7308792716
Overhead  Command  Shared Object         Symbol
   7,35%  python   python                [.] _PyPegen_is_memoized
   5,18%  python   python                [.] assemble
   3,80%  python   python                [.] _PyPegen_expect_token
   3,53%  python   python                [.] unicodekeys_lookup_unicode
   3,06%  python   python                [.] tok_get
   2,59%  python   python                [.] _PyPegen_update_memo
   2,27%  python   python                [.] _Py_dict_lookup
   2,19%  python   python                [.] _PyObject_Free
(...)
vstinner added the type-bug and performance labels on May 27, 2022
@vstinner
Member Author

@pablogsal @lysnikolaou @isidentical: Would it be possible to maintain a hashtable to make this lookup function faster?

If you consider that it's not worth it, you can just close my issue ;-)
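
For illustration only, one possible shape such a per-token table could take. This is an untested sketch with invented names (MemoSlot, MEMO_TABLE_SIZE, memo_table_find); nothing here exists in CPython:

#define MEMO_TABLE_SIZE 16          // small power of two; a real version would grow on overflow

typedef struct {
    int type;                       // rule/token type, -1 marks an empty slot
    void *node;                     // cached parse result
    int mark;                       // token index to resume from
} MemoSlot;

// Linear-probing lookup keyed by type; assumes the table is never completely full.
static inline MemoSlot *
memo_table_find(MemoSlot *slots, int type)
{
    unsigned int i = (unsigned int)type & (MEMO_TABLE_SIZE - 1);
    while (slots[i].type != -1) {
        if (slots[i].type == type) {
            return &slots[i];
        }
        i = (i + 1) & (MEMO_TABLE_SIZE - 1);
    }
    return NULL;
}

The trade-off is whether the extra memory per token and the hashing/probing overhead can beat walking a list that may be quite short in practice.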

@vstinner
Member Author

The benchmark comes from: #93106

@pablogsal
Member

@pablogsal @lysnikolaou @isidentical: Would it be possible to maintain a hashtable to make this lookup function faster?

If you consider that it's not worth it, you can just close my issue ;-)

We already tried a hash table, a B-tree, a static array, a dynamic array, and a skip list, and all of them were slower. If you manage to get a hash table to perform faster, I'm happy to merge it :)

@pablogsal
Member

I tried yet again to use the hash table from pycore_hashtable.h and it is a performance disaster. I also took some statistics: the linked lists' average number of elements is between 1 and 10, so a hash table is overkill, whether per token or a single one in the parser. I also checked that the percentage of calls in which the function returns true (i.e. a value is found) is around 75%, which means that adding a Bloom filter or similar in front would also slow things down quite a lot.
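
For reference, statistics like these can be collected with a few counters around the lookup. An untested sketch; the counter and function names are invented, and Memo is the parser's linked-list node from Parser/pegen.h:

static unsigned long memo_lookups = 0;   // total lookups
static unsigned long memo_hits = 0;      // lookups that found the requested type
static unsigned long memo_visited = 0;   // Memo nodes walked in total

static Memo *
lookup_counted(Memo *head, int type)
{
    memo_lookups++;
    for (Memo *m = head; m != NULL; m = m->next) {
        memo_visited++;
        if (m->type == type) {
            memo_hits++;
            return m;
        }
    }
    return NULL;
}

// memo_hits / (double)memo_lookups    -> hit rate (around 75% here)
// memo_visited / (double)memo_lookups -> average nodes visited per lookup (a proxy for chain length)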

I'm closing the issue, but if you manage to get a hash table or another structure running that is faster, please feel free to reopen.

@pablogsal
Member

@isidentical @lysnikolaou If you want to give it a go, please feel free to try :)

@vstinner
Member Author

I tried yet again to use the hash table from pycore_hashtable.h and it is a performance disaster.

That's what I would have tried. I trust you that it's worse ;-) OK, good that you already knew about it!
