Perf: Improve performance of multithreaded --check and undefined-field diagnostic #2738


Merged (3 commits, Jul 1, 2024)

Conversation

tomlau10
Contributor

@tomlau10 tomlau10 commented Jul 1, 2024

I have recently been integrating LuaLS into our project's CI and looking for ways to speed up the process. Below are two simple optimizations I found:

Work distribution in multi-threaded --check

I found the recently added multi-threaded --check from #2638. After some code review, I suspect the hash function used to distribute work across threads is suboptimal, which limits its full potential. I simply changed it to round robin (after sorting the file list, of course), and this yields another ~10% performance gain in my project workspace (500+ Lua files). Benchmarks are as follows:

  • CPU: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz

| threads | ori-1 | ori-2 | ori-3 | ori-avg  | new-1 | new-2 | new-3 | new-avg  | diff |
|---------|-------|-------|-------|----------|-------|-------|-------|----------|------|
| 1       | 38.67 | 38.88 | 38.76 | 38.77    | 38.77 | 39.35 | 38.75 | 38.95667 | 0%   |
| 2       | 25.57 | 25.64 | 25.79 | 25.66667 | 22.93 | 22.89 | 22.86 | 22.89333 | -11% |
| 4       | 17.55 | 17.93 | 17.41 | 17.63    | 15.81 | 15.22 | 15.33 | 15.45333 | -12% |
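The distribution idea can be sketched in a few lines (illustrative Python, not the actual LuaLS Lua code; `hash_partition` and `round_robin_partition` are hypothetical names). Hashing file paths can cluster many files into one bucket, leaving some threads with more work than others, while round robin over a sorted list keeps every worker's share within one file of the rest:

```python
# Illustrative sketch (not the LuaLS source): distributing files
# across k worker threads.

def hash_partition(files, k):
    """Hash-based distribution: bucket sizes can be uneven,
    depending on how file paths happen to hash."""
    buckets = [[] for _ in range(k)]
    for f in files:
        buckets[hash(f) % k].append(f)
    return buckets

def round_robin_partition(files, k):
    """Sort first for determinism, then deal files out one by one."""
    buckets = [[] for _ in range(k)]
    for i, f in enumerate(sorted(files)):
        buckets[i % k].append(f)
    return buckets

files = [f"mod_{i:03}.lua" for i in range(523)]
sizes = [len(b) for b in round_robin_partition(files, 4)]
# Round robin guarantees the largest and smallest buckets differ by
# at most one file, so no thread sits idle while another still works.
assert max(sizes) - min(sizes) <= 1
```

Since --check's wall time is bounded by the busiest thread, evening out the buckets directly shortens the run.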

I would like to mention @emmericp here as this function is originally implemented by him, to see if this small optimization can help shaving off a few more seconds in his project 😄 (great work btw 👍 )

Early break in undefined-field diagnostic

Profiling showed that the undefined-field check is quite time consuming. Looking into the code, I found that its early-break logic can be optimized. The expression #vm.getDefs(src) > 0 is used only to check whether there is any definition for a given src object, so there is no need to fetch the full definitions list: we can return as soon as the first definition is found. This yields a 25% performance gain for this single check within my workspace 🚀

  • setting "diagnostics.severity": { "undefined-field": "Error!" } in .luarc.json, then running with --checklevel=Error
    NOTE: the following was run single-threaded to avoid confusion with the threading optimization above.

|      | 1     | 2     | 3     | avg      |
|------|-------|-------|-------|----------|
| ori  | 15.4  | 15.39 | 15.44 | 15.41    |
| new  | 11.51 | 11.26 | 11.26 | 11.34333 |
| diff |       |       |       | -26%     |
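The early-break idea can be modeled in a few lines (illustrative Python, not the LuaLS code; `iter_defs` is a hypothetical stand-in for definition resolution). Materializing the whole list pays for every resolution step up front, while stopping at the first hit pays for only one:

```python
# Illustrative model (not the LuaLS source) of the early-break idea.
# Each yielded definition stands for one expensive resolution step.
def iter_defs(n_defs, counter):
    for d in range(n_defs):
        counter["steps"] += 1
        yield d

# Original approach: materialize the full list, then test its length,
# the moral equivalent of `#vm.getDefs(src) > 0`.
counter = {"steps": 0}
assert len(list(iter_defs(5, counter))) > 0
assert counter["steps"] == 5  # every definition was resolved

# Optimized approach: return as soon as the first definition appears.
counter = {"steps": 0}
assert next(iter_defs(5, counter), None) is not None
assert counter["steps"] == 1  # only one resolution step paid
```

The saving scales with how many definitions a symbol has, which is why a check that runs this test on every field access benefits so much.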

tomlau10 added 3 commits July 1, 2024 09:35
The current hash function used to distribute work does not seem ideal.
We can instead use round robin to distribute after sorting the file list.
The current early break only needs to know whether any definition exists.
There is no need to fetch the full definitions list;
we can break early as soon as the first one is found.
@C3pa
Contributor

C3pa commented Jul 1, 2024

Very nice, tomlau10! Can we get some insight into how you profiled LuaLS? I am primarily interested in how you found which functions to optimize.

@sumneko
Collaborator

sumneko commented Jul 1, 2024

Great work!

@sumneko sumneko merged commit ebbc372 into LuaLS:master Jul 1, 2024
9 of 11 checks passed
@tomlau10
Contributor Author

tomlau10 commented Jul 1, 2024

Can we get some insight into how you profiled the LuaLS?

@C3pa Nothing advanced; I just profiled each diagnostic one by one 😂 by setting up the .luarc.json in my workspace like this:

"diagnostics.severity": {
    "undefined-field": "Error!"
}

And then I run LuaLS with --check --checklevel=Error, which only checks diagnostics with severity Error or above (in this example setup, only undefined-field).

During my profiling, I also found that the invisible diagnostic is quite time consuming (on par with the undefined-field check), but currently I have no idea what could be optimized there. 😕


edit:

I am primarily interested in how you found which functions to optimize.

Oh~ I guess you are asking how I found that #vm.getDefs(src) > 0 could be optimized? Nothing special, just programmer instinct 😄

@C3pa
Contributor

C3pa commented Jul 1, 2024

Interesting. Thanks for your insight! I thought you might have used some kind of profiler setup.
