
Side-effect free & lazy multiprocessing context #5475


Open
wants to merge 1 commit into
base: main


@m1so m1so commented Oct 28, 2021

util._initialize_mp_context/util.mp_context (or basically importing anything from distributed) imports numpy (an additional package) as a side effect, which increases memory usage by roughly 100 MiB. This change defers the creation of the multiprocessing context from "import time" to "runtime".
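For readers unfamiliar with the pattern, here is a minimal sketch of deferring context creation to first use. It is not the code in this PR: the function name get_mp_context and the default "spawn" method are assumptions made for illustration only.

```python
import functools
import multiprocessing


@functools.lru_cache(maxsize=None)
def get_mp_context(method="spawn"):
    """Build the multiprocessing context on first call, not at import time.

    Hypothetical helper: the name and the default start method are
    illustrative and are not taken from distributed.
    """
    # Nothing here runs when the module is imported; the context (and any
    # side effects of building it) is only created when a caller asks for it.
    return multiprocessing.get_context(method)
```

Because the result is cached, repeated calls with the same method return the same context object, so callers that previously read a module-level attribute can call the function instead.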

Related to PrefectHQ/prefect#5087 (comment)

  • Closes #xxxx
  • Tests added / passed
  • Passes pre-commit run --all-files

@GPUtester
Collaborator

Can one of the admins verify this patch?

@jakirkham
Member

ok to test

@m1so m1so force-pushed the lazy-mp-context branch from e3cab4e to bf51050 Compare January 2, 2022 14:18
@m1so
Author

m1so commented Jan 2, 2022

@jakirkham @mrocklin @martindurant could you please have a look at this? These seem to be remnants of using the "forkserver" multiprocessing context (#687) and of side effects caused by importing modules (#2627).

Reworked and rebased the PR to:

  1. disable forkserver-related initialization unless it is explicitly set in config (saving some memory by not importing forkserver-related objects)
  2. check for package existence instead of importing the modules directly (again saving memory in case the application only imports distributed.Client)
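The two points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's diff: preload_candidates and make_context are hypothetical names, the method argument stands in for whatever value is read from config, and the default package list is only an example.

```python
import importlib.util
import multiprocessing


def preload_candidates(names):
    """Return the top-level packages that are installed, without importing them.

    Hypothetical helper. importlib.util.find_spec only locates a top-level
    module; it does not execute it, so nothing heavy is loaded here.
    """
    return [name for name in names if importlib.util.find_spec(name) is not None]


def make_context(method, preload=("numpy", "pandas")):
    """Create a context; configure forkserver preloading only when requested.

    `method` is assumed to come from configuration; the default `preload`
    tuple is illustrative only.
    """
    ctx = multiprocessing.get_context(method)
    if method == "forkserver":
        # Only the forkserver process imports these modules, and only if
        # the forkserver start method was explicitly chosen.
        ctx.set_forkserver_preload(preload_candidates(preload))
    return ctx
```

With any start method other than "forkserver", the preload branch is skipped entirely, so the calling process never touches the forkserver machinery or the optional packages.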

Slack conversation for reference: https://dask.slack.com/archives/C019Q0QSD4J/p1635504992026200

@martindurant
Member

If I remember this issue clearly, my point was about setting up things like event loops and server components, but I think @jcrist fixed the specific point I found. My memory is a little hazy here. I don't really know the ramifications of preloading modules for forkserver, but only doing so when we actually use the forkserver seems sensible.

@m1so m1so requested a review from fjetter as a code owner January 23, 2024 10:57
4 participants