search: change to use "spawn" and limit the number of tasks per child #3862

Merged: 1 commit, Mar 22, 2024

@flammit (Contributor) commented Mar 21, 2024

Also clean up some examples to use __main__ and not initialize resources outside of main.
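The cleanup matters because, under the spawn start method, each child process re-imports the main module, so anything initialized at module level runs again in every child. A minimal sketch of the pattern follows; the names `load_resources` and `main` are hypothetical and not taken from the tinygrad examples:

```python
def load_resources() -> dict:
    # Hypothetical stand-in for expensive setup (datasets, device buffers, models).
    return {"data": list(range(4))}


def main() -> int:
    # Resources are created inside main(), so merely importing this module
    # (which a spawned child does) does not trigger the expensive setup.
    resources = load_resources()
    return sum(resources["data"])


if __name__ == "__main__":
    # Only the process that runs this file as a script reaches this branch.
    print(main())
```

Keeping module level free of side effects means a spawned worker pays only for the imports it needs.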

Restarting each child after a fixed number of tasks limits the amount of memory leaked in the child processes. For very large kernels, you might need to lower the limit further by setting BEAM_MAX_TASKS_PER_CHILD to a smaller number.
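The idea can be sketched with Python's standard `multiprocessing` module. This is an illustration under stated assumptions, not the code in tinygrad/features/search.py: the helper names, the placeholder task, and the default of 16 are all hypothetical, and only the BEAM_MAX_TASKS_PER_CHILD variable name comes from this PR.

```python
import multiprocessing
import os


def max_tasks_per_child(default: int = 16) -> int:
    # The default of 16 is an assumption for this sketch, not a value from the PR.
    return int(os.getenv("BEAM_MAX_TASKS_PER_CHILD", str(default)))


def square(x: int) -> int:
    # Placeholder task standing in for compiling and timing one kernel candidate.
    return x * x


def make_pool(workers: int = 2):
    # "spawn" starts each worker as a fresh interpreter instead of forking the parent.
    ctx = multiprocessing.get_context("spawn")
    # maxtasksperchild makes a worker exit after that many tasks and be replaced
    # by a new one, so memory leaked inside a worker is released periodically.
    return ctx.Pool(workers, maxtasksperchild=max_tasks_per_child())


if __name__ == "__main__":
    with make_pool() as pool:
        print(pool.map(square, range(8)))
```

A lower limit recycles workers more often, trading extra process startup time for a smaller peak memory footprint.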

For this run:

PYTHONPATH=. HSA=1 GPUS=6 LATEWINO=1 LATEBEAM=64 BEAM_UPCAST_MAX=16384 BEAM_LOCAL_MAX=512 BEAM_MIN_PROGRESS=5 BEAM_MAX_TASKS_PER_CHILD=32 STEPS=10 DEBUG=2 HALF=1 BS=1536 python3 ./examples/hlb_cifar10.py

the parent process still leaked up to 30GB by the end of a 4.5hr run; that leak is not addressed here. Total memory usage peaked at 80GB during one late WINO kernel, so in hindsight I should have run with a lower BEAM_MAX_TASKS_PER_CHILD.

Tested on HSA and CUDA. Running beam search inside the rocm/pytorch docker container didn't seem to work (it hangs), but that might be due to other issues.


Changes

Name                           Lines    Diff    Tokens/Line    Diff
---------------------------  -------  ------  -------------  ------
tinygrad/features/search.py      147      +1           17.0    -0.0


total lines changes: +1

@flammit flammit marked this pull request as draft March 21, 2024 19:10
@flammit flammit marked this pull request as ready for review March 21, 2024 19:54
@chenyuxyz (Collaborator) commented

I was searching for why spawn is better than fork and found python/cpython#84559. Is CPython's fork just broken?

Anecdotally, searching cifar on my 3090 machine killed the machine on master, and I can finish the search with this PR.

@geohot (Collaborator) commented Mar 22, 2024

Oh sweet, yea spawn is exactly how to fix the weird driver behavior!

@geohot geohot merged commit a26090d into tinygrad:master Mar 22, 2024
20 checks passed
@flammit flammit deleted the search_child_memory branch March 22, 2024 17:38
jaredeh pushed a commit to jaredeh/tinygrad that referenced this pull request Mar 24, 2024
…tinygrad#3862)

also clean up some examples to use __main__ and not initialize
resources outside of main