-
Notifications
You must be signed in to change notification settings - Fork 72
[phase closure] sum closure memory opt [ci skip] #338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
log.debug(f"shape of ifgs_breach_count is {ifgs_breach_count.shape}") | ||
log.debug(f"dtype of ifgs_breach_count is {ifgs_breach_count.dtype}") | ||
|
||
mpiops.comm.Reduce(ifgs_breach_count, ifgs_breach_count, op=MPI.SUM, root=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit which changes this to a capital R reduce. This fixes further issues that lower-case reduce was having with creating a 13 GB pickle object to communicate its part of the ifgs_breach_count_arr
back to the root node. Using Reduce
is safe because while mpiops.array_split
may result in different numbers of loops being sent to each process, the accumulated breach count arrays are all numpy arrays of the same size (height of ifg, width of ifg, # of ifgs)
so we can use preallocated buffers for our reduction operation (and the sum over axis 0 is default behaviour of MPI.SUM
so we don't need to mess with a custom sum0_op
). Still a couple of things to change before we can merge this:
- I'm not sure if it's safe to use the same buffer for sending and receiving for the reduction - if not we'll need to create an empty array of zeros
- This commit will break the non-MPI option, because I had to directly import mpi4py in
sum_closure.py
to getMPI.SUM
. Should be easy to fix this by exporting MPI.SUM out of mpiops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good pick up on the Reduce
use here Richard. Thank you.
5e974c1
to
2f287a2
Compare
2f287a2
to
7418e9a
Compare
Thanks Sudipta for your work debugging this PR and fixing up the buffer-based Reduce method. As far as I can tell it is now fully functional, so I'm happy to merge it now - might be worth getting @mcgarth 's opinion first |
memory optimisations during sum closure