feat: add shard execution workflow #1557
Could we fully decouple the workflow definition and execution from Nx? Ideally we would have a workflow like this:
And then we pass this to a ProcessExecutor which is completely independent of Nx and tensors. You could also have an Nx executor, but the overall idea is that the Executor should worry about resources and not necessarily tensors (except for the resources where the tensors are located).
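One way to picture the separation described above, as a minimal sketch in Python rather than Elixir. Everything here is hypothetical and for illustration only: the `Step` record, its `resource` field, and this `ProcessExecutor` class are assumptions, not the API proposed in the PR. The point is only that the workflow is plain data (steps plus dependencies) and the executor sees resources and opaque callables, never tensors.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Any, Callable


@dataclass(frozen=True)
class Step:
    """Hypothetical workflow step: a name, an opaque callable, its
    dependencies, and the resource it should run on."""
    name: str
    run: Callable[..., Any]
    deps: tuple = ()
    resource: str = "local"


class ProcessExecutor:
    """Hypothetical executor: it only knows step names, dependencies,
    and resources. It never inspects the values flowing between steps."""

    def execute(self, steps):
        by_name = {s.name: s for s in steps}
        # Map each step to its predecessors and order them topologically.
        graph = {s.name: set(s.deps) for s in steps}
        results, placement = {}, []
        for name in TopologicalSorter(graph).static_order():
            step = by_name[name]
            args = [results[d] for d in step.deps]
            placement.append((name, step.resource))
            results[name] = step.run(*args)
        return results, placement


# The workflow definition is plain data; nothing here refers to tensors.
workflow = [
    Step("load", lambda: [1, 2, 3, 4]),
    Step("left", lambda xs: sum(xs[:2]), deps=("load",), resource="worker_0"),
    Step("right", lambda xs: sum(xs[2:]), deps=("load",), resource="worker_1"),
    Step("combine", lambda a, b: a + b, deps=("left", "right")),
]

results, placement = ProcessExecutor().execute(workflow)
```

In this sketch the executor runs steps sequentially in one process; a real executor would dispatch each step to the process named by its `resource`. An Nx-specific executor could reuse the same workflow definition and differ only in how it places and moves data.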
We talked about this, but just to register it here: I believe this is possible, though it is probably easier to generalize after we have things ready. Each workflow step needs to be able to determine:
Most of this is already present in the current code; the coupling comes mainly from how we identify the dependencies based on data section ids.
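To illustrate what identifying dependencies from ids can look like in general, here is a small Python sketch. It assumes, purely for illustration, that each step declares which section ids it consumes and which it produces; the `consumes`/`produces` keys and the `derive_dependencies` helper are hypothetical and do not describe how the current code represents data sections.

```python
def derive_dependencies(steps):
    """Return {step_name: set of step names it depends on}.

    `steps` maps a step name to a dict with hypothetical keys
    "consumes" and "produces", each a list of section ids.
    """
    # Record which step produces each section id.
    producer = {}
    for name, spec in steps.items():
        for section_id in spec["produces"]:
            producer[section_id] = name

    # A step depends on whichever steps produce the ids it consumes.
    # Ids with no producer are treated as external inputs.
    return {
        name: {producer[s] for s in spec["consumes"] if s in producer}
        for name, spec in steps.items()
    }


steps = {
    "shard_a": {"consumes": ["input"], "produces": ["s0"]},
    "shard_b": {"consumes": ["input"], "produces": ["s1"]},
    "merge": {"consumes": ["s0", "s1"], "produces": ["out"]},
}
deps = derive_dependencies(steps)
```

If dependencies were exposed through an interface of this kind, an executor would only need the resulting graph, which is the part that could be generalized away from Nx later.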
dce2c60 into pv-feat/experimental-sharding-backend
Adds the initial version of the process communication structure for sharded execution.
It does not yet handle container outputs for the sharded function,
and it does not yet bring everything together into the compiler jit function.