add placement policy for NAP workload

Open zxhe-sean opened this issue 2 months ago • 3 comments

Description

Support creating new workload policies in xpk workload create. This enables scheduling different topologies within the same GKE cluster through NAP, which is very important for sharing capacity easily across multiple test users.

Issue

https://b.corp.google.com/issues/455642310

NAP previously supported creating workload policies only during cluster create, not during workload create. That limited NAP to just one topology per cluster, which made NAP not useful! In that scenario we would just use static clusters.

Testing

Created a tpu7x NAP cluster, created workloads of different sizes, and saw node pools created and the workloads running.

zxhe-sean, Nov 13 '25 21:11

🤖 Hi @zxhe-sean, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

github-actions[bot], Nov 13 '25 21:11

@scaliby can you help get https://github.com/AI-Hypercomputer/xpk/pull/828 over the finish line? It was relatively easy to support creating workload policies in xpk workload create, but we hacked it by copying some code to avoid a circular dependency in the file imports.

Maybe pull that PR and modify it? We also didn't change scheduler_test.py. Thank you!

Obliviour, Nov 14 '25 16:11

@Obliviour sub-slicing support is just around the corner, so let's hold off on merging this. This will be supported through sub-slicing, and you should be able to use it tomorrow.

scaliby, Nov 17 '25 11:11

Closing as this was addressed in https://github.com/AI-Hypercomputer/xpk/pull/847

scaliby, Nov 20 '25 09:11