
Enable Extensible and Policy-rich Scheduling for Machines

Open · hardikdr opened this issue 3 years ago · 1 comment

Summary

This issue aims to enable extensible, policy-rich scheduling for VirtualMachines and MetalMachines in the onmetal-api. The proposed approach is to make schedulers pluggable: custom schedulers can be added to or removed from the cluster, and each one acts on a specific set of VirtualMachines.

There are two main objectives:

  1. The first action item is to introduce the Machine.Spec.SchedulerName API, which lets users specify which scheduler should dispatch a particular machine. If the field is not specified, the default scheduler is used. Like the Pod.Spec.SchedulerName API in Kubernetes, this provides greater flexibility and customization options for scheduling VirtualMachines.

  2. The second action item is to enhance the MachinePool status to include utilization information about the host, such as CPU, memory, and hugepages, among other parameters. The utilization information will be similar to that of the node API in Kubernetes. This gives custom schedulers the data they need to make scheduling decisions that are better aligned with the needs of the applications running on the VirtualMachines. For example, a custom scheduler could place VirtualMachines on hosts that have more available resources, so that the applications running on those VirtualMachines are not resource-starved.
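
The dispatch rule from the first objective can be sketched in Go as follows. This is only an illustration of the intended semantics, not the onmetal-api implementation: the trimmed `Machine` and `MachineSpec` types, the `DefaultSchedulerName` constant, the helper functions, and the `gpu-scheduler` name are all hypothetical. The only behavior taken from this issue is that an unset scheduler name falls back to the default scheduler.

```go
package main

import "fmt"

// DefaultSchedulerName is an assumed fallback name; it mirrors the
// "default-scheduler" value used in the example in this issue.
const DefaultSchedulerName = "default-scheduler"

// MachineSpec is a trimmed, illustrative stand-in for the real Machine spec.
type MachineSpec struct {
	SchedulerName string
}

// Machine is a trimmed, illustrative stand-in for the real Machine type.
type Machine struct {
	Name string
	Spec MachineSpec
}

// EffectiveSchedulerName returns the scheduler that should handle m,
// falling back to the default when the field is unset.
func EffectiveSchedulerName(m Machine) string {
	if m.Spec.SchedulerName == "" {
		return DefaultSchedulerName
	}
	return m.Spec.SchedulerName
}

// IsResponsibleFor reports whether the scheduler named schedulerName
// should act on m. Every other scheduler ignores the machine.
func IsResponsibleFor(schedulerName string, m Machine) bool {
	return EffectiveSchedulerName(m) == schedulerName
}

func main() {
	machines := []Machine{
		{Name: "machine-hd4"}, // no scheduler set
		{Name: "machine-gpu", Spec: MachineSpec{SchedulerName: "gpu-scheduler"}},
	}
	for _, m := range machines {
		fmt.Printf("%s -> %s\n", m.Name, EffectiveSchedulerName(m))
	}
}
```

With this rule, several schedulers can watch the same set of Machines without conflicting, because each Machine resolves to exactly one scheduler name.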

Basic example

  1. Enhance Machine API:
apiVersion: compute.api.onmetal.de/v1alpha1
kind: Machine
metadata:
  name: machine-hd4
spec:
  schedulerName: default-scheduler

  2. Enhance MachinePool API:
  status:
    allocatable:
      cpu: "48"
      ephemeral-storage: "1416167347928"
      hugepages-1Gi: 400Gi
      hugepages-2Mi: "0"
    capacity:
      cpu: "48"
      ephemeral-storage: 1536639920Ki
      hugepages-1Gi: 400Gi
      hugepages-2Mi: "0"
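
A custom scheduler could consume such a status roughly as sketched below. This is a hypothetical example: the `Resources` and `MachinePool` types, the `PickPool` function, and the pool names are invented for illustration, plain integers stand in for Kubernetes-style resource quantities, and `Allocatable` is assumed to describe what is still available for new machines. The policy shown (prefer the pool with the most allocatable CPU, then memory) is just one possible choice.

```go
package main

import "fmt"

// Resources is a simplified stand-in for a resource list. A real API
// would likely use Kubernetes resource quantities instead of integers.
type Resources struct {
	CPU         int64 // cores
	MemoryBytes int64
}

// MachinePool is a trimmed, illustrative view of a pool and its status.
type MachinePool struct {
	Name        string
	Allocatable Resources
}

// fits reports whether the request can be satisfied by the available resources.
func fits(req, avail Resources) bool {
	return req.CPU <= avail.CPU && req.MemoryBytes <= avail.MemoryBytes
}

// moreFree reports whether a has more allocatable CPU than b,
// using memory as the tie-breaker.
func moreFree(a, b Resources) bool {
	if a.CPU != b.CPU {
		return a.CPU > b.CPU
	}
	return a.MemoryBytes > b.MemoryBytes
}

// PickPool returns the name of the pool with the most allocatable
// resources among the pools that can fit req. The boolean is false
// when no pool fits.
func PickPool(pools []MachinePool, req Resources) (string, bool) {
	best := -1
	for i, p := range pools {
		if !fits(req, p.Allocatable) {
			continue
		}
		if best == -1 || moreFree(p.Allocatable, pools[best].Allocatable) {
			best = i
		}
	}
	if best == -1 {
		return "", false
	}
	return pools[best].Name, true
}

func main() {
	pools := []MachinePool{
		{Name: "pool-a", Allocatable: Resources{CPU: 48, MemoryBytes: 256 << 30}},
		{Name: "pool-b", Allocatable: Resources{CPU: 16, MemoryBytes: 64 << 30}},
	}
	name, ok := PickPool(pools, Resources{CPU: 8, MemoryBytes: 32 << 30})
	fmt.Println(name, ok)
}
```

Other policies, such as bin-packing onto the fullest pool that still fits, would only require changing the comparison in `moreFree`.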

Motivation

Enable more efficient scheduling of Machines.

hardikdr, Feb 16 '23 10:02

cc @gehoern @adracus

hardikdr, Feb 16 '23 10:02