Version: 0.1.0

Devices

Administrators can add, edit, and delete accelerator devices directly from the dashboard UI. Built-in devices (gpu, rngd, rbln) are read-only and cannot be edited or deleted.

Admin only

This menu appears only for accounts that have the nufi-admin role.

Prerequisites

Before registering a custom device, install the following two components, the Device Plugin and the Metric Exporter, in the Kubernetes cluster where NuFi is deployed.

You can check whether they are installed with the following commands.

# Verify the Device Plugin: the custom resource key must appear under the node's Capacity.
kubectl describe node <node-name> | grep -i <resource-key>
# e.g. kubectl describe node <node-name> | grep -i nvidia.com/gpu

# Verify the Metric Exporter: an exporter Pod should be running so that Prometheus can collect its metrics.
kubectl get pods -A | grep -i exporter

| Component | Role |
| --- | --- |
| Device Plugin | Registers custom resources such as nvidia.com/gpu under the Kubernetes node Capacity. Lets NuFi schedule Pods using that resource key. |
| Metric Exporter | Collects Prometheus metrics (e.g., DCGM Exporter) and provides utilization, temperature, power, and memory data to the monitoring dashboard. |

Resource Key validation

When you register a device, the Resource Key you enter is checked against the node Capacity in the cluster via kubectl describe node. If the resource does not exist in the cluster, registration is rejected, so register the device only after the Device Plugin is properly installed.
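
The same check can be reproduced by hand before opening the wizard. The following is a minimal sketch, not NuFi's actual implementation: `has_capacity_key` is a hypothetical helper that reads `kubectl describe node` output on standard input and succeeds only when the key is listed inside the Capacity block. Restricting the match to that block avoids false positives from labels or events elsewhere in the output, which a plain grep over the whole output could hit.

```shell
# Hypothetical helper (not part of NuFi): succeed if the resource key is listed
# under "Capacity:" in `kubectl describe node` output read from stdin.
has_capacity_key() {
  awk -v key="$1" '
    /^Capacity:/ { in_cap = 1; next }        # start of the Capacity block
    /^[^ \t]/    { in_cap = 0 }              # any unindented line ends the block
    in_cap && $1 == (key ":") { found = 1 }  # e.g. "nvidia.com/gpu:"
    END { exit !found }                      # exit status 0 only when found
  '
}

# Usage against a live cluster (the node name is a placeholder):
# kubectl describe node <node-name> | has_capacity_key nvidia.com/gpu && echo "found"
```

A non-zero exit status suggests the Device Plugin has not registered the resource on that node yet.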


Device list

Click Admin > Devices in the left sidebar.

[Figure: Admin Devices list]

Click a device row in the list to open that device's detail page. Clicking Edit or Delete runs the selected action instead of opening the detail page.

Columns

| Column | Description |
| --- | --- |
| Label | Name displayed in the dashboard |
| Resource Key | Kubernetes resource request key (e.g., nvidia.com/gpu) |
| Vendor | Manufacturer |
| Type | Built-in or Custom (admin-added) badge |

Action buttons

| Button | Target | Description |
| --- | --- | --- |
| Edit | Custom device | Edit device information |
| Delete | Custom device | Delete the device (not allowed for Built-in devices) |

Device detail

The device detail page shows the selected device configuration in read-only mode. The page header shows the device name, a Built-in or Custom badge, and a button to return to the list.

[Figure: Admin Devices detail]

Basic information

| Item | Description |
| --- | --- |
| Device Key | Unique device key |
| Resource Key | Kubernetes resource request key |
| Device Name | Device name displayed in the dashboard |
| Vendor Name | Manufacturer name |
| Memory Name | Label for the device memory type |

Monitoring settings

| Item | Description |
| --- | --- |
| Node Label | Prometheus label key that identifies the node |
| Device Label | Prometheus label key that identifies the device |
| Pod Label Key | Label key that identifies the Pod assigned to the device. Hidden when empty. |
| Namespace Label Key | Label key that identifies the Namespace assigned to the device. Hidden when empty. |
| Utilization | Accelerator utilization metric name |
| Memory | Memory calculation mode, memory unit, Used metric, and Total or Free metric |
| Temperature | Temperature metric name. Hidden when empty. |
| Power | Power metric name. Hidden when empty. |
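
To make the role of these fields concrete, the sketch below shows how a node label, a device label, and a metric name could be combined into a per-device PromQL expression. It is an illustration only: the queries NuFi actually issues are not documented on this page, `build_device_query` is a hypothetical helper, and the example values (`Hostname`, `gpu`, `DCGM_FI_DEV_GPU_UTIL`) assume the default label and metric names of the NVIDIA DCGM Exporter.

```shell
# Hypothetical helper: build a PromQL expression that averages one metric
# per (node, device) pair. Arguments: node label, device label, metric name.
build_device_query() {
  printf 'avg by (%s, %s) (%s)\n' "$1" "$2" "$3"
}

# Example values assume the NVIDIA DCGM Exporter defaults.
build_device_query Hostname gpu DCGM_FI_DEV_GPU_UTIL
# prints: avg by (Hostname, gpu) (DCGM_FI_DEV_GPU_UTIL)
```

If a query of this shape returns no series in Prometheus, the label keys or metric name entered for the device probably do not match what the exporter emits.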

The Memory item displays either the Used & Total or Used & Free badge according to the registered calculation mode, together with the memory unit.
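
The two calculation modes differ only in how the total is obtained. As a rough sketch, assuming that Used & Free derives the total as used plus free (this page does not spell out the formula), the hypothetical helper `memory_percent` below returns memory usage as a percentage for either mode.

```shell
# Hypothetical helper: memory usage in percent.
# Arguments: mode ("total" or "free"), the used value, and the total-or-free
# value, all expressed in the same memory unit.
memory_percent() {
  awk -v mode="$1" -v used="$2" -v other="$3" 'BEGIN {
    total = (mode == "free") ? used + other : other   # assumed formula
    if (total <= 0) exit 1                            # avoid division by zero
    printf "%.1f\n", used / total * 100
  }'
}

memory_percent total 20480 81920   # Used & Total mode
memory_percent free  20480 61440   # Used & Free mode
```

Both calls print 25.0, because 20480 used plus 61440 free gives the same 81920 total that the first call passes directly.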


Add / edit a custom device

Click the Create or Edit button on the list page to start a three-step wizard.

The example below uses an NVIDIA GPU (nvidia.com/gpu) installed in the cluster.

Step 1. Basic information

| Field | Description |
| --- | --- |
| Device Key | Unique device key (lowercase letters, digits, and hyphens). Cannot be changed when editing. |
| Resource Key | Kubernetes node resource key (e.g., nvidia.com/gpu). When you finish typing, its availability in the cluster is checked automatically via kubectl describe node. |
| Device Name | Display name shown in the dashboard. Shown together with Vendor Name, separated by a space. |
| Vendor Name | Manufacturer name. Shown together with Device Name, separated by a space. |
| Memory Name | Label for the device memory type (e.g., VRAM) |
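
The Device Key format can be checked locally before you type it into the wizard. The sketch below is an assumption about the rule as stated in the table (lowercase letters, digits, and hyphens only); `is_valid_device_key` is a hypothetical helper, and the dashboard may enforce additional constraints, such as a length limit, that are not described here.

```shell
# Hypothetical check: accept only non-empty keys made of lowercase letters,
# digits, and hyphens. The function body runs in a subshell so that the
# locale override stays local to the check.
is_valid_device_key() (
  LC_ALL=C   # make a-z mean ASCII lowercase letters only
  case "$1" in
    ''|*[!a-z0-9-]*) exit 1 ;;   # empty, or contains a disallowed character
    *) exit 0 ;;
  esac
)

is_valid_device_key my-gpu-01 && echo "my-gpu-01: ok"
is_valid_device_key My_GPU    || echo "My_GPU: rejected"
```

Because the Device Key cannot be changed when editing, it is worth settling on a valid, stable key before the first registration.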