Version: Next

What is NuFi?

NuFi is an NPUOps platform that handles NPU serving and operations on top of existing MLOps ecosystems.

When you bring models trained in existing MLOps environments (Kubeflow, MLflow, etc.) into NuFi, everything from NPU porting to serving deployment, operations, and monitoring is handled in a single platform. NuFi does not replace your training infrastructure — it lets you keep your existing MLOps investment while extending into NPU serving.

Through the web dashboard, you can manage Labs (development environments), Serving (model serving), Volumes (storage), and more.

Problems We Solve

  • Running models trained in an existing MLOps environment on NPUs requires building a separate quantization and compilation environment
  • Once the model is prepared, setting up the NPU serving environment takes additional time
  • Strictly air-gapped environments cannot adopt AI platforms that depend on internet access
  • It is difficult to compare, with real data, how power-efficient NPUs actually are versus GPUs
  • When a service fails, it takes time to identify which of many devices is causing the issue
  • Security policies that prohibit external AI services make it hard to deploy chatbots built on in-house documents

NPUOps Pipeline

Once a model trained in existing MLOps is brought into NuFi, NuFi takes over from NPU porting through serving and operations. With MLflow integration you can pull models directly from the model registry, and because the training and serving environments share the same storage, deployment is immediate — no extra file movement required.
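
As a rough illustration of the registry-plus-shared-storage idea above, the sketch below resolves a registered model version to a path on a shared volume instead of copying files. It is not NuFi's actual integration code. It assumes an MLflow tracking server whose artifact store is a filesystem path mounted in both environments, and the helper names `local_path_from_uri` and `resolve_registered_model` are hypothetical.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse


def local_path_from_uri(artifact_uri: str) -> Path:
    """Map a file-style artifact URI to a filesystem path (POSIX paths assumed)."""
    parsed = urlparse(artifact_uri)
    if parsed.scheme not in ("", "file"):
        # Object stores (s3://, gs://, ...) would need a download step instead.
        raise ValueError(f"not on a shared filesystem: {artifact_uri}")
    return Path(unquote(parsed.path))


def resolve_registered_model(name: str, version: int) -> Path:
    """Look up where a registered model version is stored, without copying it."""
    # Imported lazily so the URI helper above works without MLflow installed.
    from mlflow.tracking import MlflowClient

    client = MlflowClient()  # uses MLFLOW_TRACKING_URI from the environment
    uri = client.get_model_version_download_uri(name, str(version))
    return local_path_from_uri(uri)
```

With a tracking server configured, `resolve_registered_model("my-model", 1)` would return the directory that a serving process on the same volume could load directly. The model name and version here are placeholders.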

Supported Devices

| Vendor | Device |
|---|---|
| NVIDIA | CUDA-capable GPU |
| FuriosaAI | RNGD |

Devices not on the supported list can still be managed in NuFi through manual registration.

Differences from Existing Solutions

NuFi does not replace existing MLOps — it plugs in as the NPU serving layer.

| Category | NuFi (NPUOps) | MLOps OSS (Kubeflow, etc.) | NPU Vendor Solutions |
|---|---|---|---|
| Role | NPU porting, serving, operations | Model training, experiment tracking | NPU serving only |
| NPU serving support | GPU · RNGD | None | Vendor's own chip only |
| Quantization, compile, NPU porting | Complete within the platform | Separate setup required | Serving layer only |
| Existing MLOps integration | MLflow integration | N/A | |
| NPU vs GPU performance/watt comparison | Direct in-dashboard comparison | None | None |
| Airgap installation | Single bundle, single command | Manual setup required | Partially possible |
| RBAC · Audit log | Keycloak integration, full history | Separate setup required | Limited |
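
The performance/watt row above comes down to simple arithmetic: throughput divided by average power draw over the same benchmark window. The sketch below shows that calculation and is not how the NuFi dashboard computes it. The `BenchmarkRun` type is hypothetical, and the numbers are placeholders chosen for easy arithmetic, not measurements of any real device.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class BenchmarkRun:
    device: str
    requests_completed: int
    duration_s: float
    power_samples_w: Sequence[float]  # power readings taken during the run

    @property
    def throughput(self) -> float:
        """Requests served per second."""
        return self.requests_completed / self.duration_s

    @property
    def perf_per_watt(self) -> float:
        """Requests per second per watt of average power draw."""
        return self.throughput / mean(self.power_samples_w)


def efficiency_ratio(a: BenchmarkRun, b: BenchmarkRun) -> float:
    """How many times more work per watt run `a` delivers than run `b`."""
    return a.perf_per_watt / b.perf_per_watt


# Placeholder inputs only; substitute values from real benchmark runs.
device_a = BenchmarkRun("device_a", 3000, 60.0, [100.0, 100.0])
device_b = BenchmarkRun("device_b", 6000, 60.0, [300.0, 300.0])
ratio = efficiency_ratio(device_a, device_b)
```

In this made-up case `device_b` has twice the raw throughput but `device_a` does more work per watt. That is the kind of trade-off a same-workload comparison is meant to surface.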