What is NuFi?
NuFi is an NPUOps platform that handles NPU serving and operations on top of existing MLOps ecosystems.
When you bring models trained in existing MLOps environments (Kubeflow, MLflow, etc.) into NuFi, a single platform handles everything from NPU porting to serving deployment, operations, and monitoring. NuFi does not replace your training infrastructure: you keep your existing MLOps investment and extend it to NPU serving.
Through the web dashboard, you can manage Labs (development environments), Serving (model serving), Volumes (storage), and more.
Problems We Solve
- Running models trained in an existing MLOps environment on NPUs requires building a separate quantization and compilation environment
- After model preparation, additional time is needed to set up the NPU serving environment
- Strictly air-gapped environments cannot adopt AI platforms that depend on internet access
- It is difficult to compare, with real data, how power-efficient NPUs actually are versus GPUs
- When a service fails, it takes time to identify which of many devices is causing the issue
- Security policies prohibit external AI services, making it hard to deploy chatbots built on in-house documents
NPUOps Pipeline
Once a model trained in an existing MLOps environment is brought into NuFi, NuFi takes over from NPU porting through serving and operations. With MLflow integration, you can pull models directly from the model registry. Because the training and serving environments share the same storage, a model can be deployed right away, with no extra file movement.
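To give a feel for what pulling from a model registry involves on the MLflow side, the sketch below builds a `models:/` URI and downloads the artifacts with the MLflow client. It is an illustration only and does not describe NuFi's internal implementation. The helper names `registry_uri` and `fetch_model` and the model name `sentiment-classifier` are hypothetical, and `fetch_model` assumes `mlflow` is installed and that `MLFLOW_TRACKING_URI` points at your tracking server.

```python
from pathlib import Path
from typing import Union


def registry_uri(name: str, version: Union[int, str]) -> str:
    """Build an MLflow model-registry URI of the form models:/<name>/<version>."""
    # Reject empty names and names with '/', so the URI stays unambiguous.
    if not name or "/" in name:
        raise ValueError("model name must be non-empty and contain no '/'")
    return f"models:/{name}/{version}"


def fetch_model(name: str, version: Union[int, str], dst: str) -> Path:
    """Download a registered model's artifacts into dst and return the local path."""
    # Imported lazily so registry_uri works even without mlflow installed.
    import mlflow.artifacts

    Path(dst).mkdir(parents=True, exist_ok=True)
    local = mlflow.artifacts.download_artifacts(
        artifact_uri=registry_uri(name, version), dst_path=dst
    )
    return Path(local)


if __name__ == "__main__":
    # Hypothetical model name and version, for illustration only.
    print(registry_uri("sentiment-classifier", 3))
```

When training and serving share storage, a download step like `fetch_model` may not be needed at all; it is shown here only to make the registry lookup concrete.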
Supported Devices
| Vendor | Device |
|---|---|
| NVIDIA | CUDA-capable GPU |
| FuriosaAI | RNGD |
Devices not on the supported list can still be managed in NuFi through manual registration.
Differences from Existing Solutions
NuFi does not replace existing MLOps — it plugs in as the NPU serving layer.
| Category | NuFi (NPUOps) | MLOps OSS (Kubeflow, etc.) | NPU Vendor Solutions |
|---|---|---|---|
| Role | NPU porting, serving, operations | Model training, experiment tracking | NPU serving only |
| NPU serving support | GPU · RNGD | None | Vendor's own chip only |
| Quantization, compile, NPU porting | Complete within the platform | Separate setup required | Serving layer only |
| Existing MLOps integration | MLflow integration | — | N/A |
| NPU vs GPU performance/watt comparison | Direct in-dashboard comparison | None | None |
| Airgap installation | Single bundle, single command | Manual setup required | Partially possible |
| RBAC · Audit log | Keycloak integration, full history | Separate setup required | Limited |
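
The performance-per-watt comparison in the table above comes down to dividing serving throughput by average power draw over the same measurement window. A minimal sketch of that arithmetic follows. The `Measurement` type, the device labels, and every number are made-up placeholders, not NuFi output or benchmark results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    device: str            # label only, e.g. "GPU" or "NPU"
    requests_per_s: float  # sustained serving throughput
    avg_power_w: float     # average power draw over the same window

    def perf_per_watt(self) -> float:
        """Requests per second per watt, i.e. requests per joule."""
        if self.avg_power_w <= 0:
            raise ValueError("average power must be positive")
        return self.requests_per_s / self.avg_power_w


def efficiency_ratio(a: Measurement, b: Measurement) -> float:
    """How many times more work per watt device a delivers than device b."""
    return a.perf_per_watt() / b.perf_per_watt()


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not benchmark results.
    gpu = Measurement("GPU", requests_per_s=100.0, avg_power_w=400.0)
    npu = Measurement("NPU", requests_per_s=90.0, avg_power_w=150.0)
    print(f"{efficiency_ratio(npu, gpu):.2f}x")
```

A fair comparison needs both devices to serve the same model and workload, with power sampled over the same interval as throughput.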