Skip to main content
Version: Next

Quick Start

This guide walks you through the full flow of logging into NuFi for the first time, deploying an AI model, and testing it, in 5 steps.

Estimated time: about 10 minutes (excluding model loading time)


Prerequisites

Confirm the following before starting the Quick Start.

ConditionHow to check
NuFi installation completeAsk your administrator, or see the Installation Guide
Dashboard accessibleOpen the NuFi dashboard URL in a browser and verify access
Account issuedReceive a login account (email) from your administrator

Step 1: Log In

Dashboard login

Click the ① Login button in the upper right. OAuth-based authentication is supported.

After login, the project selection dropdown in the top header becomes active.


Step 2: Create a Project

A project is the unit of resource isolation. Labs, Servings, and Volumes are all managed per project.

Project creation

  1. Click the ② New Project button in the top header.
  2. Enter the project name.
    • Lowercase letters, digits, and hyphens (-) / at least 3 characters (e.g., my-first-project)
  3. Click the Create button.

The created project is automatically selected in the dropdown.


Step 3: Deploy a Model

In the left sidebar, click Serving.

Serving list

  1. Click the ① Create button.
  2. Fill in Step 1 — Basic information.
    • Service Name: deployment name (e.g., llm-demo)
    • Service Template: the inference framework to use (e.g., vLLM)
  3. Click Next to move to Step 2 — Detailed settings.
    • Image: select the inference server container image
    • Accelerator Type / Count: select the GPU or NPU to use
    • When using vLLM, enter the Huggingface model name in the Model field
  4. Click Next and review the settings in Step 3 — Review & Deploy.
  5. Click the Deploy button.

When you return to the Serving list, the status changes from Pending → Running. Model loading may take several minutes.


Step 4: Test in the Playground

Once the status is Running, click Playground in the left sidebar.

  1. From the dropdown at the top of the panel, select the Serving you just deployed.
  2. Enter a message in the bottom input box and click the Send button.
  3. Watch the model response stream in.

Using the + button you can add panels and place multiple Servings side by side to compare responses.


Step 5: Find the API Endpoint

A deployed model provides an OpenAI-compatible API endpoint.

Serving list

Click the ② Connect button to find the endpoint URL.

The endpoint URL has the form https://<host>/<project>/<deployment-name>. Append the inference framework's API path (e.g., /v1/chat/completions) to call it.

curl -X POST 'https://<deployment-endpoint>/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <token>' \
-d '{
"model": "<model-name>",
"messages": [{"role": "user", "content": "Hello"}]
}'