Version: Next

Serve a Model Downloaded in Lab

Download a Hugging Face model manually from Jupyter Lab, register the files in Model Artifacts, and deploy them as a Serving. Use this path when you need to run download scripts yourself or modify files before registration.

Prerequisites

A Volume for storing model files. If you need a new one, see Volumes.
A Jupyter-based Lab. See Lab.

1. Mount a Volume in Lab

When creating the Lab, add the model storage Volume to Data Volumes and set the mount path to /data.

Field	Example
Name	`tutorial-lab`
Server Type	`Jupyter`
Data Volumes	`tutorial-volume` -> `/data`

Lab creation form

2. Download the Model

When the Lab is Running, click Connect to open Jupyter. Open File > New > Terminal and download the model.

pip install -U huggingface_hub

hf download Qwen/Qwen2.5-0.5B-Instruct \
  --local-dir /data/Qwen2.5-0.5B-Instruct

For private repositories, log in with a token first.

hf login --token $HF_TOKEN

3. Register the Model

In the left sidebar, click Model Artifacts and run Register Model.

Field	Example
Model Name	`qwen-instruct-tutorial`
Version	`v1`
Volume	`tutorial-volume`
Path	`Qwen2.5-0.5B-Instruct`
Format	`SafeTensors`

Add artifact

When validation succeeds, register the model version.

4. Choose a Serving Path

For GPU serving, run Quick Deploy from the model detail page.

For NPU serving, compile the source artifact first in Model Compilations. When compilation reaches Succeeded, run Quick Deploy with the generated NPU artifact.

5. Create the Serving

In the Quick Deploy dialog, confirm the model, version, and artifact, then enter a Serving name.

Field	Example
Service Name	`lab-downloaded-model-serving`
Version	`v1`
Artifact	Source artifact for GPU serving, compiled artifact for NPU serving

The deployment is complete when the Serving status becomes Running.

Next Steps

To check the serving model's response, continue to Test Responses in Playground.

To check device and node metrics, continue to Check Metrics in Monitoring.

Prerequisites​

1. Mount a Volume in Lab​

2. Download the Model​

3. Register the Model​

4. Choose a Serving Path​

5. Create the Serving​

Next Steps​