Documentation Index
Fetch the complete documentation index at: https://docs.trainy.ai/llms.txt
Use this file to discover all available pages before exploring further.
Note that some models may require authentication through Hugging Face tokens, which can be done using konduktor secret (see complex example here). The model deepseek-ai/DeepSeek-R1-Distill-Llama-8B does not require one.
Prerequisites
Current Working Directory
Launching
$ konduktor serve launch deployment.yaml
Deployment.yaml
# no autoscaling + default port (8000) + single GPU
name: serving-vllm-simple
resources:
cpus: 4
memory: 32
accelerators: A100:1
image_id: vllm/vllm-openai:v0.7.1
labels:
kueue.x-k8s.io/queue-name: user-queue
serving:
min_replicas: 1
run: |
python3 -m vllm.entrypoints.openai.api_server \
--model deepseek-ai/DeepSeek-R1-Distill-Llama-8B \
--max-model-len 4096