Documentation Index
Fetch the complete documentation index at: https://docs.trainy.ai/llms.txt
Use this file to discover all available pages before exploring further.
This example also demonstrates the creation and use of --kind=env secrets using konduktor secret create. This is required for some models such as meta-llama/Meta-Llama-3.1-8B-Instruct, which require Hugging Face tokens for authentication.
Prerequisites
Setup
- Create a
--kind=env secret for your HF token called my-hf-token
$ konduktor secret create --kind=env --inline HUGGING_FACE_HUB_TOKEN=hf_ABC123 my-hf-token
- Check that the secret was properly created with:
For more details, check out the setup of secrets here.
Current Working Directory
Launching
$ konduktor serve launch deployment.yaml
Deployment.yaml
# autoscaling + custom port + multi GPU
name: serving-vllm-complex
resources:
cpus: 4
memory: 32
accelerators: A100:2
image_id: vllm/vllm-openai:v0.7.1
labels:
kueue.x-k8s.io/queue-name: user-queue
serving:
min_replicas: 0
max_replicas: 2
ports: 9000
run: |
python3 -m vllm.entrypoints.openai.api_server \
--uvicorn-log-level warning \
--model meta-llama/Meta-Llama-3.1-8B-Instruct \
--max-model-len 8192 \
--tensor-parallel-size 2 \
--dtype half