Inference keys

Inference keys authenticate calls to the LLM API. Each key is created within a project and carries its own rate limits and token budget.

Creating a key

Under Inference API → Keys, choose a project and create a key. The full secret is shown once at creation — copy it immediately into your secret manager. Afterwards only a masked prefix is visible.

Mint a key

curl -X POST https://api.upgreat.ai/v1/projects/{projectId}/inference/keys \
  -H "Authorization: Bearer <access-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "production-backend",
    "rpm": 600,
    "tpm": 200000,
    "token_limit": 5000000,
    "token_limit_period": "month",
    "allowed_models": ["qwen3.6-27b", "bge-m3"]
  }'

Scoping a key

Each key can be constrained so a leak or a runaway job is contained:

rpm — maximum requests per minute.
tpm — maximum tokens per minute.
token_limit with token_limit_period — a hard token budget per minute, hour, day, week or month.
allowed_models — restrict the key to specific model IDs. Omit to allow every model the project can access.
expires_at — an optional expiry timestamp, after which the key stops working.
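
As an illustration, the fields above could be combined into a short-lived, tightly scoped key for a single job. This is a minimal sketch, not a reference request: the helper names, the limit values, the ACCESS_TOKEN and PROJECT_ID environment variables, and the RFC 3339 timestamp format for expires_at are all assumptions.

```shell
# Build the JSON body for a tightly scoped key.
# Assumption: expires_at accepts an RFC 3339 UTC timestamp.
# Assumption: the name contains no characters that need JSON escaping.
scoped_key_payload() {
  cat <<EOF
{
  "name": "$1",
  "rpm": 60,
  "tpm": 20000,
  "token_limit": 100000,
  "token_limit_period": "day",
  "allowed_models": ["bge-m3"],
  "expires_at": "$2"
}
EOF
}

# Send the request. Requires ACCESS_TOKEN and PROJECT_ID in the environment.
create_scoped_key() {
  curl -sS -X POST "https://api.upgreat.ai/v1/projects/${PROJECT_ID}/inference/keys" \
    -H "Authorization: Bearer ${ACCESS_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(scoped_key_payload "$1" "$2")"
}
```

A call might look like create_scoped_key ci-embeddings 2030-01-01T00:00:00Z. Keeping the payload in its own function makes it easy to print and review the body before anything is sent.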

One key per workload

Give each service or environment its own key. You get clean per-key usage attribution and can revoke one workload without disrupting the others.
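
One way to apply this is to mint the keys in a loop, one per workload. The sketch below is hedged: the workload names are hypothetical, the rpm and tpm values are placeholders taken from the creation example, and it assumes a body with only these three fields is accepted. It also assumes ACCESS_TOKEN and PROJECT_ID are set in the environment.

```shell
# Build a request body for one workload's key.
# The rpm/tpm values are placeholders; tune them per workload.
# Assumption: the name contains no characters that need JSON escaping.
workload_key_payload() {
  printf '{"name": "%s", "rpm": 600, "tpm": 200000}' "$1"
}

# Create one key for each workload name passed as an argument.
# Requires ACCESS_TOKEN and PROJECT_ID in the environment.
create_workload_keys() {
  for workload in "$@"; do
    curl -sS -X POST "https://api.upgreat.ai/v1/projects/${PROJECT_ID}/inference/keys" \
      -H "Authorization: Bearer ${ACCESS_TOKEN}" \
      -H "Content-Type: application/json" \
      -d "$(workload_key_payload "$workload")"
    echo  # newline between responses
  done
}
```

For example, create_workload_keys api-prod api-staging batch-embeddings would issue three requests. Each response contains a secret shown only once, so capture the output somewhere safe rather than letting it scroll past in a terminal.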

Using a key

Send the key to the LLM API as a bearer token. See the LLM quickstart for full examples.

curl https://llm.upgreat.ai/v1/chat/completions \
  -H "Authorization: Bearer <inference-key>" \
  -H "Content-Type: application/json" \
  -d '{ "model": "qwen3.6-27b", "messages": [{"role": "user", "content": "Hi"}] }'

Rotating and revoking

To rotate, create a new key, deploy it, then revoke the old one with DELETE /v1/inference/keys/{keyId}. Revocation is immediate. A revoked key can be archived to hide it from the default list, and unarchived later if you need its history back.
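
The revoke step of a rotation could be scripted roughly as follows. This is a sketch under assumptions: the text gives only the path, so the api.upgreat.ai host is assumed to match the key-creation endpoint, and the helper names and ACCESS_TOKEN variable are illustrative. The creation response format is not documented here, so the old key's ID is passed in rather than parsed from a response.

```shell
# Assumption: the DELETE path is served from the same host as key creation.
revoke_url() {
  printf 'https://api.upgreat.ai/v1/inference/keys/%s' "$1"
}

# Revoke a key by ID. Requires ACCESS_TOKEN in the environment.
# --fail makes curl exit non-zero on an HTTP error status.
revoke_key() {
  curl -sS --fail -X DELETE "$(revoke_url "$1")" \
    -H "Authorization: Bearer ${ACCESS_TOKEN}"
}

# Rotation order:
#   1. Create the replacement key and store its secret.
#   2. Deploy the new secret and confirm traffic is using it.
#   3. Revoke the old key:  revoke_key "$OLD_KEY_ID"
```

Because revocation is immediate, running step 3 before step 2 has finished would cut off any client still holding the old key.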

Revoke leaked keys at once

If a key is exposed, revoke it immediately in the Console and issue a replacement. Inspect a key's recent activity under its usage view, or with GET /v1/inference/keys/{keyId}/usage.
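
When investigating a suspected leak, it can help to save the usage report before or right after revoking. The sketch below assumes the usage endpoint lives on api.upgreat.ai and returns JSON. Neither is stated above, and the helper names, output file name, and ACCESS_TOKEN variable are illustrative.

```shell
# Assumption: the usage path is served from api.upgreat.ai.
usage_url() {
  printf 'https://api.upgreat.ai/v1/inference/keys/%s/usage' "$1"
}

# Save a key's usage report to a local file and print the file name.
# Requires ACCESS_TOKEN in the environment.
# Assumption: the response is JSON, hence the .json extension.
save_key_usage() {
  key_id="$1"
  out_file="usage-${key_id}.json"
  curl -sS --fail "$(usage_url "$key_id")" \
    -H "Authorization: Bearer ${ACCESS_TOKEN}" \
    -o "$out_file" && echo "$out_file"
}
```

Running save_key_usage with the exposed key's ID leaves a local copy of its recent activity to review alongside the Console's usage view.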