Inference keys
Inference keys authenticate calls to the LLM API. They are created per project and carry their own rate and spend limits.
Creating a key
Under Inference API → Keys, choose a project and create a key. The full secret is shown once at creation — copy it immediately into your secret manager. Afterwards only a masked prefix is visible.
curl -X POST https://api.upgreat.ai/v1/projects/{projectId}/inference/keys \
  -H "Authorization: Bearer <access-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "production-backend",
    "rpm": 600,
    "tpm": 200000,
    "token_limit": 5000000,
    "token_limit_period": "month",
    "allowed_models": ["qwen3.6-27b", "bge-m3"]
  }'
Scoping a key
Each key can be constrained so a leak or a runaway job is contained:
rpm: maximum requests per minute.
tpm: maximum tokens per minute.
token_limit with token_limit_period: a hard token budget per minute, hour, day, week or month.
allowed_models: restrict the key to specific model IDs. Omit to allow every model the project can access.
expires_at: an optional expiry timestamp, after which the key stops working.
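The scoping fields above can be assembled and checked locally before the create call is sent. The following is a minimal Python sketch using only the standard library, not an official client. The helper names are hypothetical. The period strings other than "month" and the ISO 8601 format for expires_at are assumptions that this page does not confirm, and the sketch treats rpm, tpm and token_limit as required although the API may accept them individually.

```python
import json
import urllib.request
from datetime import datetime, timezone

API_BASE = "https://api.upgreat.ai/v1"  # base URL taken from the curl example above

# Assumption: only "month" appears in the example; the other strings are
# inferred from the periods named in the field description.
VALID_PERIODS = {"minute", "hour", "day", "week", "month"}


def build_key_payload(name, rpm, tpm, token_limit, token_limit_period,
                      allowed_models=None, expires_at=None):
    """Build the JSON body for a scoped key, rejecting obviously bad values."""
    if token_limit_period not in VALID_PERIODS:
        raise ValueError(f"unknown token_limit_period: {token_limit_period!r}")
    if min(rpm, tpm, token_limit) <= 0:
        raise ValueError("rpm, tpm and token_limit must be positive")
    payload = {
        "name": name,
        "rpm": rpm,
        "tpm": tpm,
        "token_limit": token_limit,
        "token_limit_period": token_limit_period,
    }
    if allowed_models is not None:
        # Omitting the field allows every model the project can access.
        payload["allowed_models"] = list(allowed_models)
    if expires_at is not None:
        # Assumption: the API accepts an ISO 8601 timestamp in UTC.
        # expires_at must be a timezone-aware datetime.
        payload["expires_at"] = expires_at.astimezone(timezone.utc).isoformat()
    return payload


def build_create_request(project_id, access_token, payload):
    """Return an unsent POST request mirroring the curl example."""
    return urllib.request.Request(
        f"{API_BASE}/projects/{project_id}/inference/keys",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen would send it. Building the request separately keeps the validation testable without network access.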
One key per workload
Give each service or environment its own key. You get clean per-key usage attribution and can revoke one workload without disrupting the others.
Using a key
Send the key to the LLM API as a bearer token. See the LLM quickstart for full examples.
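For services that call the API from code, the request can be sketched in Python with only the standard library. This mirrors the curl example that follows and is an illustration rather than an official client. The function names are hypothetical, and the response is returned as parsed JSON without assuming its schema.

```python
import json
import urllib.request

LLM_URL = "https://llm.upgreat.ai/v1/chat/completions"  # from the curl example


def build_chat_request(inference_key, model, user_message):
    """Return an unsent chat-completions request authenticated with the key."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        LLM_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {inference_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

A service would typically read its own key from its secret manager or an environment variable (for example a hypothetical UPGREAT_INFERENCE_KEY) and call send(build_chat_request(key, "qwen3.6-27b", "Hi")), so that each workload keeps a separate key.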
curl https://llm.upgreat.ai/v1/chat/completions \
  -H "Authorization: Bearer <inference-key>" \
  -H "Content-Type: application/json" \
  -d '{ "model": "qwen3.6-27b", "messages": [{"role": "user", "content": "Hi"}] }'
Rotating and revoking
To rotate, create a new key, deploy it, then revoke the old one with DELETE /v1/inference/keys/{keyId}. Revocation is immediate. A revoked key can be archived to hide it from the default list, and unarchived later if you need its history back.
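The rotation order described above (create, deploy, then revoke) can be sketched as a small Python function. This is an illustration under assumptions: the helper names are hypothetical, the deploy step is a caller-supplied callback that should raise if the rollout fails, and the shape of the create response is not documented here, so it is handed to the callback as is.

```python
import json
import urllib.request

API_BASE = "https://api.upgreat.ai/v1"  # from the examples on this page


def api_call(method, path, access_token, body=None):
    """Send one authenticated request; return parsed JSON, or None if empty."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {access_token}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    req = urllib.request.Request(API_BASE + path, data=data,
                                 headers=headers, method=method)
    with urllib.request.urlopen(req, timeout=30) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def rotate_key(project_id, old_key_id, new_key_spec, access_token,
               deploy, call=api_call):
    """Create a replacement key, deploy it, and only then revoke the old one."""
    created = call("POST", f"/projects/{project_id}/inference/keys",
                   access_token, new_key_spec)
    # The full secret is shown only once, so hand the response to the
    # deployment step immediately. If deploy raises, the old key stays valid.
    deploy(created)
    call("DELETE", f"/inference/keys/{old_key_id}", access_token)
    return created
```

Revoking last means a failed rollout leaves the old key working, at the cost of a short window in which both keys are valid.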
Revoke leaked keys at once
If a key is exposed, revoke it immediately in the Console and issue a replacement. Inspect a key's recent activity under its usage view, or with GET /v1/inference/keys/{keyId}/usage.
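As a rough sketch, the response to a leak can be scripted so that revocation always happens before any investigation. The function names below are hypothetical. The sketch also assumes that the usage endpoint still answers for a revoked key, which this page does not state, and it returns the usage response as parsed JSON without assuming its fields.

```python
import json
import urllib.request

API_BASE = "https://api.upgreat.ai/v1"  # from the examples on this page


def _request(method, path, access_token):
    """Build an unsent authenticated request for the given path."""
    return urllib.request.Request(
        API_BASE + path,
        headers={"Authorization": f"Bearer {access_token}"},
        method=method,
    )


def send(request, timeout=30):
    """Send a request; return parsed JSON, or None for an empty body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def contain_leak(key_id, access_token, transport=send):
    """Revoke the exposed key first, then fetch its usage for review."""
    transport(_request("DELETE", f"/inference/keys/{key_id}", access_token))
    # Assumption: usage remains readable after revocation.
    return transport(_request("GET", f"/inference/keys/{key_id}/usage",
                              access_token))
```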