Configuration Guide
LLM Gateway configuration is centered on credentials, model routing metadata, fallback configuration, observability settings, and tenant isolation. Configure the gateway once, then point applications at the OpenAI-compatible endpoint.
Credential model
Each credential represents one usable provider connection. A credential can include:
- Provider name.
- Display name.
- Encrypted API key or provider-specific secret material.
- Default model.
- Load balancing weight.
- Enabled or disabled state.
- Custom base URL for compatible or self-hosted providers.
- Provider-specific configuration such as Azure deployment settings, Bedrock region, or Vertex project details.
Credentials are scoped to the organization and user context in Axiom Cloud. They are not shared across tenants.
Provider setup checklist
Before adding a provider, decide:
- Which environment the credential serves: development, staging, production, or regional workload.
- Which model names applications will request.
- Whether this credential is the primary route, a weighted peer, or a fallback target.
- Who owns rotation and billing review for the provider account.
- Whether the provider has regional, compliance, or data residency constraints.
OpenAI-compatible providers
For OpenAI-compatible providers, configure:
| Field | Purpose |
|---|---|
| Provider | Select the upstream provider. |
| Name | Human-readable identifier shown in the UI and analytics. |
| API key | Secret used to authenticate with the provider. |
| Default model | Preferred model for routing and load balancing groups. |
| Weight | Traffic share for this provider and model group. |
| Base URL | Optional custom endpoint for compatible or self-hosted APIs. |
Applications call the gateway, not the provider directly:
https://cloud.axiomstudio.ai/rest/v1/llm-gateway/v1/
Azure OpenAI
Azure OpenAI commonly uses deployment names instead of public model names. Configure:
- Azure endpoint, such as
https://your-resource.openai.azure.com. - Deployment name.
- API version.
- Azure OpenAI key.
When applications send requests through the gateway, use the Azure deployment name in the model field unless your Axiom configuration maps it differently.
AWS Bedrock
AWS Bedrock credentials require:
- AWS region.
- Access key ID and secret access key, or runtime role credentials where supported by the deployment.
- Model access enabled in the AWS account.
For production, prefer scoped IAM permissions that allow only the required Bedrock invoke actions for approved models.
Google Vertex AI
Vertex AI setup typically requires:
- Google Cloud project.
- Location or region.
- Service account credentials.
- Model identifiers approved for the workload.
Keep service account permissions scoped to the models and regions your organization intends to use.
Custom and self-hosted providers
Use custom base URLs for OpenAI-compatible providers, internal model gateways, or self-hosted runtimes. Recommended fields:
- Provider name or compatible provider type.
- Base URL.
- API key or bearer token placeholder required by the upstream.
- Default model.
- Timeout settings, if exposed in your deployment.
Validate custom providers with a low-risk test prompt before enabling production routing.
Session and API access
Customer examples use Axiom Cloud session authentication:
curl -X GET "https://cloud.axiomstudio.ai/rest/v1/llm-gateway/credentials" \
-H "Cookie: session=your-session-token"
SDK clients can use the session token as the OpenAI SDK api_key value while the base_url points to Axiom:
client = OpenAI(
base_url="https://cloud.axiomstudio.ai/rest/v1/llm-gateway/v1/",
api_key="your-session-token",
)
Never embed real session tokens or provider keys in source code. Use your application's secret manager.
Metrics token
The metrics endpoint uses a separate bearer token. Operators can generate it from the LLM Gateway settings area or through the metrics token API.
Store the token securely and use it only for scraping:
scrape_configs:
- job_name: axiomcloud-ai-gateway
scheme: https
bearer_token_file: /etc/prometheus/axiomcloud-token
static_configs:
- targets: ["axiomcloud.example.com"]
metrics_path: /metrics/ai-gateway
Configuration review
Review this checklist before production traffic:
- At least one primary credential is enabled.
- Critical workloads have a fallback path.
- Weights add up to the intended traffic split for each provider and model group.
- Provider keys are stored only in Axiom and your secret management process.
- Metrics token is generated and installed in the monitoring stack.
- The first test requests appear in analytics and metrics.