config_settings | liteLLM

ACTIONS_ID_TOKEN_REQUEST_TOKEN

Token used to request an OIDC ID token in GitHub Actions

ACTIONS_ID_TOKEN_REQUEST_URL

URL for requesting ID token in GitHub Actions

AGENTOPS_ENVIRONMENT

Environment for AgentOps logging integration

AGENTOPS_API_KEY

API Key for AgentOps logging integration

AGENTOPS_SERVICE_NAME

Service Name for AgentOps logging integration

AISPEND_ACCOUNT_ID

Account ID for AI Spend

AISPEND_API_KEY

API Key for AI Spend

AIOHTTP_CONNECTOR_LIMIT

Connection limit for aiohttp connector. When set to 0, no limit is applied. Default is 0

AIOHTTP_CONNECTOR_LIMIT_PER_HOST

Connection limit per host for aiohttp connector. When set to 0, no limit is applied. Default is 0

AIOHTTP_KEEPALIVE_TIMEOUT

Keep-alive timeout for aiohttp connections in seconds. Default is 120

AIOHTTP_SO_KEEPALIVE

Enable TCP SO_KEEPALIVE on aiohttp sockets so idle provider connections are detected and reaped before NAT/load balancers silently drop them. Default is False

AIOHTTP_TCP_KEEPCNT

Number of unacknowledged TCP keepalive probes before the connection is considered dead (applies when AIOHTTP_SO_KEEPALIVE=True). Default is 5

AIOHTTP_TCP_KEEPIDLE

Seconds an aiohttp TCP connection must be idle before keepalive probes are sent (applies when AIOHTTP_SO_KEEPALIVE=True). Default is 60

AIOHTTP_TCP_KEEPINTVL

Seconds between successive aiohttp TCP keepalive probes (applies when AIOHTTP_SO_KEEPALIVE=True). Default is 30
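
As a rough illustration of how the four keepalive settings above interact, the sketch below reads them with their documented defaults, estimates how long a dead peer can go unnoticed, and applies them to a plain Python socket. The helper names and the string-to-boolean parsing are assumptions for illustration, not LiteLLM's actual implementation. The TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT constants exist only on some platforms (e.g. Linux), so each call is guarded.

```python
import os
import socket


def keepalive_settings(env=os.environ):
    """Read the keepalive knobs, using the defaults documented above."""
    return {
        # Assumption: the flag is parsed case-insensitively from "True"/"False".
        "enabled": env.get("AIOHTTP_SO_KEEPALIVE", "False").lower() == "true",
        "idle": int(env.get("AIOHTTP_TCP_KEEPIDLE", "60")),
        "interval": int(env.get("AIOHTTP_TCP_KEEPINTVL", "30")),
        "count": int(env.get("AIOHTTP_TCP_KEEPCNT", "5")),
    }


def dead_peer_detection_seconds(cfg):
    """Idle time plus one probe interval per unacknowledged probe."""
    return cfg["idle"] + cfg["interval"] * cfg["count"]


def apply_keepalive(sock, cfg):
    """Set the socket options when keepalive is enabled (hypothetical helper)."""
    if not cfg["enabled"]:
        return
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The TCP_* constants are platform dependent, so guard each one.
    for name, key in (
        ("TCP_KEEPIDLE", "idle"),
        ("TCP_KEEPINTVL", "interval"),
        ("TCP_KEEPCNT", "count"),
    ):
        if hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), cfg[key])
```

With the defaults (60, 30, 5) the estimate is 60 + 30 × 5 = 210 seconds, which is worth comparing against the idle timeout of any NAT gateway or load balancer in the path.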

AIOHTTP_TRUST_ENV

Flag to enable aiohttp trust environment. When this is set to True, aiohttp will respect HTTP(S)_PROXY env vars. Default is False

AIOHTTP_TTL_DNS_CACHE

DNS cache time-to-live for aiohttp in seconds. Default is 300

AKTO_GUARDRAIL_API_BASE

Base URL for the Akto Guardrail API (e.g. http://localhost:9090). Used by the Akto guardrail integration.

AKTO_API_KEY

API key for authenticating with the Akto Guardrail service.

ALLOWED_EMAIL_DOMAINS

List of email domains allowed for access

APSCHEDULER_COALESCE

Whether to combine multiple pending executions of a job into one. Default is False

APSCHEDULER_MAX_INSTANCES

Maximum number of concurrent instances of each job. Default is 1

APSCHEDULER_MISFIRE_GRACE_TIME

Grace time in seconds for misfired jobs. Default is 1

APSCHEDULER_REPLACE_EXISTING

Whether to replace existing jobs with the same ID. Default is False

ARIZE_API_KEY

API key for Arize platform integration

ARIZE_SPACE_KEY

Space key for Arize platform

ARGILLA_BATCH_SIZE

Batch size for Argilla logging

ARGILLA_API_KEY

API key for Argilla platform

ARGILLA_SAMPLING_RATE

Sampling rate for Argilla logging

ARGILLA_DATASET_NAME

Dataset name for Argilla logging

ARGILLA_BASE_URL

Base URL for Argilla service

ATHINA_API_KEY

API key for Athina service

ATHINA_BASE_URL

Base URL for Athina service (defaults to https://log.athina.ai)

AUTH_STRATEGY

Strategy used for authentication (e.g., OAuth, API key)

AUTO_REDIRECT_UI_LOGIN_TO_SSO

Flag to enable automatic redirect of UI login page to SSO when SSO is configured. Default is false

AUDIO_SPEECH_CHUNK_SIZE

Chunk size for audio speech processing. Default is 1024

ANTHROPIC_API_KEY

API key for Anthropic service. Uses x-api-key header for authentication.

ANTHROPIC_AUTH_TOKEN

Alternative auth token for Anthropic service. Uses Authorization: Bearer header instead of x-api-key. Used as fallback when ANTHROPIC_API_KEY is not set.

ANTHROPIC_API_BASE

Base URL for Anthropic API. Default is https://api.anthropic.com

ANTHROPIC_BASE_URL

Alternative to ANTHROPIC_API_BASE for setting the Anthropic API base URL. Used as fallback when ANTHROPIC_API_BASE is not set.
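
The fallback order described for the four Anthropic variables above can be sketched as follows. This is a minimal illustration, assuming the variables are read straight from the environment; the function names are hypothetical and not part of LiteLLM.

```python
import os


def anthropic_auth_headers(env=os.environ):
    """Prefer x-api-key; fall back to a Bearer token when no API key is set."""
    api_key = env.get("ANTHROPIC_API_KEY")
    if api_key:
        return {"x-api-key": api_key}
    auth_token = env.get("ANTHROPIC_AUTH_TOKEN")
    if auth_token:
        return {"Authorization": f"Bearer {auth_token}"}
    return {}


def anthropic_base_url(env=os.environ):
    """ANTHROPIC_API_BASE wins, then ANTHROPIC_BASE_URL, then the default."""
    return (
        env.get("ANTHROPIC_API_BASE")
        or env.get("ANTHROPIC_BASE_URL")
        or "https://api.anthropic.com"
    )
```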

ANTHROPIC_TOKEN_COUNTING_BETA_VERSION

Beta version header for Anthropic token counting API. Default is token-counting-2024-11-01

AWS_ACCESS_KEY_ID

Access Key ID for AWS services

AWS_BATCH_ROLE_ARN

ARN of the AWS IAM role for batch operations

AWS_DEFAULT_REGION

Default AWS region for service interactions when AWS_REGION is not set

AWS_PROFILE_NAME

AWS CLI profile name to be used

AWS_REGION

AWS region for service interactions (takes precedence over AWS_DEFAULT_REGION)

AWS_REGION_NAME

Default AWS region for service interactions

AWS_ROLE_ARN

ARN of the AWS IAM role to assume for authentication

AWS_ROLE_NAME

Role name for AWS IAM usage

AWS_S3_BUCKET_NAME

Name of the AWS S3 bucket for file operations

AWS_S3_OUTPUT_BUCKET_NAME

Name of the AWS S3 output bucket for batch operations

AWS_SECRET_ACCESS_KEY

Secret Access Key for AWS services

AWS_SESSION_NAME

Name for AWS session

AWS_WEB_IDENTITY_TOKEN

Web identity token for AWS

AWS_WEB_IDENTITY_TOKEN_FILE

Path to file containing web identity token for AWS

AZURE_API_VERSION

Version of the Azure API being used

AZURE_AI_API_BASE

Base URL for Azure AI services (e.g., Azure AI Anthropic)

AZURE_AI_API_KEY

API key for Azure AI services (e.g., Azure AI Anthropic)

AZURE_AUTHORITY_HOST

Azure authority host URL

AZURE_CERTIFICATE_PASSWORD

Password for Azure OpenAI certificate

AZURE_CLIENT_ID

Client ID for Azure services

AZURE_CLIENT_SECRET

Client secret for Azure services

AZURE_COMPUTER_USE_INPUT_COST_PER_1K_TOKENS

Input cost per 1K tokens for Azure Computer Use service

AZURE_COMPUTER_USE_OUTPUT_COST_PER_1K_TOKENS

Output cost per 1K tokens for Azure Computer Use service

AZURE_DEFAULT_RESPONSES_API_VERSION

Version of the Azure Default Responses API being used. Default is "preview"

AZURE_DOCUMENT_INTELLIGENCE_API_VERSION

API version for Azure Document Intelligence service

AZURE_DOCUMENT_INTELLIGENCE_DEFAULT_DPI

Default DPI (dots per inch) setting for Azure Document Intelligence service

AZURE_TENANT_ID

Tenant ID for Azure Active Directory

AZURE_USERNAME

Username for Azure services. Use with AZURE_PASSWORD to obtain an Azure AD token via the basic username/password workflow

AZURE_PASSWORD

Password for Azure services. Use with AZURE_USERNAME to obtain an Azure AD token via the basic username/password workflow

AZURE_FEDERATED_TOKEN_FILE

File path to Azure federated token

AZURE_FILE_SEARCH_COST_PER_GB_PER_DAY

Cost per GB per day for Azure File Search service

AZURE_SCOPE

Scope for Azure services when using Entra ID authentication. Defaults to "https://cognitiveservices.azure.com/.default"

AZURE_SENTINEL_DCR_IMMUTABLE_ID

Immutable ID of the Data Collection Rule for Azure Sentinel logging

AZURE_SENTINEL_STREAM_NAME

Stream name for Azure Sentinel logging

AZURE_SENTINEL_CLIENT_SECRET

Client secret for Azure Sentinel authentication

AZURE_SENTINEL_ENDPOINT

Endpoint for Azure Sentinel logging

AZURE_SENTINEL_TENANT_ID

Tenant ID for Azure Sentinel authentication

AZURE_SENTINEL_CLIENT_ID

Client ID for Azure Sentinel authentication

AZURE_KEY_VAULT_URI

URI for Azure Key Vault

AZURE_OPERATION_POLLING_TIMEOUT

Timeout in seconds for Azure operation polling

AZURE_STORAGE_ACCOUNT_KEY

The Azure Storage Account Key to use for Authentication to Azure Blob Storage logging

AZURE_STORAGE_ACCOUNT_NAME

Name of the Azure Storage Account to use for logging to Azure Blob Storage

AZURE_STORAGE_FILE_SYSTEM

Name of the Azure Storage File System to use for logging to Azure Blob Storage. (Typically the Container name)

AZURE_STORAGE_TENANT_ID

The Application Tenant ID to use for Authentication to Azure Blob Storage logging

AZURE_STORAGE_CLIENT_ID

The Application Client ID to use for Authentication to Azure Blob Storage logging

AZURE_STORAGE_CLIENT_SECRET

The Application Client Secret to use for Authentication to Azure Blob Storage logging

AZURE_VECTOR_STORE_COST_PER_GB_PER_DAY

Cost per GB per day for Azure Vector Store service

BACKGROUND_HEALTH_CHECK_MAX_TOKENS

Optional global default for max_tokens on proxy background health checks when a model has no health_check_max_tokens. If unset, non-wildcard models default to 5. Applies to wildcard routes when set. Default is unset

BACKGROUND_HEALTH_CHECK_MAX_TOKENS_REASONING

For non-wildcard reasoning models (supports_reasoning(model)=true), this takes precedence over BACKGROUND_HEALTH_CHECK_MAX_TOKENS when set. If unset, reasoning models fall back to BACKGROUND_HEALTH_CHECK_MAX_TOKENS (if set) or default behavior. Wildcard routes ignore this. Default is unset
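
The precedence between the per-model setting and the two variables above is easy to misread, so here is one possible reading as a small resolver. The function and its parameters are hypothetical. It assumes a per-model health_check_max_tokens always wins, that "default behavior" for non-wildcard models means the documented default of 5, and that a wildcard route with nothing set sends no max_tokens at all (represented as None).

```python
def resolve_health_check_max_tokens(
    model_value,
    is_wildcard,
    is_reasoning,
    global_default=None,      # BACKGROUND_HEALTH_CHECK_MAX_TOKENS, if set
    reasoning_default=None,   # BACKGROUND_HEALTH_CHECK_MAX_TOKENS_REASONING, if set
):
    """Pick max_tokens for a background health check (illustrative only)."""
    if model_value is not None:
        return model_value  # per-model health_check_max_tokens
    if is_wildcard:
        # Wildcard routes ignore the reasoning override and only use the
        # global value when it is set. None = omit max_tokens (assumption).
        return global_default
    if is_reasoning and reasoning_default is not None:
        return reasoning_default
    if global_default is not None:
        return global_default
    return 5  # documented default for non-wildcard models
```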

BATCH_STATUS_POLL_INTERVAL_SECONDS

Interval in seconds for polling batch status. Default is 3600 (1 hour)

BATCH_STATUS_POLL_MAX_ATTEMPTS

Maximum number of attempts for polling batch status. Default is 24 (for 24 hours)

BEDROCK_MAX_POLICY_SIZE

Maximum size for Bedrock policy. Default is 75

BEDROCK_MIN_THINKING_BUDGET_TOKENS

Minimum thinking budget in tokens for Bedrock reasoning models. Bedrock returns a 400 error if budget_tokens is below this value. Requests with lower values are clamped to this minimum. Default is 1024
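
The clamping described above amounts to taking the larger of the requested budget and the configured minimum. A minimal sketch, with a hypothetical function name:

```python
import os

# Read the floor once, using the documented default of 1024.
BEDROCK_MIN_THINKING_BUDGET_TOKENS = int(
    os.getenv("BEDROCK_MIN_THINKING_BUDGET_TOKENS", "1024")
)


def clamp_thinking_budget(budget_tokens, minimum=BEDROCK_MIN_THINKING_BUDGET_TOKENS):
    """Raise budgets below the minimum instead of letting Bedrock return a 400."""
    return max(budget_tokens, minimum)
```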

BERRISPEND_ACCOUNT_ID

Account ID for BerriSpend service

BRAINTRUST_API_KEY

API key for Braintrust integration

BRAINTRUST_API_BASE

Base URL for Braintrust API. Default is https://api.braintrustdata.com/v1

BRAINTRUST_MOCK

Enable mock mode for Braintrust integration testing. When set to true, intercepts Braintrust API calls and returns mock responses without making actual network calls. Default is false

BRAINTRUST_MOCK_LATENCY_MS

Mock latency in milliseconds for Braintrust API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms

CACHED_STREAMING_CHUNK_DELAY

Delay in seconds for cached streaming chunks. Default is 0.02

CHATGPT_API_BASE

Base URL for ChatGPT API. Default is https://chatgpt.com/backend-api/codex

CHATGPT_AUTH_FILE

Filename for ChatGPT authentication data. Default is "auth.json"

CHATGPT_DEFAULT_INSTRUCTIONS

Default system instructions for ChatGPT provider

CHATGPT_ORIGINATOR

Originator identifier for ChatGPT API requests. Default is "codex_cli_rs"

CHATGPT_TOKEN_DIR

Directory to store ChatGPT authentication tokens. Default is "~/.config/litellm/chatgpt"

CHATGPT_USER_AGENT

Custom user agent string for ChatGPT API requests

CHATGPT_USER_AGENT_SUFFIX

Suffix to append to the ChatGPT user agent string

CIRCLE_OIDC_TOKEN

OpenID Connect token for CircleCI

CIRCLE_OIDC_TOKEN_V2

Version 2 of the OpenID Connect token for CircleCI

CLI_JWT_EXPIRATION_HOURS

Expiration time in hours for CLI-generated JWT tokens. Default is 24 hours. Can also be set via LITELLM_CLI_JWT_EXPIRATION_HOURS

CLI_SSO_CLAIM_MAP

Comma-separated allowlist mapping OIDC claim paths to LiteLLM user metadata keys for CLI SSO (e.g. employment_type->acme_employment_type,org_info.department->department). Scalar values are also returned in /sso/cli/poll as attribution_metadata. Alias: LITELLM_CLI_SSO_CLAIM_MAP
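
To make the claim-map format above concrete, the sketch below parses a value such as the one in the example and applies it to a decoded set of OIDC claims. The function names are hypothetical, and the handling of malformed entries and missing claims (both skipped) is an assumption rather than documented behavior.

```python
def parse_claim_map(raw):
    """Parse 'claim.path->metadata_key,...' into {claim_path: metadata_key}."""
    mapping = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if "->" not in entry:
            continue  # skip empty or malformed entries (assumption)
        claim_path, metadata_key = (part.strip() for part in entry.split("->", 1))
        if claim_path and metadata_key:
            mapping[claim_path] = metadata_key
    return mapping


def lookup_claim(claims, dotted_path):
    """Follow a dotted path such as 'org_info.department' through nested dicts."""
    current = claims
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def map_claims(claims, raw_map):
    """Return only the allowlisted claims, renamed to their metadata keys."""
    result = {}
    for claim_path, metadata_key in parse_claim_map(raw_map).items():
        value = lookup_claim(claims, claim_path)
        if value is not None:
            result[metadata_key] = value
    return result
```

Because the map is an allowlist, claims that are not named in it never reach the user metadata.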

CLOUDZERO_API_KEY

CloudZero API key for authentication

CLOUDZERO_CONNECTION_ID

CloudZero connection ID for data submission

CLOUDZERO_EXPORT_INTERVAL_MINUTES

Interval in minutes for CloudZero data export operations

CLOUDZERO_MAX_FETCHED_DATA_RECORDS

Maximum number of data records to fetch from CloudZero

CLOUDZERO_TIMEZONE

Timezone for date handling. Default is UTC

CONFIG_FILE_PATH

File path for configuration file

CYBERARK_ACCOUNT

CyberArk account name for secret management

CYBERARK_API_BASE

Base URL for CyberArk API

CYBERARK_API_KEY

API key for CyberArk secret management service

CYBERARK_CLIENT_CERT

Path to client certificate for CyberArk authentication

CYBERARK_CLIENT_KEY

Path to client key for CyberArk authentication

CYBERARK_USERNAME

Username for CyberArk authentication

CYBERARK_SSL_VERIFY

Flag to enable or disable SSL certificate verification for CyberArk. Default is True

CONFIDENT_API_KEY

API key for Confident AI (DeepEval) logging integration

CUSTOM_TIKTOKEN_CACHE_DIR

Custom directory for Tiktoken cache

COHERE_API_BASE

Base URL for Cohere API. Default is https://api.cohere.com

COMPETITOR_LLM_TEMPERATURE

Temperature setting for the LLM used in competitor discovery. Default is 0.3

CURSOR_API_BASE

API base URL for Cursor AI provider integration. Default is https://api.cursor.com

DATABASE_HOST

Hostname for the database server

DATABASE_HOST_READ_REPLICA

Hostname for the read-replica database server. Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True to assemble DATABASE_URL_READ_REPLICA from RDS IAM env vars

DATABASE_NAME

Name of the database

DATABASE_NAME_READ_REPLICA

Database name for the read replica (defaults to DATABASE_NAME). Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True

DATABASE_PASSWORD

Password for the database user

DATABASE_PORT

Port number for database connection

DATABASE_PORT_READ_REPLICA

Port number for the read replica (default 5432). Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True

DATABASE_SCHEMA

Schema name used in the database

DATABASE_SCHEMA_READ_REPLICA

Schema name for the read replica (defaults to DATABASE_SCHEMA). Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True

DATABASE_URL

Connection URL for the database

DATABASE_URL_READ_REPLICA

Optional read-replica connection URL. When set, the proxy routes read-only queries (find_*, count, group_by, query_raw/_first) to this endpoint while writes continue to use DATABASE_URL. Useful for Aurora-style clusters with separate reader/writer endpoints. Falls back to writer-only behavior when unset. With IAM_TOKEN_DB_AUTH=True, the reader IAM token is auto-refreshed alongside the writer
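
The read/write split described above can be pictured as a simple dispatch on the query method name. This is a sketch under stated assumptions: the function is hypothetical, and "query_raw/_first" is read here as the two methods query_raw and query_first.

```python
# Methods treated as read-only, following the list in the description above.
READ_ONLY_METHODS = {"count", "group_by", "query_raw", "query_first"}


def pick_database_url(method_name, writer_url, reader_url=None):
    """Send read-only queries to the replica when one is configured."""
    is_read = method_name.startswith("find_") or method_name in READ_ONLY_METHODS
    if is_read and reader_url:
        return reader_url
    # Writes, and everything when no replica is set, go to DATABASE_URL.
    return writer_url
```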

DATABASE_USER

Username for database connection

DATABASE_USER_READ_REPLICA

Database user for the read replica (defaults to DATABASE_USER). Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True

DATABASE_USERNAME

Alias for database user

DATABRICKS_API_BASE

Base URL for Databricks API

DATABRICKS_API_KEY

API key (Personal Access Token) for Databricks API authentication

DATABRICKS_CLIENT_ID

Client ID for Databricks OAuth M2M authentication (Service Principal application ID)

DATABRICKS_CLIENT_SECRET

Client secret for Databricks OAuth M2M authentication

DATABRICKS_USER_AGENT

Custom user agent string for Databricks API requests. Used for partner telemetry attribution

DAYS_IN_A_MONTH

Days in a month for calculation purposes. Default is 28

DAYS_IN_A_WEEK

Days in a week for calculation purposes. Default is 7

DAYS_IN_A_YEAR

Days in a year for calculation purposes. Default is 365

DRAIN_ENDPOINT_TOKEN

Shared secret required on the X-Drain-Token header to call the /health/drain endpoint. When set (here or via general_settings.drain_endpoint_token), drain calls without the matching token are rejected with 401; when unset the endpoint keeps its opt-in-only behavior. Have the kubelet send it from the preStop httpGet.httpHeaders.
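
A minimal sketch of the token check described above, assuming a constant-time comparison is used (that detail is an assumption, as is the function name). It only models the token gate: when no token is configured the check is skipped and whatever opt-in gating the endpoint already has still applies.

```python
import hmac


def drain_token_ok(configured_token, header_token):
    """Return False when a token is configured and X-Drain-Token does not match."""
    if not configured_token:
        return True  # no token configured: this check is not enforced
    if header_token is None:
        return False  # caller should answer 401
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(
        configured_token.encode("utf-8"), header_token.encode("utf-8")
    )
```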

DYNAMOAI_API_KEY

API key for DynamoAI Guardrails service

DYNAMOAI_API_BASE

Base URL for DynamoAI API. Default is https://api.dynamo.ai

DYNAMOAI_MODEL_ID

Model ID for DynamoAI tracking/logging purposes

DYNAMOAI_POLICY_IDS

Comma-separated list of DynamoAI policy IDs to apply

DD_BASE_URL

Base URL for Datadog integration

DATADOG_BASE_URL

(Alternative to DD_BASE_URL) Base URL for Datadog integration

_DATADOG_BASE_URL

(Alternative to DD_BASE_URL) Base URL for Datadog integration

DD_AGENT_HOST

Hostname or IP of Datadog agent (e.g., "localhost"). When set, logs are sent to agent instead of direct API

DD_AGENT_PORT

Port of Datadog agent for log intake. Default is 10518

DD_API_KEY

API key for Datadog integration

DD_APP_KEY

Application key for Datadog Cost Management integration. Required along with DD_API_KEY for cost metrics

DD_BATCH_SIZE

Number of log events buffered before flushing to Datadog. Clamped to [1, 1000]; defaults to 1000. Lower it (e.g. 50) if batches exceed Datadog's 5MB request limit

DD_SITE

Site URL for Datadog (e.g., datadoghq.com)

DD_SOURCE

Source identifier for Datadog logs

DD_TRACER_STREAMING_CHUNK_YIELD_RESOURCE

Resource name for Datadog tracing of streaming chunk yields. Default is "streaming.chunk.yield"

DD_ENV

Environment identifier for Datadog logs. Only supported for datadog_llm_observability callback

DD_LLMOBS_ML_APP

Default ml_app name for Datadog LLM Observability (Application column). Falls back to DD_SERVICE. Can be overridden per-request via metadata.ml_app.

DD_SERVICE

Service identifier for Datadog logs. Defaults to "litellm-server"

DD_VERSION

Version identifier for Datadog logs. Defaults to "unknown"

DATADOG_MOCK

Enable mock mode for Datadog integration testing. When set to true, intercepts Datadog API calls and returns mock responses without making actual network calls. Default is false

DATADOG_MOCK_LATENCY_MS

Mock latency in milliseconds for Datadog API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms

DEBUG_OTEL

Enable debug mode for OpenTelemetry

DEFAULT_ALLOWED_FAILS

Maximum failures allowed before cooling down a model. Default is 3

DEFAULT_A2A_AGENT_TIMEOUT

Default timeout in seconds for A2A (Agent-to-Agent) protocol requests. Default is 6000

DEFAULT_ACCESS_GROUP_CACHE_TTL

Time-to-live in seconds for cached access group information. Default is 600 (10 minutes)

DEFAULT_ANTHROPIC_CHAT_MAX_TOKENS

Default maximum tokens for Anthropic chat completions. Default is 4096

DEFAULT_BATCH_SIZE

Default batch size for operations. Default is 512

DEFAULT_CHUNK_OVERLAP

Default chunk overlap for RAG text splitters. Default is 200

DEFAULT_CHUNK_SIZE

Default chunk size for RAG text splitters. Default is 1000
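
To show how the chunk size and overlap defaults above relate, here is a simple character-based splitter. It is an illustration only: the function is hypothetical, and real RAG text splitters may count tokens or split on separators instead of raw characters.

```python
def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into chunks where each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # 800 characters of new text per chunk
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

With the defaults, a 2000-character input yields three chunks starting at offsets 0, 800, and 1600.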

DEFAULT_CLIENT_DISCONNECT_CHECK_TIMEOUT_SECONDS

Timeout in seconds for checking client disconnection. Default is 1

DEFAULT_COOLDOWN_TIME_SECONDS

Duration in seconds to cooldown a model after failures. Default is 5

DEFAULT_CRON_JOB_LOCK_TTL_SECONDS

Time-to-live for cron job locks in seconds. Default is 60 (1 minute)

DEFAULT_DATAFORSEO_LOCATION_CODE

Default location code for DataForSEO search API. Default is 2250 (France)

DEFAULT_FAILURE_THRESHOLD_PERCENT

Threshold percentage of failures to cool down a deployment. Default is 0.5 (50%)

DEFAULT_FAILURE_THRESHOLD_MINIMUM_REQUESTS

Minimum number of requests before applying error rate cooldown. Prevents cooldown from triggering on first failure. Default is 5
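
The two failure-threshold settings above work together: the error rate is only considered once enough requests have been seen. A minimal sketch, with a hypothetical function name and with the boundary behavior (strictly greater than the threshold, at least the minimum number of requests) chosen as an assumption:

```python
def should_cooldown(failures, total_requests, threshold=0.5, minimum_requests=5):
    """Decide whether a deployment's error rate warrants a cooldown."""
    if total_requests < minimum_requests:
        return False  # too few requests to trust the error rate
    return failures / total_requests > threshold
```

A single failed first request (1 of 1) therefore does not trigger a cooldown, while 3 failures out of 5 requests (60%) does.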

DEFAULT_FLUSH_INTERVAL_SECONDS

Default interval in seconds for flushing operations. Default is 5

DEFAULT_HEALTH_CHECK_INTERVAL

Default interval in seconds for health checks. Default is 300 (5 minutes)

DEFAULT_HEALTH_CHECK_PROMPT

Default prompt used during health checks for non-image models. Default is "test from litellm"

DEFAULT_IMAGE_HEIGHT

Default height for images. Default is 300

DEFAULT_IMAGE_TOKEN_COUNT

Default token count for images. Default is 250

DEFAULT_IMAGE_WIDTH

Default width for images. Default is 300

DEFAULT_IN_MEMORY_TTL

Default time-to-live for in-memory cache in seconds. Default is 5

DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL

Default time-to-live in seconds for management objects (User, Team, Key, Organization) in memory cache. Default is 60 seconds.

DEFAULT_MAX_LRU_CACHE_SIZE

Default maximum size for LRU cache. Default is 64

DEFAULT_MAX_RECURSE_DEPTH

Default maximum recursion depth. Default is 100

DEFAULT_MAX_RECURSE_DEPTH_SENSITIVE_DATA_MASKER

Default maximum recursion depth for sensitive data masker. Default is 10

DEFAULT_MAX_RETRIES

Default maximum retry attempts. Default is 2

DEFAULT_MAX_TOKENS

Default maximum tokens for LLM calls. Default is 4096

DEFAULT_MAX_TOKENS_FOR_TRITON

Default maximum tokens for Triton models. Default is 2000

DEFAULT_MAX_REDIS_BATCH_CACHE_SIZE

Default maximum size for redis batch cache. Default is 1000

DEFAULT_MCP_SEMANTIC_FILTER_EMBEDDING_MODEL

Default embedding model for MCP semantic tool filtering. Default is "text-embedding-3-small"

DEFAULT_MCP_SEMANTIC_FILTER_SIMILARITY_THRESHOLD

Default similarity threshold for MCP semantic tool filtering. Default is 0.3

DEFAULT_MCP_SEMANTIC_FILTER_TOP_K

Default number of top results to return for MCP semantic tool filtering. Default is 10

MCP_NPM_CACHE_DIR

Directory for npm cache used by STDIO MCP servers. In containers the default (~/.npm) may not exist or be read-only. Default is /tmp/.npm_mcp_cache

LITELLM_MCP_CLIENT_TIMEOUT

MCP client connection timeout in seconds (stdio and HTTP/SSE transports). Default is 60

LITELLM_MCP_TOOL_LISTING_TIMEOUT

Timeout in seconds for listing tools from an MCP server. Default is 30

LITELLM_MCP_METADATA_TIMEOUT

HTTP client timeout in seconds for OAuth metadata fetching. Default is 10

LITELLM_MCP_HEALTH_CHECK_TIMEOUT

Health check timeout in seconds for MCP servers. Default is 10

LITELLM_MCP_STDIO_EXTRA_COMMANDS

Comma-separated extra command basenames allowed for MCP stdio transport beyond the built-in allowlist. Example: my-mcp-bin. Empty by default

MCP_OAUTH2_TOKEN_CACHE_DEFAULT_TTL

Default TTL in seconds for MCP OAuth2 token cache. Default is 3600

MCP_OAUTH2_TOKEN_CACHE_MAX_SIZE

Maximum number of entries in MCP OAuth2 token cache. Default is 200

MCP_OAUTH2_TOKEN_CACHE_MIN_TTL

Minimum TTL in seconds for MCP OAuth2 token cache. Default is 10

MCP_OAUTH2_TOKEN_EXPIRY_BUFFER_SECONDS

Seconds to subtract from token expiry when computing cache TTL. Default is 60
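
One plausible way the default TTL, minimum TTL, and expiry buffer above combine is sketched below. The formula is an assumption made for illustration (use the default when the token carries no expiry, otherwise subtract the buffer and never go below the minimum), and the function name is hypothetical.

```python
def token_cache_ttl(expires_in=None, default_ttl=3600, min_ttl=10, expiry_buffer=60):
    """Compute how long to cache an OAuth2 token, in seconds."""
    if expires_in is None:
        return default_ttl  # token response carried no expiry
    # Expire the cache entry slightly before the token itself expires.
    return max(min_ttl, expires_in - expiry_buffer)
```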

MCP_PER_USER_TOKEN_DEFAULT_TTL

Default TTL in seconds for per-user MCP OAuth tokens stored in Redis. Default is 43200 (12 hours)

MCP_PER_USER_TOKEN_EXPIRY_BUFFER_SECONDS

Seconds to subtract from per-user MCP OAuth token expiry when computing Redis TTL. Default is 60

MCP_TOKEN_EXCHANGE_CACHE_MAX_SIZE

Maximum number of entries in the MCP OAuth2 token exchange cache. Default is 500

MCP_TRUSTED_REDIRECT_ORIGINS

Comma-separated allowlist of additional redirect_uri origins accepted by the MCP OAuth authorize endpoint, beyond same-origin and loopback. Each entry is host or host:port; a *.suffix prefix matches any strictly-deeper subdomain. HTTPS only. Use this for first-party OAuth clients on sister domains (e.g. app.example.com). For ingressed deployments where the proxy's own origin is wrong, set PROXY_BASE_URL instead. See MCP OAuth — Reverse proxy and ingress configuration.
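
The matching rules above (host or host:port entries, a *. prefix for strictly deeper subdomains, HTTPS only) can be sketched as follows. The function is hypothetical, it does not handle IPv6 literals, and one detail is an assumption: an entry without a port is treated here as matching only the default HTTPS port 443.

```python
from urllib.parse import urlsplit


def is_trusted_redirect(redirect_uri, allowlist_raw):
    """Check a redirect_uri against a comma-separated origin allowlist."""
    parts = urlsplit(redirect_uri)
    if parts.scheme != "https" or not parts.hostname:
        return False  # HTTPS only
    try:
        port = parts.port or 443
    except ValueError:
        return False  # malformed port in the URI
    host = parts.hostname.lower()
    for entry in allowlist_raw.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue
        entry_host, sep, entry_port = entry.partition(":")
        if str(port) != (entry_port if sep else "443"):
            continue
        if entry_host.startswith("*."):
            suffix = entry_host[1:]  # ".example.com"
            # Strictly deeper: the bare suffix domain itself does not match.
            if host.endswith(suffix) and len(host) > len(suffix):
                return True
        elif host == entry_host:
            return True
    return False
```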

DEFAULT_MOCK_RESPONSE_COMPLETION_TOKEN_COUNT

Default token count for mock response completions. Default is 20

DEFAULT_MOCK_RESPONSE_PROMPT_TOKEN_COUNT

Default token count for mock response prompts. Default is 10

DEFAULT_MODEL_CREATED_AT_TIME

Default creation timestamp for models. Default is 1677610602

DEFAULT_NUM_WORKERS_LITELLM_PROXY

Default number of workers for LiteLLM proxy when NUM_WORKERS is not set. Default is 1. We strongly recommend setting NUM_WORKERS to the number of vCPUs available (e.g. NUM_WORKERS=8 or --num_workers 8).

DEFAULT_PROMPT_INJECTION_SIMILARITY_THRESHOLD

Default threshold for prompt injection similarity. Default is 0.7

DEFAULT_POLLING_INTERVAL

Default polling interval for schedulers in seconds. Default is 0.03

DEFAULT_REASONING_EFFORT_DISABLE_THINKING_BUDGET

Default reasoning effort disable thinking budget. Default is 0

DEFAULT_REASONING_EFFORT_HIGH_THINKING_BUDGET

Default high reasoning effort thinking budget. Default is 4096

DEFAULT_REASONING_EFFORT_LOW_THINKING_BUDGET

Default low reasoning effort thinking budget. Default is 1024

DEFAULT_REASONING_EFFORT_MAX_THINKING_BUDGET

Default max reasoning effort thinking budget for legacy Anthropic models that use thinking.budget_tokens (Claude 4.5 series + Haiku). On Claude 4.6/4.7 the max tier is routed via adaptive output_config.effort=max instead and ignores this constant. Default is 16384

DEFAULT_REASONING_EFFORT_MEDIUM_THINKING_BUDGET

Default medium reasoning effort thinking budget. Default is 2048

DEFAULT_REASONING_EFFORT_MINIMAL_THINKING_BUDGET

Default minimal reasoning effort thinking budget. Default is 512

DEFAULT_REASONING_EFFORT_MINIMAL_THINKING_BUDGET_GEMINI_2_5_FLASH

Default minimal reasoning effort thinking budget for Gemini 2.5 Flash. Default is 512

DEFAULT_REASONING_EFFORT_MINIMAL_THINKING_BUDGET_GEMINI_2_5_FLASH_LITE

Default minimal reasoning effort thinking budget for Gemini 2.5 Flash Lite. Default is 512

DEFAULT_REASONING_EFFORT_MINIMAL_THINKING_BUDGET_GEMINI_2_5_PRO

Default minimal reasoning effort thinking budget for Gemini 2.5 Pro. Default is 512

DEFAULT_REASONING_EFFORT_XHIGH_THINKING_BUDGET

Default xhigh reasoning effort thinking budget for legacy Anthropic models that use thinking.budget_tokens. Continues the 2× progression 1024 → 2048 → 4096 → 8192 from low/medium/high. On Claude 4.6/4.7 the xhigh tier is routed via adaptive output_config.effort=xhigh instead and ignores this constant. Default is 8192
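
Taken together, the general reasoning-effort budgets above form a simple tier table. The sketch below collects the documented defaults and lets an environment variable of the matching name override each one. The lookup function is hypothetical, and the Gemini-specific minimal budgets are left out for brevity.

```python
import os

# Documented defaults for the generic thinking-budget tiers.
DEFAULT_THINKING_BUDGETS = {
    "disable": 0,
    "minimal": 512,
    "low": 1024,
    "medium": 2048,
    "high": 4096,
    "xhigh": 8192,
    "max": 16384,
}


def thinking_budget(effort, env=os.environ):
    """Return the thinking budget for a reasoning-effort tier."""
    default = DEFAULT_THINKING_BUDGETS[effort]  # KeyError for unknown tiers
    name = f"DEFAULT_REASONING_EFFORT_{effort.upper()}_THINKING_BUDGET"
    return int(env.get(name, default))
```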

DEFAULT_REDIS_MAJOR_VERSION

Default Redis major version to assume when version cannot be determined. Default is 7

DEFAULT_REDIS_SYNC_INTERVAL

Default Redis synchronization interval in seconds. Default is 1

DEFAULT_SEMANTIC_GUARD_EMBEDDING_MODEL

Default embedding model for Semantic Guard (route-matching guardrail). Default is "text-embedding-3-small"

DEFAULT_SEMANTIC_GUARD_SIMILARITY_THRESHOLD

Default similarity threshold for Semantic Guard route matching. Default is 0.75

DEFAULT_REPLICATE_GPU_PRICE_PER_SECOND

Default price per second for Replicate GPU. Default is 0.001400

DEFAULT_REPLICATE_POLLING_DELAY_SECONDS

Default delay in seconds for Replicate polling. Default is 1

DEFAULT_REPLICATE_POLLING_RETRIES

Default number of retries for Replicate polling. Default is 5

DEFAULT_SQS_BATCH_SIZE

Default batch size for SQS logging. Default is 512

DEFAULT_SQS_FLUSH_INTERVAL_SECONDS

Default flush interval for SQS logging. Default is 10

DEFAULT_S3_BATCH_SIZE

Default batch size for S3 logging. Default is 512

DEFAULT_S3_FLUSH_INTERVAL_SECONDS

Default flush interval for S3 logging. Default is 10

DEFAULT_SLACK_ALERTING_THRESHOLD

Default threshold for Slack alerting. Default is 300

DEFAULT_SOFT_BUDGET

Default soft budget for LiteLLM proxy keys. Default is 50.0

DEFAULT_TRIM_RATIO

Default ratio of tokens to trim from prompt end. Default is 0.75

DEFAULT_GOOGLE_VIDEO_DURATION_SECONDS

Default duration in seconds for Google video generation. Default is 8

DIRECT_URL

Direct URL for service endpoint

DISABLE_ADMIN_UI

Toggle to disable the admin UI

LITELLM_HIDE_DEFAULT_CREDENTIALS_HINT

Flag to hide the "Default Credentials" info card on the admin UI login page (/ui/login and /fallback/login). Useful when UI credentials are managed via UI_USERNAME / UI_PASSWORD or SSO and the hardcoded hint about admin + MASTER_KEY becomes misleading or is flagged by security scanners. Default is false

DISABLE_AIOHTTP_TRANSPORT

Flag to disable aiohttp transport. When this is set to True, litellm will use httpx instead of aiohttp. Default is False

DISABLE_AIOHTTP_TRUST_ENV

Flag to disable aiohttp trust environment. When set to True, litellm will not trust the environment for aiohttp, e.g. the HTTP_PROXY and HTTPS_PROXY environment variables will not be used. Default is False

DISABLE_SCHEMA_UPDATE

Toggle to disable schema updates

DYNAMIC_RATE_LIMIT_ERROR_THRESHOLD_PER_MINUTE

Threshold for deployment failures per minute before enforcing rate limits in parallel request limiter. Default is 1

DOCS_DESCRIPTION

Description text for documentation pages

DOCS_FILTERED

Flag indicating filtered documentation

DOCS_TITLE

Title of the documentation pages

DOCS_URL

The path to the Swagger API documentation. By default this is "/"

EMAIL_LOGO_URL

URL for the logo used in emails

EMAIL_SUPPORT_CONTACT

Support contact email address

EMAIL_SIGNATURE

Custom HTML footer/signature for all emails. Can include HTML tags for formatting and links.

EMAIL_SUBJECT_INVITATION

Custom subject template for invitation emails.

EMAIL_SUBJECT_KEY_CREATED

Custom subject template for key creation emails.

EMAIL_BUDGET_ALERT_MAX_SPEND_ALERT_PERCENTAGE

Percentage of max budget that triggers alerts (as decimal: 0.8 = 80%). Default is 0.8

EMAIL_BUDGET_ALERT_TTL

Time-to-live for budget alert deduplication in seconds. Default is 86400 (24 hours)

ENKRYPTAI_API_BASE

Base URL for EnkryptAI Guardrails API. Default is https://api.enkryptai.com

ENKRYPTAI_API_KEY

API key for EnkryptAI Guardrails service

FAROS_API_KEY

API key for sending LLM usage data to Faros AI

FAROS_API_URL

Base URL for the Faros AI API. Default is https://prod.api.faros.ai

FAROS_GRAPH

Faros graph that LiteLLM usage data is written to. Default is "default"

FAROS_ORIGIN

Origin recorded on rows written to Faros by LiteLLM. Default is "litellm"

FAROS_TOOL_CATEGORY

Tool category recorded on Faros vcs_UserTool rows. Default is "LiteLLM"

FAROS_USER_SOURCE

Source recorded on Faros vcs_User rows for LiteLLM users. Default is "LiteLLM"

FIREWORKS_AI_4_B

Size parameter for Fireworks AI 4B model. Default is 4

FIREWORKS_AI_16_B

Size parameter for Fireworks AI 16B model. Default is 16

FIREWORKS_AI_56_B_MOE

Size parameter for Fireworks AI 56B MOE model. Default is 56

FIREWORKS_AI_80_B

Size parameter for Fireworks AI 80B model. Default is 80

FIREWORKS_AI_176_B_MOE

Size parameter for Fireworks AI 176B MOE model. Default is 176

FOCUS_PROVIDER

Destination provider for Focus exports (e.g., s3). Defaults to s3.

FOCUS_FORMAT

Output format for Focus exports. Defaults to parquet.

FOCUS_FREQUENCY

Frequency for scheduled Focus exports (hourly, daily, or interval). Defaults to hourly.

FOCUS_CRON_OFFSET

Minute offset used when scheduling hourly/daily Focus exports. Defaults to 5 minutes.

FOCUS_INTERVAL_SECONDS

Interval (in seconds) for Focus exports when frequency is interval.

FOCUS_PREFIX

Object key prefix (or folder) used when uploading Focus export files. Defaults to focus_exports.

FOCUS_S3_BUCKET_NAME

S3 bucket to upload Focus export files when using the S3 destination.

FOCUS_S3_REGION_NAME

AWS region for the Focus export S3 bucket.

FOCUS_S3_ENDPOINT_URL

Custom endpoint for the Focus export S3 client (optional; useful for S3-compatible storage).

FOCUS_S3_ACCESS_KEY

AWS access key ID used by the Focus export S3 client.

FOCUS_S3_SECRET_KEY

AWS secret access key used by the Focus export S3 client.

FOCUS_S3_SESSION_TOKEN

AWS session token used by the Focus export S3 client (optional).

MAVVRIK_API_KEY

API key for the Mavvrik FOCUS export integration.

MAVVRIK_API_ENDPOINT

Tenant API endpoint for the Mavvrik FOCUS export, e.g. https://api.mavvrik.ai/<tenant_id>.

MAVVRIK_CONNECTION_ID

AI cost connection ID for the Mavvrik FOCUS export.

MAVVRIK_FOCUS_MAX_ROWS

Maximum rows per export window for the Mavvrik FOCUS destination. Default is 500000.

FOCUS_GCS_BUCKET_NAME

GCS bucket to upload Focus export files when using the GCS destination.

FOCUS_GCS_PATH_SERVICE_ACCOUNT

Path to a service account JSON key file for the Focus export GCS client. Falls back to Application Default Credentials if unset.

FUNCTION_DEFINITION_TOKEN_COUNT

Token count for function definitions. Default is 9

GALILEO_API_KEY

API key for Galileo Cloud (hosted). Used with the v2 spans API when success_callback includes galileo.

GALILEO_BASE_URL

Base URL for Galileo platform. For Galileo Cloud, use https://api.galileo.ai. For enterprise/self-hosted, replace console with api in your console URL.

GALILEO_LOG_STREAM_ID

Log stream ID for Galileo Cloud v2 spans logging (optional).

GALILEO_PASSWORD

Password for Galileo enterprise Observe authentication

GALILEO_PROJECT_ID

Project ID for Galileo usage

GALILEO_USERNAME

Username for Galileo enterprise Observe authentication

GOOGLE_SECRET_MANAGER_PROJECT_ID

Project ID for Google Secret Manager

GRACEFUL_SHUTDOWN_TIMEOUT

Seconds the proxy waits for in-flight requests to drain on shutdown (SIGTERM or the /health/drain preStop hook) before proceeding with teardown. Default is 30

GCS_BUCKET_NAME

Name of the Google Cloud Storage bucket

GCS_MOCK

Enable mock mode for GCS integration testing. When set to true, intercepts GCS API calls and returns mock responses without making actual network calls. Default is false

GCS_MOCK_LATENCY_MS

Mock latency in milliseconds for GCS API calls when mock mode is enabled. Simulates network round-trip time. Default is 150ms

GCS_PATH_SERVICE_ACCOUNT

Path to the Google Cloud service account JSON file

GCS_FLUSH_INTERVAL

Flush interval for GCS logging (in seconds). Specify how often you want a log to be sent to GCS. Default is 20 seconds

GCS_BATCH_SIZE

Batch size for GCS logging. Specifies how many logs accumulate before they are flushed to GCS. If GCS_BATCH_SIZE is set to 10, a flush happens after every 10 logs. Default is 2048
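
The interaction between GCS_FLUSH_INTERVAL and GCS_BATCH_SIZE can be pictured with a small sketch. This is not LiteLLM's implementation: the `BatchBuffer` class, its method names, and the `buffer_from_env` helper are hypothetical, and the only assumption is that a flush is triggered by whichever limit (size or elapsed time) is reached first.

```python
import os
import time


class BatchBuffer:
    """Hypothetical buffer that flushes on batch size or elapsed time."""

    def __init__(self, batch_size, flush_interval, clock=time.monotonic):
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        self.clock = clock
        self.items = []
        self.last_flush = clock()
        self.flushed_batches = []  # stands in for uploads to GCS

    def add(self, item):
        self.items.append(item)
        if self.should_flush():
            self.flush()

    def should_flush(self):
        if not self.items:
            return False
        size_hit = len(self.items) >= self.batch_size
        time_hit = self.clock() - self.last_flush >= self.flush_interval
        return size_hit or time_hit

    def flush(self):
        # A real logger would upload here; the sketch only records the batch.
        self.flushed_batches.append(list(self.items))
        self.items.clear()
        self.last_flush = self.clock()


def buffer_from_env(clock=time.monotonic):
    # Defaults mirror the documented values: 2048 logs, 20 seconds.
    batch_size = int(os.environ.get("GCS_BATCH_SIZE", "2048"))
    flush_interval = float(os.environ.get("GCS_FLUSH_INTERVAL", "20"))
    return BatchBuffer(batch_size, flush_interval, clock)
```

With a small batch size the size trigger dominates; with a large one, low-traffic deployments are flushed mainly by the interval.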

GCS_USE_BATCHED_LOGGING

Enable batched logging for GCS. When enabled (default), multiple log payloads are combined into single GCS object uploads (NDJSON format), dramatically reducing API calls. When disabled, sends each log individually as separate GCS objects (legacy behavior). Default is true

GCS_PUBSUB_TOPIC_ID

PubSub Topic ID to send LiteLLM SpendLogs to.

GCS_PUBSUB_PROJECT_ID

PubSub Project ID to send LiteLLM SpendLogs to.

GENERIC_AUTHORIZATION_ENDPOINT

Authorization endpoint for generic OAuth providers

GENERIC_CLIENT_ID

Client ID for generic OAuth providers

GENERIC_CLIENT_SECRET

Client secret for generic OAuth providers

GENERIC_CLIENT_STATE

State parameter for generic client authentication

GENERIC_CLIENT_USE_PKCE

Enable PKCE (Proof Key for Code Exchange) for generic OAuth providers. Set to "true" when your OAuth provider requires PKCE. Default is false

GENERIC_SSO_HEADERS

Comma-separated list of additional headers to add to the request - e.g. Authorization=Bearer <token>, Content-Type=application/json, etc.
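
As an illustration of the `name=value` format, the sketch below parses such a string into a dictionary. The function name `parse_header_list` is hypothetical and the parsing rules (split on commas, then on the first `=`) are an assumption about the format, not LiteLLM's actual parser.

```python
def parse_header_list(raw):
    """Parse 'Name=value, Other=value' into a dict (illustrative only)."""
    headers = {}
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        name, sep, value = part.partition("=")
        if not sep:
            continue  # skip entries that are not in name=value form
        headers[name.strip()] = value.strip()
    return headers


print(parse_header_list("Authorization=Bearer abc123, Content-Type=application/json"))
```

Under this splitting rule a header value that itself contains a comma cannot be expressed.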

GENERIC_INCLUDE_CLIENT_ID

Include client ID in requests for OAuth

GENERIC_SCOPE

Scope settings for generic OAuth providers

GENERIC_TOKEN_ENDPOINT

Token endpoint for generic OAuth providers

GENERIC_USER_DISPLAY_NAME_ATTRIBUTE

Attribute for user's display name in generic auth

GENERIC_USER_EMAIL_ATTRIBUTE

Attribute for user's email in generic auth

GENERIC_USER_EXTRA_ATTRIBUTES

Comma-separated list of additional fields to extract from generic SSO provider response (e.g., "department,employee_id,groups"). Accessible via CustomOpenID.extra_fields in custom SSO handlers. Supports dot notation for nested fields
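
Dot notation for nested fields can be read as a path lookup into the provider's response. The sketch below shows one way that lookup might work; `get_nested` and `extract_extra_fields` are hypothetical helper names, and returning `None` for missing fields is an assumption.

```python
def get_nested(data, dotted_path, default=None):
    """Follow a dotted path such as 'org.unit.id' through nested dicts."""
    current = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def extract_extra_fields(userinfo, raw_attributes):
    """Map each configured attribute name to its value in the response."""
    names = [n.strip() for n in raw_attributes.split(",") if n.strip()]
    return {name: get_nested(userinfo, name) for name in names}


userinfo = {"department": "eng", "org": {"unit": {"id": 7}}}
print(extract_extra_fields(userinfo, "department,org.unit.id"))
```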

GENERIC_USER_FIRST_NAME_ATTRIBUTE

Attribute for user's first name in generic auth

GENERIC_USER_ID_ATTRIBUTE

Attribute for user ID in generic auth

GENERIC_USER_LAST_NAME_ATTRIBUTE

Attribute for user's last name in generic auth

GENERIC_USER_PROVIDER_ATTRIBUTE

Attribute specifying the user's provider

GENERIC_USER_ROLE_ATTRIBUTE

Attribute specifying the user's role

GENERIC_USERINFO_ENDPOINT

Endpoint to fetch user information in generic OAuth

GENERIC_LOGGER_ENDPOINT

Endpoint URL for the Generic Logger callback to send logs to

GENERIC_LOGGER_HEADERS

JSON string of headers to include in Generic Logger callback requests

GENERIC_ROLE_MAPPINGS_DEFAULT_ROLE

Default LiteLLM role to assign when no role mapping matches in generic SSO. Used with GENERIC_ROLE_MAPPINGS_ROLES

GENERIC_ROLE_MAPPINGS_GROUP_CLAIM

The claim/attribute name in the SSO token that contains the user's groups. Used for role mapping

GENERIC_ROLE_MAPPINGS_ROLES

Python dict string mapping LiteLLM roles to SSO group names. Example: {"proxy_admin": ["admin-group"], "internal_user": ["users"]}
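
Because the value is a Python dict string, it can be read safely with `ast.literal_eval`. The sketch below shows one plausible way to resolve a role from a user's groups. `resolve_role` is a hypothetical helper, and the precedence rule (first matching role in dict order wins) is an assumption rather than documented behavior.

```python
import ast


def resolve_role(roles_raw, user_groups, default_role=None):
    """Return the first role whose SSO groups overlap the user's groups."""
    mapping = ast.literal_eval(roles_raw)  # parses the dict string safely
    groups = set(user_groups)
    for role, sso_groups in mapping.items():
        if groups & set(sso_groups):
            return role
    # No mapping matched: fall back to the configured default, if any.
    return default_role


raw = '{"proxy_admin": ["admin-group"], "internal_user": ["users"]}'
print(resolve_role(raw, ["users"]))
```

The `default_role` argument plays the part of GENERIC_ROLE_MAPPINGS_DEFAULT_ROLE, and `user_groups` would come from the claim named by GENERIC_ROLE_MAPPINGS_GROUP_CLAIM.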

GENERIC_USER_ROLE_MAPPINGS

Alternative to GENERIC_ROLE_MAPPINGS_ROLES for configuring user role mappings from SSO

GEMINI_API_BASE

Base URL for Gemini API. Default is https://generativelanguage.googleapis.com

GITHUB_COPILOT_TOKEN_DIR

Directory to store GitHub Copilot token for github_copilot llm provider

GITHUB_COPILOT_API_KEY_FILE

File to store GitHub Copilot API key for github_copilot llm provider

GITHUB_COPILOT_ACCESS_TOKEN_FILE

File to store GitHub Copilot access token for github_copilot llm provider

GITHUB_COPILOT_API_BASE

Base URL for GitHub Copilot API. For GitHub Enterprise subscriptions with custom host, it is similar to https://copilot-api.my-company.ghe.com. Default is https://api.githubcopilot.com

GITHUB_COPILOT_DEVICE_CODE_URL

URL for GitHub Copilot device code authentication. For GitHub Enterprise subscriptions with custom host, it is similar to https://my-company.ghe.com/login/device/code. Default is https://github.com/login/device/code

GITHUB_COPILOT_ACCESS_TOKEN_URL

URL for GitHub Copilot access token retrieval. For GitHub Enterprise subscriptions with custom host, it is similar to https://my-company.ghe.com/login/oauth/access_token. Default is https://github.com/login/oauth/access_token

GITHUB_COPILOT_API_KEY_URL

URL for GitHub Copilot API key retrieval. For GitHub Enterprise subscriptions with custom host, it is similar to https://my-company.ghe.com/api/v3/copilot_internal/v2/token. Default is https://api.github.com/copilot_internal/v2/token

GITHUB_COPILOT_CLIENT_ID

Client ID for GitHub Copilot device flow authentication. This is used by the github_copilot provider for device code authentication. Default is "Iv1.b507a08c87ecfe98"

GREENSCALE_API_KEY

API key for Greenscale service

GREENSCALE_ENDPOINT

Endpoint URL for Greenscale service

GRAYSWAN_API_BASE

Base URL for GraySwan API. Default is https://api.grayswan.ai

GRAYSWAN_API_KEY

API key for GraySwan Cygnal service

GRAYSWAN_REASONING_MODE

Reasoning mode for GraySwan guardrail

GRAYSWAN_VIOLATION_THRESHOLD

Violation threshold for GraySwan guardrail

GOOGLE_APPLICATION_CREDENTIALS

Path to Google Cloud credentials JSON file

GOOGLE_CLIENT_ID

Client ID for Google OAuth

GOOGLE_CLIENT_SECRET

Client secret for Google OAuth

GOOGLE_KMS_RESOURCE_NAME

Name of the resource in Google KMS

GUARDRAILS_AI_API_BASE

Base URL for Guardrails AI API

HEALTH_CHECK_TIMEOUT_SECONDS

Timeout in seconds for health checks. Default is 60

HEROKU_API_BASE

Base URL for Heroku API

HEROKU_API_KEY

API key for Heroku services

HF_API_BASE

Base URL for Hugging Face API

HCP_VAULT_ADDR

Address for HashiCorp Vault Secret Manager

HCP_VAULT_APPROLE_MOUNT_PATH

Mount path for AppRole authentication in HashiCorp Vault Secret Manager. Default is "approle"

HCP_VAULT_APPROLE_ROLE_ID

Role ID for AppRole authentication in HashiCorp Vault Secret Manager

HCP_VAULT_APPROLE_SECRET_ID

Secret ID for AppRole authentication in HashiCorp Vault Secret Manager

HCP_VAULT_CLIENT_CERT

Path to client certificate for HashiCorp Vault Secret Manager

HCP_VAULT_CLIENT_KEY

Path to client key for HashiCorp Vault Secret Manager

HCP_VAULT_MOUNT_NAME

Mount name for HashiCorp Vault Secret Manager

HCP_VAULT_NAMESPACE

Namespace for HashiCorp Vault Secret Manager

HCP_VAULT_PATH_PREFIX

Path prefix for HashiCorp Vault Secret Manager

HCP_VAULT_TOKEN

Token for HashiCorp Vault Secret Manager

HCP_VAULT_CERT_ROLE

Role for HashiCorp Vault Secret Manager Auth

HELICONE_API_KEY

API key for Helicone service

HELICONE_API_BASE

Base URL for Helicone service. Default is https://api.helicone.ai

HELICONE_MOCK

Enable mock mode for Helicone integration testing. When set to true, intercepts Helicone API calls and returns mock responses without making actual network calls. Default is false

HELICONE_MOCK_LATENCY_MS

Mock latency in milliseconds for Helicone API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms

HOSTNAME

Hostname for the server. This value is emitted in Datadog logs

HOURS_IN_A_DAY

Hours in a day for calculation purposes. Default is 24

HIDDENLAYER_API_BASE

Base URL for HiddenLayer API. Defaults to https://api.hiddenlayer.ai

HIDDENLAYER_AUTH_URL

Authentication URL for HiddenLayer. Defaults to https://auth.hiddenlayer.ai

HIDDENLAYER_CLIENT_ID

Client ID for HiddenLayer SaaS authentication

HIDDENLAYER_CLIENT_SECRET

Client secret for HiddenLayer SaaS authentication

HUGGINGFACE_API_BASE

Base URL for Hugging Face API

HUGGINGFACE_API_KEY

API key for Hugging Face API

HUMANLOOP_PROMPT_CACHE_TTL_SECONDS

Time-to-live in seconds for cached prompts in Humanloop. Default is 60

IAM_TOKEN_DB_AUTH

IAM token for database authentication

IBM_GUARDRAILS_API_BASE

Base URL for IBM Guardrails API

IBM_GUARDRAILS_AUTH_TOKEN

Authorization bearer token for IBM Guardrails API

INITIAL_RETRY_DELAY

Initial delay in seconds for retrying requests. Default is 0.5

JITTER

Jitter factor for retry delay calculations. Default is 0.75
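
To show how INITIAL_RETRY_DELAY, JITTER, and MAX_RETRY_DELAY (documented elsewhere in this list, default 8.0) could combine, here is a sketch of capped exponential backoff with additive jitter. The formula is an assumption chosen for illustration and `backoff_delay` is a hypothetical function; consult the LiteLLM source for the calculation it actually uses.

```python
import random


def backoff_delay(attempt, initial=0.5, maximum=8.0, jitter=0.75, rng=random.random):
    """Delay in seconds before retry number `attempt` (0-based)."""
    # Double the delay on each attempt, but never exceed the maximum.
    base = min(initial * (2.0 ** attempt), maximum)
    # Add a random amount in [0, jitter) so clients do not retry in lockstep.
    return base + jitter * rng()


for attempt in range(5):
    print(attempt, round(backoff_delay(attempt), 3))
```

With the documented defaults the un-jittered delays in this sketch are 0.5, 1, 2, 4, and then 8 seconds for every later attempt.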

JSON_LOGS

Enable JSON formatted logging

JWT_AUDIENCE

Expected audience for JWT tokens

JWT_ISSUER

Expected issuer (iss claim) for JWT tokens. When set, PyJWT verifies the iss claim and rejects tokens from other issuers

JWT_PUBLIC_KEY_URL

URL to fetch public key for JWT verification

LAGO_API_BASE

Base URL for Lago API

LAGO_API_CHARGE_BY

Parameter to determine charge basis in Lago

LAGO_API_EVENT_CODE

Event code for Lago API events

LAGO_API_KEY

API key for accessing Lago services

LANGFUSE_BASE_URL

Base URL for Langfuse service

LANGFUSE_DEBUG

Toggle debug mode for Langfuse

LANGFUSE_FLUSH_INTERVAL

Interval for flushing Langfuse logs

LANGFUSE_TRACING_ENVIRONMENT

Environment for Langfuse tracing

LANGFUSE_HOST

Deprecated host URL for Langfuse service

LANGFUSE_MOCK

Enable mock mode for Langfuse integration testing. When set to true, intercepts Langfuse API calls and returns mock responses without making actual network calls. Default is false

LANGFUSE_MOCK_LATENCY_MS

Mock latency in milliseconds for Langfuse API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms

LANGFUSE_PUBLIC_KEY

Public key for Langfuse authentication

LANGFUSE_RELEASE

Release version of Langfuse integration

LANGFUSE_SECRET_KEY

Secret key for Langfuse authentication

LANGFUSE_PROPAGATE_TRACE_ID

Flag to enable propagating trace ID to Langfuse. Default is False

LANGSMITH_API_KEY

API key for Langsmith platform

LANGSMITH_BASE_URL

Base URL for Langsmith service

LANGSMITH_BATCH_SIZE

Batch size for operations in Langsmith

LANGSMITH_DEFAULT_RUN_NAME

Default name for Langsmith run

LANGSMITH_PROJECT

Project name for Langsmith integration

LANGSMITH_SAMPLING_RATE

Sampling rate for Langsmith logging

LANGSMITH_TENANT_ID

Tenant ID for Langsmith multi-tenant deployments

LANGSMITH_MOCK

Enable mock mode for Langsmith integration testing. When set to true, intercepts Langsmith API calls and returns mock responses without making actual network calls. Default is false

LANGSMITH_MOCK_LATENCY_MS

Mock latency in milliseconds for Langsmith API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms

LANGTRACE_API_KEY

API key for Langtrace service

LASSO_API_BASE

Base URL for Lasso API

LASSO_API_KEY

API key for Lasso service

LASSO_USER_ID

User ID for Lasso service

LASSO_CONVERSATION_ID

Conversation ID for Lasso service

LENGTH_OF_LITELLM_GENERATED_KEY

Length of keys generated by LiteLLM. Default is 16

LEGACY_MULTI_INSTANCE_RATE_LIMITING

Flag to enable legacy multi-instance rate limiting. Default is False

LITERAL_API_KEY

API key for Literal integration

LITERAL_API_URL

API URL for Literal service

LITERAL_BATCH_SIZE

Batch size for Literal operations

LITELLM_ANTHROPIC_BETA_HEADERS_URL

Custom URL for fetching Anthropic beta headers configuration. Default is the GitHub main branch URL

LITELLM_ANTHROPIC_DISABLE_URL_SUFFIX

Disable automatic URL suffix appending for Anthropic API base URLs. When set to true, prevents LiteLLM from automatically adding /v1/messages or /v1/complete to custom Anthropic API endpoints

LITELLM_ASSETS_PATH

Path to directory for UI assets and logos. Used when running with read-only filesystem (e.g., Kubernetes). Default is /var/lib/litellm/assets in Docker.

LITELLM_BLOG_POSTS_URL

Custom URL for fetching LiteLLM blog posts JSON. Default is the GitHub main branch URL

LITELLM_CLI_JWT_EXPIRATION_HOURS

Expiration time in hours for CLI-generated JWT tokens. Default is 24 hours

LITELLM_CLI_SSO_CLAIM_MAP

Alias for CLI_SSO_CLAIM_MAP: allowlisted OIDC claims for CLI SSO attribution metadata

LITELLM_CORS_ALLOW_CREDENTIALS

Set to true to explicitly allow credentials in CORS responses. When not set, credentials are disabled automatically if LITELLM_CORS_ORIGINS is * (wildcard) to prevent the browser security misconfiguration of reflecting any origin with credentials

LITELLM_CORS_ORIGINS

Comma-separated list of allowed CORS origins (e.g. https://app.example.com,https://admin.example.com). Defaults to * (all origins) when not set
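
The relationship between LITELLM_CORS_ORIGINS and LITELLM_CORS_ALLOW_CREDENTIALS can be sketched as a small decision function. `cors_settings` is a hypothetical helper. The wildcard rule comes from the descriptions above; enabling credentials for an explicit origin list when LITELLM_CORS_ALLOW_CREDENTIALS is unset is an assumption, since that case is not stated.

```python
def cors_settings(origins_raw, allow_credentials_raw):
    """Return (origins, allow_credentials) from the two raw env values."""
    if origins_raw is None or not origins_raw.strip():
        origins = ["*"]  # documented default: all origins
    else:
        origins = [o.strip() for o in origins_raw.split(",") if o.strip()]

    if allow_credentials_raw is not None:
        # An explicit setting always wins.
        allow_credentials = allow_credentials_raw.strip().lower() == "true"
    else:
        # Unset: never combine a wildcard origin with credentials.
        # ASSUMPTION: credentials stay enabled for explicit origin lists.
        allow_credentials = "*" not in origins
    return origins, allow_credentials


print(cors_settings("https://app.example.com,https://admin.example.com", None))
```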

LITELLM_DD_AGENT_HOST

Hostname or IP of the Datadog agent for LiteLLM-specific logging. When set, logs are sent to the agent instead of directly to the API

LITELLM_DEPLOYMENT_ENVIRONMENT

Environment name for the deployment (e.g., "production", "staging"). Used as a fallback when OTEL_ENVIRONMENT_NAME is not set. Sets the environment tag in telemetry data

LITELLM_DETAILED_TIMING

When true, adds detailed per-phase timing headers to responses (x-litellm-timing-{pre-processing,llm-api,post-processing,message-copy}-ms). Default is false. See latency overhead docs

LITELLM_DD_AGENT_PORT

Port of the Datadog agent for LiteLLM-specific log intake. Default is 10518

LITELLM_DD_LLM_OBS_PORT

Port for Datadog LLM Observability agent. Default is 8126

LITELLM_DEFAULT_EMBEDDING_ENCODING_FORMAT

Default encoding_format for OpenAI-compatible embedding calls when it is not set on the request or in model litellm_params (e.g. float, base64). Fallback is float. See Embeddings.

LITELLM_DEV_ENV_HOT_RELOAD

Internal flag the proxy sets on itself when started with --reload, signalling reloaded workers to re-read .env with override=True so edits to existing keys take effect on reload. Not meant to be set by users

LITELLM_DONT_SHOW_FEEDBACK_BOX

Flag to hide feedback box in LiteLLM UI

LITELLM_DROP_PARAMS

Parameters to drop in LiteLLM requests

LITELLM_MODIFY_PARAMS

Parameters to modify in LiteLLM requests

LITELLM_EMAIL

Email associated with LiteLLM account

LITELLM_FAVICON_URL

Custom URL for the LiteLLM UI favicon. When set, overrides the default favicon

LITELLM_GLOBAL_MAX_PARALLEL_REQUEST_RETRIES

Maximum retries for parallel requests in LiteLLM

LITELLM_GLOBAL_MAX_PARALLEL_REQUEST_RETRY_TIMEOUT

Timeout for retries of parallel requests in LiteLLM

LITELLM_DISABLE_LAZY_LOADING

When set to "1", "true", "yes", or "on", disables lazy loading of attributes (currently only affects encoding/tiktoken). This ensures encoding is initialized before VCR starts recording HTTP requests, fixing VCR cassette creation issues. See issue #18659

LITELLM_DISABLE_REDACT_SECRETS

When set to "true", disables automatic redaction of secrets (API keys, tokens, credentials) from proxy log output. Secret redaction is enabled by default.

LITELLM_MIGRATION_DIR

Custom migrations directory for prisma migrations, used for baselining db in read-only file systems.

LITELLM_HOSTED_UI

URL of the hosted UI for LiteLLM

LITELLM_UI_API_DOC_BASE_URL

Optional override for the API Reference base URL (used in sample code/docs) when the admin UI runs on a different host than the proxy. Defaults to PROXY_BASE_URL when unset.

LITELLM_UI_PATH

Path to directory for Admin UI files. Used when running with read-only filesystem (e.g., Kubernetes). Default is /var/lib/litellm/ui in Docker.

LITELLM_UI_SESSION_DURATION

Duration for UI login session (username/password, SSO, invitation links). Format: "30s", "30m", "24h", "7d". Does not apply to EXPERIMENTAL_UI_LOGIN flow, which uses a fixed 10-minute expiry for security. Default is "24h"
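
The duration format ("30s", "30m", "24h", "7d") is a number followed by a single unit letter. A minimal parser for that shape might look like the sketch below. `parse_duration` is a hypothetical name, and rejecting anything outside the four listed units is an assumption.

```python
import re
from datetime import timedelta

# Unit letters from the documented examples, mapped to timedelta keywords.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(value):
    """Convert strings like '30m' or '7d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


print(parse_duration("24h"))
```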

LITELLM_EXPIRED_UI_SESSION_KEY_CLEANUP_BATCH_SIZE

Maximum number of expired LiteLLM dashboard session keys to delete per cleanup run. Default is 1000.

LITELLM_EXPIRED_UI_SESSION_KEY_CLEANUP_ENABLED

Set to true to enable the background cleanup job for expired LiteLLM dashboard session keys. Default is false.

LITELLM_EXPIRED_UI_SESSION_KEY_CLEANUP_INTERVAL_SECONDS

Interval in seconds for how often to run the expired LiteLLM dashboard session key cleanup job. Default is 86400 (24 hours).

LITELM_ENVIRONMENT

Environment of LiteLLM Instance, used by logging services. Currently only used by DeepEval.

LITELLM_KEY_ROTATION_ENABLED

Enable auto-key rotation for LiteLLM (boolean). Default is false.

LITELLM_KEY_ROTATION_CHECK_INTERVAL_SECONDS

Interval in seconds for how often to run job that auto-rotates keys. Default is 86400 (24 hours).

LITELLM_KEY_ROTATION_GRACE_PERIOD

Duration to keep old key valid after rotation (e.g. "24h", "2d"). Default is empty (immediate revoke). Used for scheduled rotations and as fallback when not specified in regenerate request.

LITELLM_KEY_ROTATION_LOCK_TTL_SECONDS

TTL in seconds for the distributed lock used by the key rotation job. Default is 600 (10 minutes).

LITELLM_LICENSE

License key for LiteLLM usage

LITELLM_LOCAL_ANTHROPIC_BETA_HEADERS

Set to True to use the local bundled Anthropic beta headers config only, disabling remote fetching. Default is False

LITELLM_OIDC_ALLOWED_CREDENTIAL_DIRS

Comma-separated list of absolute directories from which the oidc/file/ provider is permitted to read token files. Defaults to /var/run/secrets,/run/secrets.
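
An allowlist of directories is usually enforced by resolving the requested path and checking that it sits under one of the permitted roots. The sketch below illustrates that idea only. `is_allowed_token_path` is a hypothetical helper, and the use of `Path.resolve()` to neutralize `..` segments and symlinks is an assumption about how such a check could be written.

```python
from pathlib import Path

DEFAULT_DIRS = "/var/run/secrets,/run/secrets"  # documented default


def is_allowed_token_path(token_path, allowed_dirs_raw=DEFAULT_DIRS):
    """True if token_path resolves to a location inside an allowed directory."""
    candidate = Path(token_path).resolve()
    for raw in allowed_dirs_raw.split(","):
        raw = raw.strip()
        if not raw:
            continue
        allowed = Path(raw).resolve()
        # parents lists every ancestor directory of the resolved file path.
        if allowed in candidate.parents:
            return True
    return False


print(is_allowed_token_path("/var/run/secrets/tokens/oidc-token"))
print(is_allowed_token_path("/var/run/secrets/../../etc/passwd"))
```

Resolving before comparing matters: a plain string prefix check would accept the second path above.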

LITELLM_LOCAL_BLOG_POSTS

When set to True, uses the local bundled blog posts only, disabling remote fetching from GitHub. Default is False

LITELLM_LOCAL_MODEL_COST_MAP

Local configuration for model cost mapping in LiteLLM

LITELLM_LOCAL_POLICY_TEMPLATES

When set to "true", uses local backup policy templates instead of fetching from GitHub. Policy templates are fetched from https://raw.githubusercontent.com/BerriAI/litellm/main/policy_templates.json by default, with automatic fallback to local backup on failure

LITELLM_LOG

Enable detailed logging for LiteLLM

LITELLM_MODEL_COST_MAP_URL

URL for fetching model cost map data. Default is https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json

LITELLM_LOG_FILE

File path to write LiteLLM logs to. When set, logs will be written to both console and the specified file

LITELLM_LOGGER_NAME

Name for OTEL logger

LITELLM_METER_NAME

Name for OTEL Meter

LITELLM_OTEL_INTEGRATION_ENABLE_EVENTS

Optionally enable semantic logs (gen_ai.content.prompt/gen_ai.content.completion, or gen_ai.client.inference.operation.details in semconv mode) for OTEL. Default false. See OpenTelemetry

LITELLM_OTEL_INTEGRATION_ENABLE_METRICS

Optionally enable semantic metrics (TTFT, TPOT, response duration, cost, token usage) for OTEL. Default false. See OpenTelemetry

LITELLM_OTEL_BAGGAGE_TEAM_METADATA_KEYS

Comma-separated allowlist of team-metadata sub-keys promoted onto OTEL spans under litellm.team.metadata. Empty by default, so none of a team's free-form metadata is sent to your tracing backend until each sub-key is explicitly allowlisted. Also settable as baggage_team_metadata_keys under callback_settings.otel in config.yaml. See OpenTelemetry.

LITELLM_ENABLE_PYROSCOPE

If true, enables Pyroscope CPU profiling. Profiles are sent to PYROSCOPE_SERVER_ADDRESS. Off by default. See Pyroscope profiling.

LITELLM_ENABLE_TEAM_STALE_ALIAS_BYPASS

When true, if a team's legacy model_aliases entry maps a public model name to an internal model_name_<team_id>_<uuid> deployment, pre-call handling can skip that rewrite when team-scoped sibling deployments exist for the public name, so that load balancing and order apply across siblings. Default is false for backwards compatibility. When stale aliases are detected and this flag is off, the proxy may log a one-time warning. See Team-scoped models and legacy aliases.

PYROSCOPE_APP_NAME

Application name reported to Pyroscope. Required when LITELLM_ENABLE_PYROSCOPE is true. No default.

PYROSCOPE_SERVER_ADDRESS

Pyroscope server URL to send profiles to. Required when LITELLM_ENABLE_PYROSCOPE is true. No default.

PYROSCOPE_SAMPLE_RATE

Optional. Sample rate for Pyroscope profiling (integer). No default; when unset, the pyroscope-io library default is used.

PYROSCOPE_GRAFANA_USER

Optional. Grafana Cloud Pyroscope user/tenant ID for basic auth. Required when PYROSCOPE_GRAFANA_API_TOKEN is set.

PYROSCOPE_GRAFANA_API_TOKEN

Optional. Grafana Cloud API/access policy token for Pyroscope basic auth. Required when PYROSCOPE_GRAFANA_USER is set.

LITELLM_MASTER_KEY

Master key for proxy authentication

LITELLM_MAX_BUDGET_PER_SESSION_TTL

TTL in seconds for session budget counters used by the max-budget-per-session limiter. Default is 3600 (1 hour)

LITELLM_MAX_ITERATIONS_TTL

TTL in seconds for session iteration counters used by the max-iterations limiter. Default is 3600 (1 hour)

LITELLM_MAX_STREAMING_DURATION_SECONDS

Maximum duration in seconds allowed for a streaming response. Streams exceeding this duration are terminated with a Timeout error. Default is None (no limit)

LITELLM_STREAM_INACTIVITY_TIMEOUT_SECONDS

Maximum seconds to wait for the next chunk from an async streaming provider before raising a Timeout. Guards against a provider that keeps the connection warm with keepalive bytes but stops sending content. Default is None (disabled)

LITELLM_MODE

Operating mode for LiteLLM (e.g., production, development)

LITELLM_NON_ROOT

Flag to run LiteLLM in non-root mode for enhanced security in Docker containers

LITELLM_RATE_LIMIT_WINDOW_SIZE

Rate limit window size for LiteLLM. Default is 60

LITELLM_REASONING_AUTO_SUMMARY

If set to "true", automatically enables detailed reasoning summaries (summary: "detailed") for reasoning models across all translation paths (Anthropic adapter, Responses API, etc.). Default is "false"

LITELLM_SALT_KEY

Salt key for encryption in LiteLLM

LITELLM_SENSITIVE_ROUTING_TTL

TTL in seconds for sticky sensitive-data routing decisions; controls how long a session stays pinned to the on-premise model selected by a routing guardrail. Default is 3600

LITELLM_SSL_CIPHERS

SSL/TLS cipher configuration for faster handshakes. Controls cipher suite preferences for OpenSSL connections.

LITELLM_SECRET_AWS_KMS_LITELLM_LICENSE

AWS KMS encrypted license for LiteLLM

LITELLM_TOKEN

Access token for LiteLLM integration

LITELLM_TPM_TOKEN_RESERVATION_ENABLED

When false, the v3 rate limiter skips the upfront TPM token reservation and enforces TPM post-call from actual usage. Default is true

LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES

When set to "true", routes OpenAI /v1/messages requests through chat/completions instead of the Responses API for Anthropic models. Can also be set via litellm_settings.use_chat_completions_url_for_anthropic_messages

LITELLM_ROUTE_ALL_CHAT_OPENAI_TO_RESPONSES

When set to "true", routes all OpenAI /chat/completions requests through the Responses API bridge. Recommended for OpenAI models. Can also be set via litellm_settings.route_all_chat_openai_to_responses

LITELLM_GEMINI_LIVE_DEFER_SETUP

When set to "true", defers Gemini/Vertex Live setup until the client sends session.update (required for runtime tool injection). Default is "false" for backwards compatibility, which auto-sends setup on connect. Can also be set via litellm.gemini_live_defer_setup

LITELLM_USE_LEGACY_INTERACTIONS_SCHEMA

When set to "true", uses the legacy Google Interactions API schema (outputs array, 2026-05-07 revision) instead of the new schema (steps array, 2026-05-20 revision). The legacy schema will be sunset on June 8, 2026. Can also be set via litellm_settings.use_legacy_interactions_schema

LITELLM_USER_AGENT

Custom user agent string for LiteLLM API requests. Used for partner telemetry attribution

LITELLM_WORKER_STARTUP_HOOKS

Comma-separated list of module.path:function_name callables to run in each worker process during startup. Runs early in the worker lifecycle (before config/DB loading). Useful for re-initializing per-process state like gflags. See Worker Startup Hooks for details
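
The `module.path:function_name` form is a common way to name a callable in configuration. The sketch below shows how such a list could be resolved with the standard library. `load_hooks` is a hypothetical function, not the proxy's loader, and the error handling is an assumption.

```python
import importlib


def load_hooks(raw):
    """Resolve 'pkg.mod:func, other.mod:func' into a list of callables."""
    hooks = []
    for spec in raw.split(","):
        spec = spec.strip()
        if not spec:
            continue
        module_path, sep, func_name = spec.partition(":")
        if not sep or not module_path or not func_name:
            raise ValueError(f"expected module.path:function_name, got {spec!r}")
        module = importlib.import_module(module_path)
        hooks.append(getattr(module, func_name))
    return hooks


# Standard-library callables stand in for real startup hooks here.
for hook in load_hooks("os:getpid, time:time"):
    print(hook.__name__, hook())
```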

LITELLM_PRINT_STANDARD_LOGGING_PAYLOAD

If true, prints the standard logging payload to the console - useful for debugging

LITELLM_ASYNCIO_QUEUE_MAXSIZE

Maximum size for asyncio queues (e.g. log queues, spend update queues, and cookbook examples such as realtime audio in nova_sonic_realtime.py). Bounds in-memory growth to prevent OOM. Default is 1000.

LOGFIRE_TOKEN

Token for Logfire logging service

LOGFIRE_BASE_URL

Base URL for Logfire logging service (useful for self hosted deployments)

LOGGING_WORKER_CONCURRENCY

Maximum number of concurrent coroutine slots for the logging worker on the asyncio event loop. Default is 100. Setting this too high floods the event loop with logging tasks, which increases the overall latency of requests.

LOGGING_WORKER_MAX_QUEUE_SIZE

Maximum size of the logging worker queue. When the queue is full, the worker aggressively clears tasks to make room instead of dropping logs. Default is 50,000

LOGGING_WORKER_MAX_TIME_PER_COROUTINE

Maximum time in seconds allowed for each coroutine in the logging worker before timing out. Default is 20.0

LOGGING_WORKER_CLEAR_PERCENTAGE

Percentage of the queue to extract when clearing. Default is 50%

MAX_BASE64_LENGTH_FOR_LOGGING

Maximum number of base64 characters to keep in logging payloads. Data URIs exceeding this are replaced with a size placeholder. Set to 0 to disable truncation. Default is 64
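
The truncation rule can be illustrated with a regular expression over data URIs. This is a sketch under stated assumptions: `redact_base64` is a hypothetical function, the placeholder wording is invented for the example, and replacing the whole payload once it exceeds the limit is one reading of "replaced with a size placeholder".

```python
import re

# Matches data URIs such as data:image/png;base64,iVBORw0...
_DATA_URI = re.compile(r"data:([\w/+.-]+);base64,([A-Za-z0-9+/=]+)")


def redact_base64(text, max_length=64):
    """Replace oversized base64 payloads in data URIs with a size note."""
    if max_length == 0:
        return text  # 0 disables truncation, as documented

    def _replace(match):
        payload = match.group(2)
        if len(payload) <= max_length:
            return match.group(0)
        # Hypothetical placeholder format.
        return f"data:{match.group(1)};base64,[{len(payload)} base64 chars omitted]"

    return _DATA_URI.sub(_replace, text)


print(redact_base64("image: data:image/png;base64," + "A" * 200))
```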

MAX_COMPETITOR_NAMES

Maximum number of competitor names allowed in policy template enrichment. Default is 100

MAX_EXCEPTION_MESSAGE_LENGTH

Maximum length for exception messages. Default is 2000

MAX_ITERATIONS_TO_CLEAR_QUEUE

Maximum number of iterations to attempt when clearing the logging worker queue during shutdown. Default is 200

MAX_TIME_TO_CLEAR_QUEUE

Maximum time in seconds to spend clearing the logging worker queue during shutdown. Default is 5.0

LOGGING_WORKER_AGGRESSIVE_CLEAR_COOLDOWN_SECONDS

Cooldown time in seconds before allowing another aggressive clear operation when the queue is full. Default is 0.5

MAX_STRING_LENGTH_PROMPT_IN_DB

Maximum length for strings in spend logs when sanitizing request bodies. Strings longer than this will be truncated. Default is 1000

MAX_IN_MEMORY_QUEUE_FLUSH_COUNT

Maximum count for in-memory queue flush operations. Default is 1000

MAX_IMAGE_URL_DOWNLOAD_SIZE_MB

Maximum size in MB for downloading images from URLs. Prevents memory issues from downloading very large images. Images exceeding this limit will be rejected before download. Set to 0 to completely disable image URL handling (all image_url requests will be blocked). Default is 50MB (matching OpenAI's limit)

MAX_LONG_SIDE_FOR_IMAGE_HIGH_RES

Maximum length for the long side of high-resolution images. Default is 2000

MAX_REDIS_BUFFER_DEQUEUE_COUNT

Maximum count for Redis buffer dequeue operations. Default is 100

MAX_SHORT_SIDE_FOR_IMAGE_HIGH_RES

Maximum length for the short side of high-resolution images. Default is 768

MAX_SIZE_IN_MEMORY_QUEUE

Maximum size for in-memory queue. Default is 10000

MAX_SIZE_PER_ITEM_IN_MEMORY_CACHE_IN_KB

Maximum size in KB for each item in memory cache. Default is 512 or 1024

MAX_SPENDLOG_ROWS_TO_QUERY

Maximum number of spend log rows to query. Default is 1,000,000

MAX_TEAM_LIST_LIMIT

Maximum number of teams to list. Default is 20

MAX_TILE_HEIGHT

Maximum height for image tiles. Default is 512

MAX_TILE_WIDTH

Maximum width for image tiles. Default is 512

MAX_TOKEN_TRIMMING_ATTEMPTS

Maximum number of attempts to trim a token message. Default is 10

MAXIMUM_TRACEBACK_LINES_TO_LOG

Maximum number of lines to log in traceback in LiteLLM Logs UI. Default is 100

MAX_RETRY_DELAY

Maximum delay in seconds for retrying requests. Default is 8.0

MAX_LANGFUSE_INITIALIZED_CLIENTS

Maximum number of Langfuse clients to initialize on the proxy. Default is 50. The limit exists because Langfuse starts one thread every time a client is initialized; a past incident reached 100% CPU utilization because Langfuse was initialized many times.

MAX_MCP_SEMANTIC_FILTER_TOOLS_HEADER_LENGTH

Maximum header length for MCP semantic filter tools. Default is 150

MAX_POLICY_ESTIMATE_IMPACT_ROWS

Maximum number of rows returned when estimating the impact of a policy. Default is 1000

MAX_PAYLOAD_SIZE_FOR_DEBUG_LOG

Maximum payload size in bytes for full DEBUG serialization. Payloads exceeding this will be truncated in logs. Default is 102400 (100 KB)

MIN_NON_ZERO_TEMPERATURE

Minimum non-zero temperature value. Default is 0.0001

MINIMUM_PROMPT_CACHE_TOKEN_COUNT

Minimum token count for caching a prompt. Default is 1024

MISTRAL_API_BASE

Base URL for Mistral API. Default is https://api.mistral.ai

MISTRAL_API_KEY

API key for Mistral API

MICROSOFT_AUTHORIZATION_ENDPOINT

Custom authorization endpoint URL for Microsoft SSO (overrides default Microsoft OAuth authorization endpoint)

MICROSOFT_CLIENT_ID

Client ID for Microsoft services

MICROSOFT_CLIENT_SECRET

Client secret for Microsoft services

MICROSOFT_SERVICE_PRINCIPAL_ID

Service Principal ID for the Microsoft Enterprise Application. This is an advanced setting, needed only if you want LiteLLM to auto-assign members to LiteLLM teams based on their Microsoft Entra ID groups.

MICROSOFT_TENANT

Tenant ID for Microsoft Azure

MICROSOFT_TOKEN_ENDPOINT

Custom token endpoint URL for Microsoft SSO (overrides default Microsoft OAuth token endpoint)

MICROSOFT_USER_DISPLAY_NAME_ATTRIBUTE

Field name for user display name in Microsoft SSO response. Default is displayName

MICROSOFT_USER_EMAIL_ATTRIBUTE

Field name for user email in Microsoft SSO response. Default is userPrincipalName

MICROSOFT_USER_FIRST_NAME_ATTRIBUTE

Field name for user first name in Microsoft SSO response. Default is givenName

MICROSOFT_USER_ID_ATTRIBUTE

Field name for user ID in Microsoft SSO response. Default is id

MICROSOFT_USER_LAST_NAME_ATTRIBUTE

Field name for user last name in Microsoft SSO response. Default is surname

MICROSOFT_USERINFO_ENDPOINT

Custom userinfo endpoint URL for Microsoft SSO (overrides default Microsoft Graph userinfo endpoint)

MODEL_COST_MAP_MAX_SHRINK_RATIO

Maximum allowed shrinkage ratio when validating a fetched model cost map against the local backup. Rejects the fetched map if it is smaller than this fraction of the backup. Default is 0.5

MODEL_COST_MAP_MIN_MODEL_COUNT

Minimum number of models a fetched cost map must contain to be considered valid. Default is 50
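
Taken together, MODEL_COST_MAP_MAX_SHRINK_RATIO and MODEL_COST_MAP_MIN_MODEL_COUNT describe two sanity checks on a freshly fetched cost map. The sketch below expresses them as one function. `is_valid_cost_map` is a hypothetical name, and measuring map size by number of entries is an assumption.

```python
def is_valid_cost_map(fetched, backup, max_shrink_ratio=0.5, min_model_count=50):
    """Accept the fetched map only if it passes both size checks."""
    # Check 1: the map must contain a minimum number of models.
    if len(fetched) < min_model_count:
        return False
    # Check 2: it must not be much smaller than the local backup.
    if backup and len(fetched) < max_shrink_ratio * len(backup):
        return False
    return True


backup = {f"model-{i}": {} for i in range(200)}
fetched = {f"model-{i}": {} for i in range(90)}
print(is_valid_cost_map(fetched, backup))
```

In this example the fetched map has 90 entries against a backup of 200, which is below the 0.5 ratio, so a caller would keep using the backup.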

NEW_RELIC_APP_NAME

Application name for New Relic AI Monitoring integration

NEW_RELIC_LICENSE_KEY

License key for New Relic authentication

NO_DOCS

Flag to disable Swagger UI documentation

NO_OPENAPI

Flag to disable the /openapi.json endpoint

NO_REDOC

Flag to disable Redoc documentation

NO_PROXY

List of addresses to bypass proxy

NON_LLM_CONNECTION_TIMEOUT

Timeout in seconds for non-LLM service connections. Default is 15

OAUTH_TOKEN_INFO_ENDPOINT

Endpoint for OAuth token info retrieval

OPENAI_BASE_URL

Base URL for OpenAI API

OPENAI_API_BASE

Base URL for OpenAI API. Default is https://api.openai.com/

OPENAI_API_KEY

API key for OpenAI services

OPENAI_CHATGPT_API_BASE

Alternative to CHATGPT_API_BASE. Base URL for ChatGPT API

OPENAI_FILE_SEARCH_COST_PER_1K_CALLS

Cost per 1000 calls for OpenAI file search. Default is 0.0025

OPENAI_ORGANIZATION

Organization identifier for OpenAI

OPENAPI_URL

The path to the OpenAPI JSON endpoint. By default this is "/openapi.json"

OPENID_BASE_URL

Base URL for OpenID Connect services

OPENID_CLIENT_ID

Client ID for OpenID Connect authentication

OPENID_CLIENT_SECRET

Client secret for OpenID Connect authentication

OPENMETER_API_ENDPOINT

API endpoint for OpenMeter integration

OPENMETER_API_KEY

API key for OpenMeter services

OPENMETER_EVENT_TYPE

Type of events sent to OpenMeter

OPENMETER_TRUST_REQUEST_USER

If false, ignore the request body user and resolve the OpenMeter subject from the authenticated key's user_id. Defaults to true

ONYX_API_BASE

Base URL for Onyx Security AI Guard service (defaults to https://ai-guard.onyx.security)

ONYX_API_KEY

API key for Onyx Security AI Guard service

ONYX_TIMEOUT

Timeout in seconds for Onyx Guard server requests. Default is 10

OTEL_ENDPOINT

OpenTelemetry endpoint for traces

OTEL_EXPORTER_OTLP_ENDPOINT

OpenTelemetry endpoint for traces

OTEL_ENVIRONMENT_NAME

Environment name for OpenTelemetry

OTEL_EXPORTER

Exporter type for OpenTelemetry

OTEL_EXPORTER_OTLP_PROTOCOL

Exporter type for OpenTelemetry

OTEL_HEADERS

Headers for OpenTelemetry requests

OTEL_MODEL_ID

Model ID for OpenTelemetry tracing

OTEL_EXPORTER_OTLP_HEADERS

Headers for OpenTelemetry requests

OTEL_SERVICE_NAME

Service name identifier for OpenTelemetry

OTEL_TRACER_NAME

Tracer name for OpenTelemetry tracing

OTEL_LOGS_EXPORTER

Exporter type for OpenTelemetry logs (e.g., console)

OTEL_IGNORE_CONTEXT_PROPAGATION

When true, ignore parent span context propagation (inbound traceparent headers and any active span) so every LiteLLM trace is its own root. Default false

OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT

Controls whether prompts and completions are captured in OpenTelemetry traces. Accepts NO_CONTENT (default per spec), SPAN_ONLY, EVENT_ONLY, SPAN_AND_EVENT, or the boolean form (true maps to EVENT_ONLY, false to NO_CONTENT)
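
The value mapping described above can be sketched as a small resolver. This is a hedged illustration only: the function name resolve_capture_mode is hypothetical, and both the case-insensitive matching and the fallback to NO_CONTENT for unrecognized values are assumptions rather than documented behavior.

```python
VALID_MODES = {"NO_CONTENT", "SPAN_ONLY", "EVENT_ONLY", "SPAN_AND_EVENT"}
BOOLEAN_FORMS = {"true": "EVENT_ONLY", "false": "NO_CONTENT"}


def resolve_capture_mode(raw):
    """Hypothetical resolver for OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT."""
    if raw is None or not raw.strip():
        return "NO_CONTENT"  # unset: default per spec
    value = raw.strip()
    # Boolean form: true maps to EVENT_ONLY, false to NO_CONTENT.
    if value.lower() in BOOLEAN_FORMS:
        return BOOLEAN_FORMS[value.lower()]
    # Named modes (matched case-insensitively here, which is an assumption).
    if value.upper() in VALID_MODES:
        return value.upper()
    # Assumption: unknown values fall back to the default.
    return "NO_CONTENT"
```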

OTEL_SEMCONV_STABILITY_OPT_IN

Set to gen_ai_latest_experimental to emit spans following the latest OpenTelemetry GenAI semantic conventions. Renames the LLM-call span to {operation} {model}, suppresses raw_gen_ai_request, adds gen_ai.provider.name, and consolidates events. Comma-separable per OTEL spec

USE_OTEL_LITELLM_REQUEST_SPAN

When true, the proxy emits a discrete litellm_request span per LLM call as a child of the Received Proxy Server Request span. Default false (since v1.81.0); LLM-call attributes are set directly on the proxy root span. See "Why don't I see a litellm_request span?"

OTEL_DEBUG

When true, prints exporter and span-creation diagnostics to stderr. Useful when traces aren't reaching your backend. Default false

DEBUG_OTEL

Alias for OTEL_DEBUG

PAGERDUTY_API_KEY

API key for PagerDuty Alerting

PANW_PRISMA_AIRS_API_KEY

API key for PANW Prisma AIRS service

PANW_PRISMA_AIRS_API_BASE

Base URL for PANW Prisma AIRS service

PHOENIX_API_KEY

API key for Arize Phoenix

PHOENIX_COLLECTOR_ENDPOINT

API endpoint for Arize Phoenix

PHOENIX_COLLECTOR_HTTP_ENDPOINT

API http endpoint for Arize Phoenix

PILLAR_API_BASE

Base URL for Pillar API Guardrails

PILLAR_API_KEY

API key for Pillar API Guardrails

PILLAR_ON_FLAGGED_ACTION

Action to take when content is flagged ('block' or 'monitor')

PKCE_STRICT_CACHE_MISS

When set to true, the SSO callback will return a 401 error if the PKCE code_verifier is not found in the cache (e.g. due to a cache miss across pods). When false (default), it logs a warning and continues without the code_verifier.

POD_NAME

Pod name for the server. It is emitted to Datadog logs as POD_NAME

POSTHOG_API_KEY

API key for PostHog analytics integration

POSTHOG_API_URL

Base URL for PostHog API (defaults to https://us.i.posthog.com)

POSTHOG_MOCK

Enable mock mode for PostHog integration testing. When set to true, intercepts PostHog API calls and returns mock responses without making actual network calls. Default is false

POSTHOG_MOCK_LATENCY_MS

Mock latency in milliseconds for PostHog API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms

PRISMA_AUTH_RECONNECT_LOCK_TIMEOUT_SECONDS

Lock timeout in seconds for Prisma auth reconnection. Default is 0.1

PRISMA_AUTH_RECONNECT_TIMEOUT_SECONDS

Timeout in seconds for Prisma auth reconnection attempts. Default is 2.0

PRISMA_HEALTH_WATCHDOG_ENABLED

Enable the Prisma DB health watchdog that monitors and reconnects on connection loss. Default is true

PRISMA_HEALTH_WATCHDOG_INTERVAL_SECONDS

Interval in seconds for Prisma health watchdog probes. Default is 30

PRISMA_HEALTH_WATCHDOG_PROBE_TIMEOUT_SECONDS

Timeout in seconds for each Prisma health probe. Default is 5.0

PRISMA_RECONNECT_COOLDOWN_SECONDS

Cooldown in seconds between Prisma reconnection attempts. Default is 15

PRISMA_RECONNECT_ESCALATION_THRESHOLD

Number of consecutive reconnect failures before escalating the reconnection strategy. Default is 3

PRISMA_WATCHDOG_RECONNECT_TIMEOUT_SECONDS

Timeout in seconds for Prisma watchdog-initiated reconnection. Default is 30.0

PREDIBASE_API_BASE

Base URL for Predibase API

PRESIDIO_ANALYZER_API_BASE

Base URL for Presidio Analyzer service

PRESIDIO_ANONYMIZER_API_BASE

Base URL for Presidio Anonymizer service

PROMETHEUS_BUDGET_METRICS_REFRESH_INTERVAL_MINUTES

Refresh interval in minutes for Prometheus budget metrics. Default is 5

PROMETHEUS_FALLBACK_STATS_SEND_TIME_HOURS

Fallback time in hours for sending stats to Prometheus. Default is 9

PROMETHEUS_URL

URL for Prometheus service

PROMPTLAYER_API_KEY

API key for PromptLayer integration

PROXY_ADMIN_ID

Admin identifier for proxy server

PROXY_BASE_URL

Base URL for proxy service. Also used by the MCP OAuth authorize endpoint as the proxy's public origin when validating browser-supplied redirect_uri values — set this to the exact origin users see in their address bar (e.g. https://llm.example.com) when LiteLLM runs behind a TLS-terminating ingress. Full origin only: scheme + host (+ port if non-default), no trailing slash, no path. When set, it takes precedence over X-Forwarded-* headers (which only apply when use_x_forwarded_for is true AND the request peer is in mcp_trusted_proxy_ranges). See MCP OAuth — Reverse proxy and ingress configuration.
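
The "full origin only" rule for PROXY_BASE_URL can be checked before deployment with a sketch like the one below. The helper name is_bare_origin is hypothetical, and the check is an approximation of the rule as stated above, not LiteLLM's own validation.

```python
from urllib.parse import urlsplit


def is_bare_origin(url: str) -> bool:
    """Hypothetical check: scheme + host (+ optional port), nothing else."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    # No path (not even a trailing slash), no query string, no fragment.
    return parts.path == "" and parts.query == "" and parts.fragment == ""
```

For example, https://llm.example.com and https://llm.example.com:8443 pass, while https://llm.example.com/ and https://llm.example.com/proxy do not.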

PROXY_BATCH_WRITE_AT

Time in seconds to wait before batch writing spend logs to the database. Default is 10

PROXY_BATCH_POLLING_INTERVAL

Time in seconds to wait before polling a batch to check whether it has completed. Default is 3600 (1 hour)

PROXY_BATCH_POLLING_ENABLED

Set to false to disable the CheckBatchCost and CheckResponsesCost background polling jobs entirely. Useful for emergency mitigation on installs with large numbers of stale managed objects. Default is true

MAX_OBJECTS_PER_POLL_CYCLE

Maximum number of managed objects (batches / responses) fetched per polling cycle. Prevents OOM on installs with many stale rows. Default is 50

MANAGED_OBJECT_STALENESS_CUTOFF_DAYS

Managed objects older than this many days in a non-terminal state are marked stale_expired at the start of each poll cycle and skipped. Default is 7

PROXY_BUDGET_RESCHEDULER_MAX_TIME

Maximum time in seconds to wait before checking database for budget resets. Default is 605

PROXY_BUDGET_RESCHEDULER_MIN_TIME

Minimum time in seconds to wait before checking database for budget resets. Default is 597

PYTHON_GC_THRESHOLD

GC thresholds ('gen0,gen1,gen2', e.g. '1000,50,50'); defaults to Python’s values.
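
A value in the 'gen0,gen1,gen2' format can be parsed and applied with the standard library's gc.set_threshold, as in this minimal sketch. The helper names parse_gc_threshold and apply_gc_threshold_from_env are hypothetical; how LiteLLM itself reads the variable is not shown here.

```python
import gc
import os


def parse_gc_threshold(raw: str) -> tuple:
    """Hypothetical parser for a 'gen0,gen1,gen2' string such as '1000,50,50'."""
    parts = [int(p.strip()) for p in raw.split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated integers")
    return tuple(parts)


def apply_gc_threshold_from_env() -> None:
    """Apply PYTHON_GC_THRESHOLD if set; otherwise keep Python's defaults."""
    raw = os.getenv("PYTHON_GC_THRESHOLD")
    if raw:
        gc.set_threshold(*parse_gc_threshold(raw))
```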

PROXY_LOGOUT_URL

URL for logging out of the proxy service

QDRANT_API_BASE

Base URL for Qdrant API

QDRANT_API_KEY

API key for Qdrant service

QDRANT_SCALAR_QUANTILE

Scalar quantile for Qdrant operations. Default is 0.99

QDRANT_URL

Connection URL for Qdrant database

QDRANT_VECTOR_SIZE

Vector size for Qdrant operations. Default is 1536

REDIS_CONNECTION_POOL_TIMEOUT

Timeout in seconds for Redis connection pool. Default is 5

REDIS_CIRCUIT_BREAKER_ENABLED

When false, the Redis circuit breaker is disabled and never opens. Default is true

REDIS_CIRCUIT_BREAKER_FAILURE_THRESHOLD

Number of consecutive failures before the Redis circuit breaker opens. Default is 5

REDIS_CIRCUIT_BREAKER_RECOVERY_TIMEOUT

Time in seconds before the Redis circuit breaker attempts recovery after opening. Default is 60
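
The three REDIS_CIRCUIT_BREAKER_* settings correspond to the usual circuit breaker pattern. The sketch below is a generic illustration of that pattern, not LiteLLM's implementation; the class and method names are hypothetical, and the injectable clock exists only to make the example testable.

```python
import time


class CircuitBreaker:
    """Generic circuit breaker sketch (hypothetical, for illustration)."""

    def __init__(self, failure_threshold=5, recovery_timeout=60.0,
                 enabled=True, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.enabled = enabled
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow_request(self) -> bool:
        if not self.enabled or self._opened_at is None:
            return True
        # Open: allow a trial request only after the recovery timeout.
        return self._clock() - self._opened_at >= self.recovery_timeout

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        if not self.enabled:
            return  # a disabled breaker never opens
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```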

REDIS_CLUSTER_NODES

JSON-formatted list of Redis cluster startup nodes for Redis Cluster mode. Example: [{"host": "node1", "port": 6379}]
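
A value in this format can be sanity-checked with the standard json module before it is exported. The helper name parse_cluster_nodes is hypothetical, and the (host, port) tuple output is just one convenient shape for inspection.

```python
import json


def parse_cluster_nodes(raw: str) -> list:
    """Hypothetical validator for a REDIS_CLUSTER_NODES-style JSON string."""
    nodes = json.loads(raw)
    if not isinstance(nodes, list):
        raise ValueError("expected a JSON list of node objects")
    for node in nodes:
        if not isinstance(node, dict) or "host" not in node or "port" not in node:
            raise ValueError("each node needs 'host' and 'port'")
    return [(node["host"], int(node["port"])) for node in nodes]
```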

REDIS_HOST

Hostname for Redis server

REDIS_PASSWORD

Password for Redis service

REDIS_PORT

Port number for Redis server

REDIS_SOCKET_TIMEOUT

Timeout in seconds for Redis socket operations. Default is 0.1

REDIS_GCP_SERVICE_ACCOUNT

GCP service account for IAM authentication with Redis. Format: "projects/-/serviceAccounts/name@project.iam.gserviceaccount.com"

REDIS_GCP_SSL_CA_CERTS

Path to SSL CA certificate file for secure GCP Memorystore Redis connections

REDOC_URL

The path to the Redoc FastAPI documentation. By default this is "/redoc"

REPEATED_STREAMING_CHUNK_LIMIT

Limit for repeated streaming chunks to detect looping. Default is 100

REALTIME_WEBSOCKET_MAX_MESSAGE_SIZE_BYTES

Maximum size in bytes for WebSocket messages in realtime connections. Default is None.

REPLICATE_MODEL_NAME_WITH_ID_LENGTH

Length of Replicate model names with ID. Default is 64

REPLICATE_POLLING_DELAY_SECONDS

Delay in seconds for Replicate polling operations. Default is 0.5

REQUEST_TIMEOUT

Timeout in seconds for requests. Default is 6000

ROOT_REDIRECT_URL

URL to redirect root path (/) to when DOCS_URL is set to something other than "/" (DOCS_URL is "/" by default)

ROUTER_MAX_FALLBACKS

Maximum number of fallbacks for router. Default is 5

RUBRIK_API_KEY

Bearer token for authenticating with the Rubrik webhook service

RUBRIK_BATCH_SIZE

Number of log entries to buffer before flushing to Rubrik. Default is 512

RUBRIK_SAMPLING_RATE

Fraction of requests to log to Rubrik (0.0 to 1.0). Default is 1.0

RUBRIK_WEBHOOK_URL

Base URL of the Rubrik webhook service for tool blocking and batch logging

RUNWAYML_DEFAULT_API_VERSION

Default API version for RunwayML service. Default is "2024-11-06"

RUNWAYML_POLLING_TIMEOUT

Timeout in seconds for RunwayML image generation polling. Default is 600 (10 minutes)

S3_VECTORS_DEFAULT_DIMENSION

Default vector dimension for S3 Vectors RAG ingestion. Default is 1024

S3_VECTORS_DEFAULT_DISTANCE_METRIC

Default distance metric for S3 Vectors RAG ingestion. Options: "cosine", "euclidean". Default is "cosine"

SECRET_MANAGER_REFRESH_INTERVAL

Refresh interval in seconds for secret manager. Default is 86400 (24 hours)

SERVER_ROOT_PATH

Root path for the server application

SEND_USER_API_KEY_ALIAS

Flag to send user API key alias to Zscaler AI Guard. Default is False

SEND_USER_API_KEY_TEAM_ID

Flag to send user API key team ID to Zscaler AI Guard. Default is False

SEND_USER_API_KEY_USER_ID

Flag to send user API key user ID to Zscaler AI Guard. Default is False

SET_VERBOSE

[DEPRECATED] Use LITELLM_LOG instead with values "INFO", "DEBUG", or "ERROR". See debugging docs

SINGLE_DEPLOYMENT_TRAFFIC_FAILURE_THRESHOLD

Minimum number of requests to consider "reasonable traffic" for single-deployment cooldown logic. Default is 1000

SLACK_DAILY_REPORT_FREQUENCY

Frequency of daily Slack reports (e.g., daily, weekly)

SLACK_WEBHOOK_URL

Webhook URL for Slack integration

SMTP_HOST

Hostname for the SMTP server

SMTP_PASSWORD

Password for SMTP authentication (do not set if SMTP does not require auth)

SMTP_PORT

Port number for SMTP server

SMTP_SENDER_EMAIL

Email address used as the sender in SMTP transactions

SMTP_SENDER_LOGO

Logo used in emails sent via SMTP

SMTP_TLS

Flag to enable or disable TLS for SMTP connections

SMTP_USE_SSL

Set to "True" to force implicit SSL (SMTP_SSL) on any port. Not needed for port 465, which uses implicit SSL automatically; other ports use STARTTLS by default (see SMTP_TLS)
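
The interaction between the port, SMTP_USE_SSL, and SMTP_TLS described above can be summarized as a small decision function. This is a sketch of the rule as documented, with a hypothetical function name; the "plain" branch when TLS is disabled is an assumption.

```python
def choose_smtp_mode(port: int, use_ssl: bool = False, tls: bool = True) -> str:
    """Hypothetical summary of how the SMTP connection mode is selected."""
    # Port 465 uses implicit SSL automatically; SMTP_USE_SSL forces it on any port.
    if use_ssl or port == 465:
        return "implicit_ssl"  # would correspond to smtplib.SMTP_SSL
    # Other ports upgrade the connection with STARTTLS when TLS is enabled.
    if tls:
        return "starttls"  # would correspond to smtplib.SMTP followed by starttls()
    # Assumption: with TLS disabled, the connection stays unencrypted.
    return "plain"
```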

SMTP_USERNAME

Username for SMTP authentication (do not set if SMTP does not require auth)

SENDGRID_API_KEY

API key for SendGrid email service

RESEND_API_KEY

API key for Resend email service

SENDGRID_SENDER_EMAIL

Email address used as the sender in SendGrid email transactions

SPEND_LOGS_URL

URL for retrieving spend logs

SPEND_LOG_CLEANUP_BATCH_SIZE

Number of logs deleted per batch during cleanup. Default is 1000

STALE_OBJECT_CLEANUP_BATCH_SIZE

Max number of stale managed objects updated per cleanup cycle. Default is 1000

SSL_CERTIFICATE

Path to the SSL certificate file

SSL_ECDH_CURVE

ECDH curve for SSL/TLS key exchange (e.g., 'X25519' to disable PQC).

SSL_SECURITY_LEVEL

[BETA] Security level for SSL/TLS connections. E.g. DEFAULT@SECLEVEL=1

SSL_VERIFY

Flag to enable or disable SSL certificate verification

SSL_CERT_FILE

Path to the SSL certificate file for custom CA bundle

SUPABASE_KEY

API key for Supabase service

SUPABASE_URL

Base URL for Supabase instance

STORE_MODEL_IN_DB

If true, enables storing model + credential information in the DB.

SYSTEM_MESSAGE_TOKEN_COUNT

Token count for system messages. Default is 4

TEST_EMAIL_ADDRESS

Email address used for testing purposes

TOGETHER_AI_4_B

Size parameter for Together AI 4B model. Default is 4

TOGETHER_AI_8_B

Size parameter for Together AI 8B model. Default is 8

TOGETHER_AI_21_B

Size parameter for Together AI 21B model. Default is 21

TOGETHER_AI_41_B

Size parameter for Together AI 41B model. Default is 41

TOGETHER_AI_80_B

Size parameter for Together AI 80B model. Default is 80

TOGETHER_AI_110_B

Size parameter for Together AI 110B model. Default is 110

TOGETHER_AI_EMBEDDING_150_M

Size parameter for Together AI 150M embedding model. Default is 150

TOGETHER_AI_EMBEDDING_350_M

Size parameter for Together AI 350M embedding model. Default is 350

TOOL_CHOICE_OBJECT_TOKEN_COUNT

Token count for tool choice objects. Default is 4

TOOL_POLICY_CACHE_TTL_SECONDS

TTL in seconds for caching tool policy guardrail results. Default is 60

UI_LOGO_PATH

Path to the logo image used in the UI

UI_PASSWORD

Password for accessing the UI

UI_USERNAME

Username for accessing the UI

UPSTREAM_LANGFUSE_DEBUG

Flag to enable debugging for upstream Langfuse

UPSTREAM_LANGFUSE_HOST

Host URL for upstream Langfuse service

UPSTREAM_LANGFUSE_PUBLIC_KEY

Public key for upstream Langfuse authentication

UPSTREAM_LANGFUSE_RELEASE

Release version identifier for upstream Langfuse

UPSTREAM_LANGFUSE_SECRET_KEY

Secret key for upstream Langfuse authentication

USE_AWS_KMS

Flag to enable AWS Key Management Service for encryption

USE_PRISMA_MIGRATE

Flag to use prisma migrate instead of prisma db push. Recommended for production environments.

VANTAGE_API_KEY

API key for Vantage cost-import integration

VANTAGE_BASE_URL

Base URL for Vantage API. Default is https://api.vantage.sh

VANTAGE_EXPORT_FREQUENCY

Export frequency for Vantage — hourly (default), daily, or interval

VANTAGE_EXPORT_INTERVAL_SECONDS

Interval in seconds when VANTAGE_EXPORT_FREQUENCY is interval

VANTAGE_INTEGRATION_TOKEN

Vantage integration token for the cost-import endpoint

WANDB_API_KEY

API key for Weights & Biases (W&B) logging integration

WANDB_HOST

Host URL for Weights & Biases (W&B) service

WANDB_PROJECT_ID

Project ID for Weights & Biases (W&B) logging integration

WEBHOOK_URL

URL for receiving webhooks from external services

SPEND_LOG_RUN_LOOPS

Number of batch-delete iterations (1000 rows per batch) that the spend_log_cleanup task performs in one run

SPEND_LOG_PARTITION_INTERVAL

Granularity of LiteLLM_SpendLogs partitions when the table is partitioned: day, week, or month. Default is day

SPEND_LOG_PARTITION_PRECREATE_AHEAD

Number of future spend-log partitions to pre-create on each cleanup run. Default is 7

SPEND_LOG_QUEUE_POLL_INTERVAL

Polling interval in seconds for spend log queue. Default is 2.0

SPEND_LOG_QUEUE_SIZE_THRESHOLD

Threshold for spend log queue size before processing. Default is 100

SPEND_LOG_CLEANUP_MAX_CONSECUTIVE_BATCH_FAILURES

Number of consecutive batch failures tolerated before the spend log cleanup run aborts. Default is 3

SPEND_LOG_CLEANUP_BATCH_FAILURE_BACKOFF_SECONDS

Backoff in seconds between failed spend log cleanup batches. Default is 0.5
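
How the failure limit and backoff might interact during a cleanup run is sketched below. This is an assumption-laden illustration, not the actual spend log cleanup task: the function run_cleanup and the delete_batch callable are hypothetical, and the run is assumed to abort as soon as the consecutive-failure count reaches the limit.

```python
import time


def run_cleanup(delete_batch, max_loops, max_consecutive_failures=3,
                backoff_seconds=0.5, sleep=time.sleep):
    """Hypothetical cleanup loop; delete_batch() returns the rows it deleted."""
    deleted = 0
    failures = 0
    for _ in range(max_loops):
        try:
            count = delete_batch()
        except Exception:
            failures += 1
            if failures >= max_consecutive_failures:
                break  # too many consecutive failures: abort this run
            sleep(backoff_seconds)  # back off before retrying
            continue
        failures = 0  # any success resets the consecutive-failure count
        if count == 0:
            break  # nothing left to delete
        deleted += count
    return deleted
```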

SPEND_COUNTER_RESEED_LOCKS_MAX_SIZE

Max size of the per-counter LRU lock dict used to coalesce concurrent spend-counter reseeds from the DB on the enforcement path. Default is 10000.

COROUTINE_CHECKER_MAX_SIZE_IN_MEMORY

Maximum size for CoroutineChecker in-memory cache. Default is 1000

DEFAULT_SHARED_HEALTH_CHECK_TTL

Time-to-live in seconds for cached health check results in shared health check mode. Default is 300 (5 minutes)

DEFAULT_SHARED_HEALTH_CHECK_LOCK_TTL

Time-to-live in seconds for health check lock in shared health check mode. Default is 60 (1 minute)

ZSCALER_AI_GUARD_API_KEY

API key for Zscaler AI Guard service

ZSCALER_AI_GUARD_POLICY_ID

Policy ID for Zscaler AI Guard guardrails

ZSCALER_AI_GUARD_URL

Base URL for Zscaler AI Guard API. Default is https://api.us1.zseclipse.net/v1/detection/execute-policy