config_settings | liteLLM
ACTIONS_ID_TOKEN_REQUEST_TOKEN
Token for requesting an ID token in GitHub Actions
ACTIONS_ID_TOKEN_REQUEST_URL
URL for requesting an ID token in GitHub Actions
AGENTOPS_ENVIRONMENT
Environment for AgentOps logging integration
AGENTOPS_API_KEY
API Key for AgentOps logging integration
AGENTOPS_SERVICE_NAME
Service Name for AgentOps logging integration
AISPEND_ACCOUNT_ID
Account ID for AI Spend
AISPEND_API_KEY
API Key for AI Spend
AIOHTTP_CONNECTOR_LIMIT
Connection limit for aiohttp connector. When set to 0, no limit is applied. Default is 0
AIOHTTP_CONNECTOR_LIMIT_PER_HOST
Connection limit per host for aiohttp connector. When set to 0, no limit is applied. Default is 0
AIOHTTP_KEEPALIVE_TIMEOUT
Keep-alive timeout for aiohttp connections in seconds. Default is 120
AIOHTTP_SO_KEEPALIVE
Enable TCP SO_KEEPALIVE on aiohttp sockets so idle provider connections are detected and reaped before NAT/load balancers silently drop them. Default is False
AIOHTTP_TCP_KEEPCNT
Number of unacknowledged TCP keepalive probes before the connection is considered dead (applies when AIOHTTP_SO_KEEPALIVE=True). Default is 5
AIOHTTP_TCP_KEEPIDLE
Seconds an aiohttp TCP connection must be idle before keepalive probes are sent (applies when AIOHTTP_SO_KEEPALIVE=True). Default is 60
AIOHTTP_TCP_KEEPINTVL
Seconds between successive aiohttp TCP keepalive probes (applies when AIOHTTP_SO_KEEPALIVE=True). Default is 30
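As a rough illustration of how the three TCP keepalive settings above combine, the sketch below estimates how long an idle connection that was silently dropped can linger before it is declared dead. The helper name dead_peer_detection_seconds is hypothetical and not part of LiteLLM. The formula (idle time plus probe interval times probe count) reflects typical Linux TCP keepalive behavior, and the fallback values are the documented defaults.

```python
import os


def dead_peer_detection_seconds(env=os.environ):
    """Estimate seconds until an idle, dead TCP connection is detected.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    idle = int(env.get("AIOHTTP_TCP_KEEPIDLE", 60))       # idle seconds before the first probe
    interval = int(env.get("AIOHTTP_TCP_KEEPINTVL", 30))  # seconds between probes
    count = int(env.get("AIOHTTP_TCP_KEEPCNT", 5))        # unanswered probes tolerated
    return idle + interval * count


if __name__ == "__main__":
    # With the documented defaults: 60 + 30 * 5 = 210 seconds
    print(dead_peer_detection_seconds({}))
```

If a NAT gateway or load balancer drops idle connections sooner than this estimate, lowering AIOHTTP_TCP_KEEPIDLE is likely the first value to tune.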
AIOHTTP_TRUST_ENV
Flag to enable aiohttp trust environment. When this is set to True, aiohttp will respect HTTP(S)_PROXY env vars. Default is False
AIOHTTP_TTL_DNS_CACHE
DNS cache time-to-live for aiohttp in seconds. Default is 300
AKTO_GUARDRAIL_API_BASE
Base URL for the Akto Guardrail API (e.g. http://localhost:9090). Used by the Akto guardrail integration.
AKTO_API_KEY
API key for authenticating with the Akto Guardrail service.
ALLOWED_EMAIL_DOMAINS
List of email domains allowed for access
APSCHEDULER_COALESCE
Whether to combine multiple pending executions of a job into one. Default is False
APSCHEDULER_MAX_INSTANCES
Maximum number of concurrent instances of each job. Default is 1
APSCHEDULER_MISFIRE_GRACE_TIME
Grace time in seconds for misfired jobs. Default is 1
APSCHEDULER_REPLACE_EXISTING
Whether to replace existing jobs with the same ID. Default is False
ARIZE_API_KEY
API key for Arize platform integration
ARIZE_SPACE_KEY
Space key for Arize platform
ARGILLA_BATCH_SIZE
Batch size for Argilla logging
ARGILLA_API_KEY
API key for Argilla platform
ARGILLA_SAMPLING_RATE
Sampling rate for Argilla logging
ARGILLA_DATASET_NAME
Dataset name for Argilla logging
ARGILLA_BASE_URL
Base URL for Argilla service
ATHINA_API_KEY
API key for Athina service
ATHINA_BASE_URL
Base URL for Athina service (defaults to https://log.athina.ai)
AUTH_STRATEGY
Strategy used for authentication (e.g., OAuth, API key)
AUTO_REDIRECT_UI_LOGIN_TO_SSO
Flag to enable automatic redirect of UI login page to SSO when SSO is configured. Default is false
AUDIO_SPEECH_CHUNK_SIZE
Chunk size for audio speech processing. Default is 1024
ANTHROPIC_API_KEY
API key for Anthropic service. Uses x-api-key header for authentication.
ANTHROPIC_AUTH_TOKEN
Alternative auth token for Anthropic service. Uses Authorization: Bearer header instead of x-api-key. Used as fallback when ANTHROPIC_API_KEY is not set.
ANTHROPIC_API_BASE
Base URL for Anthropic API. Default is https://api.anthropic.com
ANTHROPIC_BASE_URL
Alternative to ANTHROPIC_API_BASE for setting the Anthropic API base URL. Used as fallback when ANTHROPIC_API_BASE is not set.
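The fallback rules described for the four Anthropic variables above can be sketched as follows. This is a minimal illustration, assuming the stated precedence (ANTHROPIC_API_KEY before ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_BASE before ANTHROPIC_BASE_URL); the function name anthropic_request_settings is hypothetical and is not LiteLLM's implementation.

```python
def anthropic_request_settings(env):
    """Return (base_url, auth_headers) from a dict of environment-style settings.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    base_url = (
        env.get("ANTHROPIC_API_BASE")
        or env.get("ANTHROPIC_BASE_URL")   # fallback when ANTHROPIC_API_BASE is not set
        or "https://api.anthropic.com"     # documented default
    )
    headers = {}
    if env.get("ANTHROPIC_API_KEY"):
        # API keys are sent in the x-api-key header
        headers["x-api-key"] = env["ANTHROPIC_API_KEY"]
    elif env.get("ANTHROPIC_AUTH_TOKEN"):
        # Auth tokens are sent as a Bearer token instead
        headers["Authorization"] = f"Bearer {env['ANTHROPIC_AUTH_TOKEN']}"
    return base_url, headers
```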
ANTHROPIC_TOKEN_COUNTING_BETA_VERSION
Beta version header for Anthropic token counting API. Default is token-counting-2024-11-01
AWS_ACCESS_KEY_ID
Access Key ID for AWS services
AWS_BATCH_ROLE_ARN
ARN of the AWS IAM role for batch operations
AWS_DEFAULT_REGION
Default AWS region for service interactions when AWS_REGION is not set
AWS_PROFILE_NAME
AWS CLI profile name to be used
AWS_REGION
AWS region for service interactions (takes precedence over AWS_DEFAULT_REGION)
AWS_REGION_NAME
Default AWS region for service interactions
AWS_ROLE_ARN
ARN of the AWS IAM role to assume for authentication
AWS_ROLE_NAME
Role name for AWS IAM usage
AWS_S3_BUCKET_NAME
Name of the AWS S3 bucket for file operations
AWS_S3_OUTPUT_BUCKET_NAME
Name of the AWS S3 output bucket for batch operations
AWS_SECRET_ACCESS_KEY
Secret Access Key for AWS services
AWS_SESSION_NAME
Name for AWS session
AWS_WEB_IDENTITY_TOKEN
Web identity token for AWS
AWS_WEB_IDENTITY_TOKEN_FILE
Path to file containing web identity token for AWS
AZURE_API_VERSION
Version of the Azure API being used
AZURE_AI_API_BASE
Base URL for Azure AI services (e.g., Azure AI Anthropic)
AZURE_AI_API_KEY
API key for Azure AI services (e.g., Azure AI Anthropic)
AZURE_AUTHORITY_HOST
Azure authority host URL
AZURE_CERTIFICATE_PASSWORD
Password for Azure OpenAI certificate
AZURE_CLIENT_ID
Client ID for Azure services
AZURE_CLIENT_SECRET
Client secret for Azure services
AZURE_COMPUTER_USE_INPUT_COST_PER_1K_TOKENS
Input cost per 1K tokens for Azure Computer Use service
AZURE_COMPUTER_USE_OUTPUT_COST_PER_1K_TOKENS
Output cost per 1K tokens for Azure Computer Use service
AZURE_DEFAULT_RESPONSES_API_VERSION
Version of the Azure Default Responses API being used. Default is "preview"
AZURE_DOCUMENT_INTELLIGENCE_API_VERSION
API version for Azure Document Intelligence service
AZURE_DOCUMENT_INTELLIGENCE_DEFAULT_DPI
Default DPI (dots per inch) setting for Azure Document Intelligence service
AZURE_TENANT_ID
Tenant ID for Azure Active Directory
AZURE_USERNAME
Username for Azure services. Use together with AZURE_PASSWORD to obtain an Azure AD token via the basic username/password workflow
AZURE_PASSWORD
Password for Azure services. Use together with AZURE_USERNAME to obtain an Azure AD token via the basic username/password workflow
AZURE_FEDERATED_TOKEN_FILE
File path to Azure federated token
AZURE_FILE_SEARCH_COST_PER_GB_PER_DAY
Cost per GB per day for Azure File Search service
AZURE_SCOPE
Scope for Azure services, used for Entra ID authentication. Defaults to "https://cognitiveservices.azure.com/.default"
AZURE_SENTINEL_DCR_IMMUTABLE_ID
Immutable ID of the Data Collection Rule for Azure Sentinel logging
AZURE_SENTINEL_STREAM_NAME
Stream name for Azure Sentinel logging
AZURE_SENTINEL_CLIENT_SECRET
Client secret for Azure Sentinel authentication
AZURE_SENTINEL_ENDPOINT
Endpoint for Azure Sentinel logging
AZURE_SENTINEL_TENANT_ID
Tenant ID for Azure Sentinel authentication
AZURE_SENTINEL_CLIENT_ID
Client ID for Azure Sentinel authentication
AZURE_KEY_VAULT_URI
URI for Azure Key Vault
AZURE_OPERATION_POLLING_TIMEOUT
Timeout in seconds for Azure operation polling
AZURE_STORAGE_ACCOUNT_KEY
The Azure Storage Account Key to use for Authentication to Azure Blob Storage logging
AZURE_STORAGE_ACCOUNT_NAME
Name of the Azure Storage Account to use for logging to Azure Blob Storage
AZURE_STORAGE_FILE_SYSTEM
Name of the Azure Storage File System to use for logging to Azure Blob Storage. (Typically the Container name)
AZURE_STORAGE_TENANT_ID
The Application Tenant ID to use for Authentication to Azure Blob Storage logging
AZURE_STORAGE_CLIENT_ID
The Application Client ID to use for Authentication to Azure Blob Storage logging
AZURE_STORAGE_CLIENT_SECRET
The Application Client Secret to use for Authentication to Azure Blob Storage logging
AZURE_VECTOR_STORE_COST_PER_GB_PER_DAY
Cost per GB per day for Azure Vector Store service
BACKGROUND_HEALTH_CHECK_MAX_TOKENS
Optional global default for max_tokens on proxy background health checks when a model has no health_check_max_tokens. If unset, non-wildcard models default to 5. Applies to wildcard routes when set. Default is unset
BACKGROUND_HEALTH_CHECK_MAX_TOKENS_REASONING
For non-wildcard reasoning models (supports_reasoning(model)=true), this takes precedence over BACKGROUND_HEALTH_CHECK_MAX_TOKENS when set. If unset, reasoning models fall back to BACKGROUND_HEALTH_CHECK_MAX_TOKENS (if set) or default behavior. Wildcard routes ignore this. Default is unset
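The resolution order described for the two health-check variables above can be sketched as a single function. This is one reading of the text, not LiteLLM's code: the function name and arguments are hypothetical, the per-model health_check_max_tokens is assumed to win in every case, and wildcard routes with no global value are assumed to send no max_tokens at all (returned here as None).

```python
def health_check_max_tokens(model_setting, is_wildcard, is_reasoning, env):
    """Resolve max_tokens for a background health check.

    Hypothetical helper for illustration only; not a LiteLLM API.
    model_setting is the model's own health_check_max_tokens, or None.
    """
    if model_setting is not None:
        return model_setting

    global_default = env.get("BACKGROUND_HEALTH_CHECK_MAX_TOKENS")
    reasoning_default = env.get("BACKGROUND_HEALTH_CHECK_MAX_TOKENS_REASONING")

    if is_wildcard:
        # Wildcard routes ignore the reasoning override and only use the global value when set
        return int(global_default) if global_default else None
    if is_reasoning and reasoning_default:
        return int(reasoning_default)
    if global_default:
        return int(global_default)
    return 5  # documented default for non-wildcard models
```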
BATCH_STATUS_POLL_INTERVAL_SECONDS
Interval in seconds for polling batch status. Default is 3600 (1 hour)
BATCH_STATUS_POLL_MAX_ATTEMPTS
Maximum number of attempts for polling batch status. Default is 24 (for 24 hours)
BEDROCK_MAX_POLICY_SIZE
Maximum size for Bedrock policy. Default is 75
BEDROCK_MIN_THINKING_BUDGET_TOKENS
Minimum thinking budget in tokens for Bedrock reasoning models. Bedrock returns a 400 error if budget_tokens is below this value. Requests with lower values are clamped to this minimum. Default is 1024
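The clamping behavior described above amounts to taking the larger of the requested budget and the configured minimum. A minimal sketch, with a hypothetical function name and the documented default of 1024:

```python
def clamp_thinking_budget(budget_tokens, minimum=1024):
    """Raise budget_tokens to the Bedrock minimum when it is too low.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    return max(budget_tokens, minimum)
```

Under this sketch a request for 256 tokens would be sent with 1024, while a request for 4096 passes through unchanged.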
BERRISPEND_ACCOUNT_ID
Account ID for BerriSpend service
BRAINTRUST_API_KEY
API key for Braintrust integration
BRAINTRUST_API_BASE
Base URL for Braintrust API. Default is https://api.braintrustdata.com/v1
BRAINTRUST_MOCK
Enable mock mode for Braintrust integration testing. When set to true, intercepts Braintrust API calls and returns mock responses without making actual network calls. Default is false
BRAINTRUST_MOCK_LATENCY_MS
Mock latency in milliseconds for Braintrust API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms
CACHED_STREAMING_CHUNK_DELAY
Delay in seconds for cached streaming chunks. Default is 0.02
CHATGPT_API_BASE
Base URL for ChatGPT API. Default is https://chatgpt.com/backend-api/codex
CHATGPT_AUTH_FILE
Filename for ChatGPT authentication data. Default is "auth.json"
CHATGPT_DEFAULT_INSTRUCTIONS
Default system instructions for ChatGPT provider
CHATGPT_ORIGINATOR
Originator identifier for ChatGPT API requests. Default is "codex_cli_rs"
CHATGPT_TOKEN_DIR
Directory to store ChatGPT authentication tokens. Default is "~/.config/litellm/chatgpt"
CHATGPT_USER_AGENT
Custom user agent string for ChatGPT API requests
CHATGPT_USER_AGENT_SUFFIX
Suffix to append to the ChatGPT user agent string
CIRCLE_OIDC_TOKEN
OpenID Connect token for CircleCI
CIRCLE_OIDC_TOKEN_V2
Version 2 of the OpenID Connect token for CircleCI
CLI_JWT_EXPIRATION_HOURS
Expiration time in hours for CLI-generated JWT tokens. Default is 24 hours. Can also be set via LITELLM_CLI_JWT_EXPIRATION_HOURS
CLI_SSO_CLAIM_MAP
Comma-separated allowlist mapping OIDC claim paths to LiteLLM user metadata keys for CLI SSO (e.g. employment_type->acme_employment_type,org_info.department->department). Scalar values are also returned in /sso/cli/poll as attribution_metadata. Alias: LITELLM_CLI_SSO_CLAIM_MAP
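To make the claim-map format above concrete, the sketch below parses a value such as employment_type->acme_employment_type,org_info.department->department and applies it to a decoded set of OIDC claims. All function names are hypothetical, and the handling of malformed entries (skipped silently) and missing claims (omitted from the result) is an assumption rather than documented behavior.

```python
def parse_claim_map(raw):
    """Parse 'claim.path->metadata_key' pairs separated by commas."""
    mapping = {}
    for entry in raw.split(","):
        claim_path, sep, metadata_key = entry.strip().partition("->")
        claim_path, metadata_key = claim_path.strip(), metadata_key.strip()
        if sep and claim_path and metadata_key:  # assumption: skip malformed entries
            mapping[claim_path] = metadata_key
    return mapping


def lookup_claim(claims, path):
    """Follow a dotted path such as 'org_info.department' through nested dicts."""
    current = claims
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def map_claims(claims, raw):
    """Return user metadata containing only the allowlisted claims."""
    metadata = {}
    for path, key in parse_claim_map(raw).items():
        value = lookup_claim(claims, path)
        if value is not None:
            metadata[key] = value
    return metadata
```

Because the map acts as an allowlist, claims that are not listed never reach the resulting metadata.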
CLOUDZERO_API_KEY
CloudZero API key for authentication
CLOUDZERO_CONNECTION_ID
CloudZero connection ID for data submission
CLOUDZERO_EXPORT_INTERVAL_MINUTES
Interval in minutes for CloudZero data export operations
CLOUDZERO_MAX_FETCHED_DATA_RECORDS
Maximum number of data records to fetch from CloudZero
CLOUDZERO_TIMEZONE
Timezone for CloudZero date handling. Default is UTC
CONFIG_FILE_PATH
File path for configuration file
CYBERARK_ACCOUNT
CyberArk account name for secret management
CYBERARK_API_BASE
Base URL for CyberArk API
CYBERARK_API_KEY
API key for CyberArk secret management service
CYBERARK_CLIENT_CERT
Path to client certificate for CyberArk authentication
CYBERARK_CLIENT_KEY
Path to client key for CyberArk authentication
CYBERARK_USERNAME
Username for CyberArk authentication
CYBERARK_SSL_VERIFY
Flag to enable or disable SSL certificate verification for CyberArk. Default is True
CUSTOM_TIKTOKEN_CACHE_DIR
Custom directory for Tiktoken cache
CONFIDENT_API_KEY
API key for Confident AI (Deepeval) Logging service
COHERE_API_BASE
Base URL for Cohere API. Default is https://api.cohere.com
COMPETITOR_LLM_TEMPERATURE
Temperature setting for the LLM used in competitor discovery. Default is 0.3
CURSOR_API_BASE
API base URL for Cursor AI provider integration. Default is https://api.cursor.com
DATABASE_HOST
Hostname for the database server
DATABASE_HOST_READ_REPLICA
Hostname for the read-replica database server. Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True to assemble DATABASE_URL_READ_REPLICA from RDS IAM env vars
DATABASE_NAME
Name of the database
DATABASE_NAME_READ_REPLICA
Database name for the read replica (defaults to DATABASE_NAME). Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True
DATABASE_PASSWORD
Password for the database user
DATABASE_PORT
Port number for database connection
DATABASE_PORT_READ_REPLICA
Port number for the read replica (default 5432). Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True
DATABASE_SCHEMA
Schema name used in the database
DATABASE_SCHEMA_READ_REPLICA
Schema name for the read replica (defaults to DATABASE_SCHEMA). Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True
DATABASE_URL
Connection URL for the database
DATABASE_URL_READ_REPLICA
Optional read-replica connection URL. When set, the proxy routes read-only queries (find_*, count, group_by, query_raw/_first) to this endpoint while writes continue to use DATABASE_URL. Useful for Aurora-style clusters with separate reader/writer endpoints. Falls back to writer-only behavior when unset. With IAM_TOKEN_DB_AUTH=True, the reader IAM token is auto-refreshed alongside the writer
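The read/write split described above can be pictured as a small routing function keyed on the query method name. This is only a sketch of the idea: the function name is hypothetical, and reading "query_raw/_first" as the two methods query_raw and query_first is an interpretation of the text.

```python
READ_ONLY_METHODS = {"count", "group_by", "query_raw", "query_first"}


def pick_database_url(method_name, env):
    """Choose the reader or writer URL for a database call.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    replica_url = env.get("DATABASE_URL_READ_REPLICA")
    is_read_only = method_name.startswith("find_") or method_name in READ_ONLY_METHODS
    if replica_url and is_read_only:
        return replica_url
    # Writes, and every call when no replica is configured, use the writer
    return env["DATABASE_URL"]
```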
DATABASE_USER
Username for database connection
DATABASE_USER_READ_REPLICA
Database user for the read replica (defaults to DATABASE_USER). Only used by the componentized deployment (experimental) when IAM_TOKEN_DB_AUTH=True
DATABASE_USERNAME
Alias for database user
DATABRICKS_API_BASE
Base URL for Databricks API
DATABRICKS_API_KEY
API key (Personal Access Token) for Databricks API authentication
DATABRICKS_CLIENT_ID
Client ID for Databricks OAuth M2M authentication (Service Principal application ID)
DATABRICKS_CLIENT_SECRET
Client secret for Databricks OAuth M2M authentication
DATABRICKS_USER_AGENT
Custom user agent string for Databricks API requests. Used for partner telemetry attribution
DAYS_IN_A_MONTH
Days in a month for calculation purposes. Default is 28
DAYS_IN_A_WEEK
Days in a week for calculation purposes. Default is 7
DAYS_IN_A_YEAR
Days in a year for calculation purposes. Default is 365
DRAIN_ENDPOINT_TOKEN
Shared secret required on the X-Drain-Token header to call the /health/drain endpoint. When set (here or via general_settings.drain_endpoint_token), drain calls without the matching token are rejected with 401; when unset the endpoint keeps its opt-in-only behavior. Have the kubelet send it from the preStop httpGet.httpHeaders.
DYNAMOAI_API_KEY
API key for DynamoAI Guardrails service
DYNAMOAI_API_BASE
Base URL for DynamoAI API. Default is https://api.dynamo.ai
DYNAMOAI_MODEL_ID
Model ID for DynamoAI tracking/logging purposes
DYNAMOAI_POLICY_IDS
Comma-separated list of DynamoAI policy IDs to apply
DD_BASE_URL
Base URL for Datadog integration
DATADOG_BASE_URL
(Alternative to DD_BASE_URL) Base URL for Datadog integration
_DATADOG_BASE_URL
(Alternative to DD_BASE_URL) Base URL for Datadog integration
DD_AGENT_HOST
Hostname or IP of Datadog agent (e.g., "localhost"). When set, logs are sent to agent instead of direct API
DD_AGENT_PORT
Port of Datadog agent for log intake. Default is 10518
DD_API_KEY
API key for Datadog integration
DD_APP_KEY
Application key for Datadog Cost Management integration. Required along with DD_API_KEY for cost metrics
DD_BATCH_SIZE
Number of log events buffered before flushing to Datadog. Clamped to [1, 1000]; defaults to 1000. Lower it (e.g. 50) if batches exceed Datadog's 5MB request limit
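The clamping of DD_BATCH_SIZE to the range [1, 1000] can be sketched as below. The function name is hypothetical, and falling back to 1000 when the value is missing or not an integer is an assumption made for the example.

```python
def datadog_batch_size(env):
    """Read DD_BATCH_SIZE and clamp it to [1, 1000].

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    raw = env.get("DD_BATCH_SIZE")
    try:
        value = int(raw) if raw is not None else 1000
    except ValueError:
        value = 1000  # assumption: unparsable values fall back to the default
    return max(1, min(1000, value))
```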
DD_SITE
Site URL for Datadog (e.g., datadoghq.com)
DD_SOURCE
Source identifier for Datadog logs
DD_TRACER_STREAMING_CHUNK_YIELD_RESOURCE
Resource name for Datadog tracing of streaming chunk yields. Default is "streaming.chunk.yield"
DD_ENV
Environment identifier for Datadog logs. Only supported for datadog_llm_observability callback
DD_LLMOBS_ML_APP
Default ml_app name for Datadog LLM Observability (Application column). Falls back to DD_SERVICE. Can be overridden per-request via metadata.ml_app.
DD_SERVICE
Service identifier for Datadog logs. Defaults to "litellm-server"
DD_VERSION
Version identifier for Datadog logs. Defaults to "unknown"
DATADOG_MOCK
Enable mock mode for Datadog integration testing. When set to true, intercepts Datadog API calls and returns mock responses without making actual network calls. Default is false
DATADOG_MOCK_LATENCY_MS
Mock latency in milliseconds for Datadog API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms
DEBUG_OTEL
Enable debug mode for OpenTelemetry
DEFAULT_ALLOWED_FAILS
Maximum failures allowed before cooling down a model. Default is 3
DEFAULT_A2A_AGENT_TIMEOUT
Default timeout in seconds for A2A (Agent-to-Agent) protocol requests. Default is 6000
DEFAULT_ACCESS_GROUP_CACHE_TTL
Time-to-live in seconds for cached access group information. Default is 600 (10 minutes)
DEFAULT_ANTHROPIC_CHAT_MAX_TOKENS
Default maximum tokens for Anthropic chat completions. Default is 4096
DEFAULT_BATCH_SIZE
Default batch size for operations. Default is 512
DEFAULT_CHUNK_OVERLAP
Default chunk overlap for RAG text splitters. Default is 200
DEFAULT_CHUNK_SIZE
Default chunk size for RAG text splitters. Default is 1000
DEFAULT_CLIENT_DISCONNECT_CHECK_TIMEOUT_SECONDS
Timeout in seconds for checking client disconnection. Default is 1
DEFAULT_COOLDOWN_TIME_SECONDS
Duration in seconds to cooldown a model after failures. Default is 5
DEFAULT_CRON_JOB_LOCK_TTL_SECONDS
Time-to-live for cron job locks in seconds. Default is 60 (1 minute)
DEFAULT_DATAFORSEO_LOCATION_CODE
Default location code for DataForSEO search API. Default is 2250 (France)
DEFAULT_FAILURE_THRESHOLD_PERCENT
Threshold percentage of failures to cool down a deployment. Default is 0.5 (50%)
DEFAULT_FAILURE_THRESHOLD_MINIMUM_REQUESTS
Minimum number of requests before applying error rate cooldown. Prevents cooldown from triggering on first failure. Default is 5
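The interaction between the failure threshold and the minimum request count above can be sketched as a single check. The function name is hypothetical, and whether the real comparison is strict (greater than) or inclusive is not stated in this document, so the strict form used here is an assumption.

```python
def should_cooldown(failures, total_requests, threshold=0.5, minimum_requests=5):
    """Decide whether a deployment's error rate warrants a cooldown.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    if total_requests < minimum_requests:
        # Too few samples: a single early failure should not trigger a cooldown
        return False
    return failures / total_requests > threshold
```

With the defaults, one failure out of one request does not trigger a cooldown, while three failures out of five do.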
DEFAULT_FLUSH_INTERVAL_SECONDS
Default interval in seconds for flushing operations. Default is 5
DEFAULT_HEALTH_CHECK_INTERVAL
Default interval in seconds for health checks. Default is 300 (5 minutes)
DEFAULT_HEALTH_CHECK_PROMPT
Default prompt used during health checks for non-image models. Default is "test from litellm"
DEFAULT_IMAGE_HEIGHT
Default height for images. Default is 300
DEFAULT_IMAGE_TOKEN_COUNT
Default token count for images. Default is 250
DEFAULT_IMAGE_WIDTH
Default width for images. Default is 300
DEFAULT_IN_MEMORY_TTL
Default time-to-live for in-memory cache in seconds. Default is 5
DEFAULT_MANAGEMENT_OBJECT_IN_MEMORY_CACHE_TTL
Default time-to-live in seconds for management objects (User, Team, Key, Organization) in memory cache. Default is 60 seconds.
DEFAULT_MAX_LRU_CACHE_SIZE
Default maximum size for LRU cache. Default is 64
DEFAULT_MAX_RECURSE_DEPTH
Default maximum recursion depth. Default is 100
DEFAULT_MAX_RECURSE_DEPTH_SENSITIVE_DATA_MASKER
Default maximum recursion depth for sensitive data masker. Default is 10
DEFAULT_MAX_RETRIES
Default maximum retry attempts. Default is 2
DEFAULT_MAX_TOKENS
Default maximum tokens for LLM calls. Default is 4096
DEFAULT_MAX_TOKENS_FOR_TRITON
Default maximum tokens for Triton models. Default is 2000
DEFAULT_MAX_REDIS_BATCH_CACHE_SIZE
Default maximum size for redis batch cache. Default is 1000
DEFAULT_MCP_SEMANTIC_FILTER_EMBEDDING_MODEL
Default embedding model for MCP semantic tool filtering. Default is "text-embedding-3-small"
DEFAULT_MCP_SEMANTIC_FILTER_SIMILARITY_THRESHOLD
Default similarity threshold for MCP semantic tool filtering. Default is 0.3
DEFAULT_MCP_SEMANTIC_FILTER_TOP_K
Default number of top results to return for MCP semantic tool filtering. Default is 10
MCP_NPM_CACHE_DIR
Directory for npm cache used by STDIO MCP servers. In containers the default (~/.npm) may not exist or be read-only. Default is /tmp/.npm_mcp_cache
LITELLM_MCP_CLIENT_TIMEOUT
MCP client connection timeout in seconds (stdio and HTTP/SSE transports). Default is 60
LITELLM_MCP_TOOL_LISTING_TIMEOUT
Timeout in seconds for listing tools from an MCP server. Default is 30
LITELLM_MCP_METADATA_TIMEOUT
HTTP client timeout in seconds for OAuth metadata fetching. Default is 10
LITELLM_MCP_HEALTH_CHECK_TIMEOUT
Health check timeout in seconds for MCP servers. Default is 10
LITELLM_MCP_STDIO_EXTRA_COMMANDS
Comma-separated extra command basenames allowed for MCP stdio transport beyond the built-in allowlist. Example: my-mcp-bin. Empty by default
MCP_OAUTH2_TOKEN_CACHE_DEFAULT_TTL
Default TTL in seconds for MCP OAuth2 token cache. Default is 3600
MCP_OAUTH2_TOKEN_CACHE_MAX_SIZE
Maximum number of entries in MCP OAuth2 token cache. Default is 200
MCP_OAUTH2_TOKEN_CACHE_MIN_TTL
Minimum TTL in seconds for MCP OAuth2 token cache. Default is 10
MCP_OAUTH2_TOKEN_EXPIRY_BUFFER_SECONDS
Seconds to subtract from token expiry when computing cache TTL. Default is 60
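One plausible way the default TTL, minimum TTL, and expiry buffer above fit together is sketched below. The function name is hypothetical and the exact combination is an assumption: the buffer is subtracted from the token's expires_in, the result is floored at the minimum TTL, and the default TTL applies when the provider reports no expiry.

```python
def token_cache_ttl(expires_in, default_ttl=3600, min_ttl=10, expiry_buffer=60):
    """Compute how long to cache an OAuth2 token, in seconds.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    if expires_in is None:
        return default_ttl  # provider did not report an expiry
    # Expire the cache entry slightly before the token itself expires
    return max(min_ttl, expires_in - expiry_buffer)
```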
MCP_PER_USER_TOKEN_DEFAULT_TTL
Default TTL in seconds for per-user MCP OAuth tokens stored in Redis. Default is 43200 (12 hours)
MCP_PER_USER_TOKEN_EXPIRY_BUFFER_SECONDS
Seconds to subtract from per-user MCP OAuth token expiry when computing Redis TTL. Default is 60
MCP_TOKEN_EXCHANGE_CACHE_MAX_SIZE
Maximum number of entries in the MCP OAuth2 token exchange cache. Default is 500
MCP_TRUSTED_REDIRECT_ORIGINS
Comma-separated allowlist of additional redirect_uri origins accepted by the MCP OAuth authorize endpoint, beyond same-origin and loopback. Each entry is host or host:port; a *.suffix prefix matches any strictly-deeper subdomain. HTTPS only. Use this for first-party OAuth clients on sister domains (e.g. app.example.com). For ingressed deployments where the proxy's own origin is wrong, set PROXY_BASE_URL instead. See MCP OAuth — Reverse proxy and ingress configuration.
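The matching rules for the allowlist above (host or host:port entries, a *. prefix for strictly deeper subdomains, HTTPS only) can be sketched as follows. The function name is hypothetical, same-origin and loopback handling is left out, and one detail is an assumption: an entry without a port is compared only against URIs that carry no explicit port.

```python
from urllib.parse import urlsplit


def is_trusted_redirect(redirect_uri, allowlist):
    """Check a redirect_uri against a comma-separated origin allowlist.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    parts = urlsplit(redirect_uri)
    try:
        port = parts.port
    except ValueError:  # malformed port in the URI
        return False
    if parts.scheme != "https" or not parts.hostname:
        return False

    host = parts.hostname.lower()
    candidate = f"{host}:{port}" if port else host

    for entry in allowlist.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry.startswith("*."):
            # "*.example.com" matches "a.example.com" but not "example.com" itself
            if candidate.endswith(entry[1:]):
                return True
        elif candidate == entry:
            return True
    return False
```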
DEFAULT_MOCK_RESPONSE_COMPLETION_TOKEN_COUNT
Default token count for mock response completions. Default is 20
DEFAULT_MOCK_RESPONSE_PROMPT_TOKEN_COUNT
Default token count for mock response prompts. Default is 10
DEFAULT_MODEL_CREATED_AT_TIME
Default creation timestamp for models. Default is 1677610602
DEFAULT_NUM_WORKERS_LITELLM_PROXY
Default number of workers for LiteLLM proxy when NUM_WORKERS is not set. Default is 1. We strongly recommend setting NUM_WORKERS to the number of vCPUs available (e.g. NUM_WORKERS=8 or --num_workers 8).
DEFAULT_PROMPT_INJECTION_SIMILARITY_THRESHOLD
Default threshold for prompt injection similarity. Default is 0.7
DEFAULT_POLLING_INTERVAL
Default polling interval for schedulers in seconds. Default is 0.03
DEFAULT_REASONING_EFFORT_DISABLE_THINKING_BUDGET
Default thinking budget used when reasoning effort is set to disable. Default is 0
DEFAULT_REASONING_EFFORT_HIGH_THINKING_BUDGET
Default high reasoning effort thinking budget. Default is 4096
DEFAULT_REASONING_EFFORT_LOW_THINKING_BUDGET
Default low reasoning effort thinking budget. Default is 1024
DEFAULT_REASONING_EFFORT_MAX_THINKING_BUDGET
Default max reasoning effort thinking budget for legacy Anthropic models that use thinking.budget_tokens (Claude 4.5 series + Haiku). On Claude 4.6/4.7 the max tier is routed via adaptive output_config.effort=max instead and ignores this constant. Default is 16384
DEFAULT_REASONING_EFFORT_MEDIUM_THINKING_BUDGET
Default medium reasoning effort thinking budget. Default is 2048
DEFAULT_REASONING_EFFORT_MINIMAL_THINKING_BUDGET
Default minimal reasoning effort thinking budget. Default is 512
DEFAULT_REASONING_EFFORT_MINIMAL_THINKING_BUDGET_GEMINI_2_5_FLASH
Default minimal reasoning effort thinking budget for Gemini 2.5 Flash. Default is 512
DEFAULT_REASONING_EFFORT_MINIMAL_THINKING_BUDGET_GEMINI_2_5_FLASH_LITE
Default minimal reasoning effort thinking budget for Gemini 2.5 Flash Lite. Default is 512
DEFAULT_REASONING_EFFORT_MINIMAL_THINKING_BUDGET_GEMINI_2_5_PRO
Default minimal reasoning effort thinking budget for Gemini 2.5 Pro. Default is 512
DEFAULT_REASONING_EFFORT_XHIGH_THINKING_BUDGET
Default xhigh reasoning effort thinking budget for legacy Anthropic models that use thinking.budget_tokens. Continues the 2× progression 1024 → 2048 → 4096 → 8192 from low/medium/high. On Claude 4.6/4.7 the xhigh tier is routed via adaptive output_config.effort=xhigh instead and ignores this constant. Default is 8192
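The general (non-Gemini-specific) reasoning-effort budgets listed above form a simple lookup table, which the sketch below collects in one place with optional environment overrides. The function and dictionary names are hypothetical; the variable names and default values are the ones documented above.

```python
DEFAULT_BUDGETS = {
    "disable": 0,
    "minimal": 512,
    "low": 1024,
    "medium": 2048,
    "high": 4096,
    "xhigh": 8192,
    "max": 16384,
}


def thinking_budget(effort, env):
    """Look up the thinking budget for a reasoning effort tier.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    name = f"DEFAULT_REASONING_EFFORT_{effort.upper()}_THINKING_BUDGET"
    return int(env.get(name, DEFAULT_BUDGETS[effort]))
```

As noted in the descriptions, the xhigh and max values only matter for models that take thinking.budget_tokens.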
DEFAULT_REDIS_MAJOR_VERSION
Default Redis major version to assume when version cannot be determined. Default is 7
DEFAULT_REDIS_SYNC_INTERVAL
Default Redis synchronization interval in seconds. Default is 1
DEFAULT_SEMANTIC_GUARD_EMBEDDING_MODEL
Default embedding model for Semantic Guard (route-matching guardrail). Default is "text-embedding-3-small"
DEFAULT_SEMANTIC_GUARD_SIMILARITY_THRESHOLD
Default similarity threshold for Semantic Guard route matching. Default is 0.75
DEFAULT_REPLICATE_GPU_PRICE_PER_SECOND
Default price per second for Replicate GPU. Default is 0.001400
DEFAULT_REPLICATE_POLLING_DELAY_SECONDS
Default delay in seconds for Replicate polling. Default is 1
DEFAULT_REPLICATE_POLLING_RETRIES
Default number of retries for Replicate polling. Default is 5
DEFAULT_SQS_BATCH_SIZE
Default batch size for SQS logging. Default is 512
DEFAULT_SQS_FLUSH_INTERVAL_SECONDS
Default flush interval for SQS logging. Default is 10
DEFAULT_S3_BATCH_SIZE
Default batch size for S3 logging. Default is 512
DEFAULT_S3_FLUSH_INTERVAL_SECONDS
Default flush interval for S3 logging. Default is 10
DEFAULT_SLACK_ALERTING_THRESHOLD
Default threshold for Slack alerting. Default is 300
DEFAULT_SOFT_BUDGET
Default soft budget for LiteLLM proxy keys. Default is 50.0
DEFAULT_TRIM_RATIO
Default ratio of tokens to trim from prompt end. Default is 0.75
DEFAULT_GOOGLE_VIDEO_DURATION_SECONDS
Default duration in seconds for Google video generation. Default is 8
DIRECT_URL
Direct URL for service endpoint
DISABLE_ADMIN_UI
Toggle to disable the admin UI
LITELLM_HIDE_DEFAULT_CREDENTIALS_HINT
Flag to hide the "Default Credentials" info card on the admin UI login page (/ui/login and /fallback/login). Useful when UI credentials are managed via UI_USERNAME / UI_PASSWORD or SSO and the hardcoded hint about admin + MASTER_KEY becomes misleading or is flagged by security scanners. Default is false
DISABLE_AIOHTTP_TRANSPORT
Flag to disable aiohttp transport. When this is set to True, litellm will use httpx instead of aiohttp. Default is False
DISABLE_AIOHTTP_TRUST_ENV
Flag to disable aiohttp trust environment. When this is set to True, litellm will not trust the environment for aiohttp, e.g., the HTTP_PROXY and HTTPS_PROXY environment variables will not be used. Default is False
DISABLE_SCHEMA_UPDATE
Toggle to disable schema updates
DYNAMIC_RATE_LIMIT_ERROR_THRESHOLD_PER_MINUTE
Threshold for deployment failures per minute before enforcing rate limits in parallel request limiter. Default is 1
DOCS_DESCRIPTION
Description text for documentation pages
DOCS_FILTERED
Flag indicating filtered documentation
DOCS_TITLE
Title of the documentation pages
DOCS_URL
The path to the Swagger API documentation. By default this is "/"
EMAIL_LOGO_URL
URL for the logo used in emails
EMAIL_SUPPORT_CONTACT
Support contact email address
EMAIL_SIGNATURE
Custom HTML footer/signature for all emails. Can include HTML tags for formatting and links.
EMAIL_SUBJECT_INVITATION
Custom subject template for invitation emails.
EMAIL_SUBJECT_KEY_CREATED
Custom subject template for key creation emails.
EMAIL_BUDGET_ALERT_MAX_SPEND_ALERT_PERCENTAGE
Percentage of max budget that triggers alerts (as decimal: 0.8 = 80%). Default is 0.8
EMAIL_BUDGET_ALERT_TTL
Time-to-live for budget alert deduplication in seconds. Default is 86400 (24 hours)
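The two budget-alert settings above work together: the percentage decides when an alert is due, and the TTL suppresses repeats. A minimal in-memory sketch of that idea follows; the class name is hypothetical, and a real deployment would presumably keep the deduplication state in a shared cache rather than a local dict.

```python
import time


class BudgetAlertDeduper:
    """Decide whether to send a budget alert email.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """

    def __init__(self, threshold=0.8, ttl_seconds=86400):
        self.threshold = threshold      # fraction of max budget, 0.8 = 80%
        self.ttl_seconds = ttl_seconds  # suppress repeat alerts for this long
        self._last_sent = {}

    def should_alert(self, key, spend, max_budget, now=None):
        now = time.time() if now is None else now
        if max_budget <= 0 or spend / max_budget < self.threshold:
            return False
        last = self._last_sent.get(key)
        if last is not None and now - last < self.ttl_seconds:
            return False  # already alerted within the TTL window
        self._last_sent[key] = now
        return True
```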
ENKRYPTAI_API_BASE
Base URL for EnkryptAI Guardrails API. Default is https://api.enkryptai.com
ENKRYPTAI_API_KEY
API key for EnkryptAI Guardrails service
FAROS_API_KEY
API key for sending LLM usage data to Faros AI
FAROS_API_URL
Base URL for the Faros AI API. Default is https://prod.api.faros.ai
FAROS_GRAPH
Faros graph that LiteLLM usage data is written to. Default is "default"
FAROS_ORIGIN
Origin recorded on rows written to Faros by LiteLLM. Default is "litellm"
FAROS_TOOL_CATEGORY
Tool category recorded on Faros vcs_UserTool rows. Default is "LiteLLM"
FAROS_USER_SOURCE
Source recorded on Faros vcs_User rows for LiteLLM users. Default is "LiteLLM"
FIREWORKS_AI_4_B
Size parameter for Fireworks AI 4B model. Default is 4
FIREWORKS_AI_16_B
Size parameter for Fireworks AI 16B model. Default is 16
FIREWORKS_AI_56_B_MOE
Size parameter for Fireworks AI 56B MOE model. Default is 56
FIREWORKS_AI_80_B
Size parameter for Fireworks AI 80B model. Default is 80
FIREWORKS_AI_176_B_MOE
Size parameter for Fireworks AI 176B MOE model. Default is 176
FOCUS_PROVIDER
Destination provider for Focus exports (e.g., s3). Defaults to s3.
FOCUS_FORMAT
Output format for Focus exports. Defaults to parquet.
FOCUS_FREQUENCY
Frequency for scheduled Focus exports (hourly, daily, or interval). Defaults to hourly.
FOCUS_CRON_OFFSET
Minute offset used when scheduling hourly/daily Focus exports. Defaults to 5 minutes.
FOCUS_INTERVAL_SECONDS
Interval (in seconds) for Focus exports when frequency is interval.
FOCUS_PREFIX
Object key prefix (or folder) used when uploading Focus export files. Defaults to focus_exports.
FOCUS_S3_BUCKET_NAME
S3 bucket to upload Focus export files when using the S3 destination.
FOCUS_S3_REGION_NAME
AWS region for the Focus export S3 bucket.
FOCUS_S3_ENDPOINT_URL
Custom endpoint for the Focus export S3 client (optional; useful for S3-compatible storage).
FOCUS_S3_ACCESS_KEY
AWS access key ID used by the Focus export S3 client.
FOCUS_S3_SECRET_KEY
AWS secret access key used by the Focus export S3 client.
FOCUS_S3_SESSION_TOKEN
AWS session token used by the Focus export S3 client (optional).
MAVVRIK_API_KEY
API key for the Mavvrik FOCUS export integration.
MAVVRIK_API_ENDPOINT
Tenant API endpoint for the Mavvrik FOCUS export, e.g. https://api.mavvrik.ai/<tenant_id>.
MAVVRIK_CONNECTION_ID
AI cost connection ID for the Mavvrik FOCUS export.
MAVVRIK_FOCUS_MAX_ROWS
Maximum rows per export window for the Mavvrik FOCUS destination. Default is 500000.
FOCUS_GCS_BUCKET_NAME
GCS bucket to upload Focus export files when using the GCS destination.
FOCUS_GCS_PATH_SERVICE_ACCOUNT
Path to a service account JSON key file for the Focus export GCS client. Falls back to Application Default Credentials if unset.
FUNCTION_DEFINITION_TOKEN_COUNT
Token count for function definitions. Default is 9
GALILEO_API_KEY
API key for Galileo Cloud (hosted). Used with the v2 spans API when success_callback includes galileo.
GALILEO_BASE_URL
Base URL for Galileo platform. For Galileo Cloud, use https://api.galileo.ai. For enterprise/self-hosted, replace console with api in your console URL.
GALILEO_LOG_STREAM_ID
Log stream ID for Galileo Cloud v2 spans logging (optional).
GALILEO_PASSWORD
Password for Galileo enterprise Observe authentication
GALILEO_PROJECT_ID
Project ID for Galileo usage
GALILEO_USERNAME
Username for Galileo enterprise Observe authentication
GOOGLE_SECRET_MANAGER_PROJECT_ID
Project ID for Google Secret Manager
GRACEFUL_SHUTDOWN_TIMEOUT
Seconds the proxy waits for in-flight requests to drain on shutdown (SIGTERM or the /health/drain preStop hook) before proceeding with teardown. Default is 30
GCS_BUCKET_NAME
Name of the Google Cloud Storage bucket
GCS_MOCK
Enable mock mode for GCS integration testing. When set to true, intercepts GCS API calls and returns mock responses without making actual network calls. Default is false
GCS_MOCK_LATENCY_MS
Mock latency in milliseconds for GCS API calls when mock mode is enabled. Simulates network round-trip time. Default is 150ms
GCS_PATH_SERVICE_ACCOUNT
Path to the Google Cloud service account JSON file
GCS_FLUSH_INTERVAL
Flush interval for GCS logging (in seconds). Specify how often you want a log to be sent to GCS. Default is 20 seconds
GCS_BATCH_SIZE
Batch size for GCS logging. Specifies how many logs are collected before flushing to GCS. If GCS_BATCH_SIZE is set to 10, logs are flushed every 10 logs. Default is 2048
GCS_USE_BATCHED_LOGGING
Enable batched logging for GCS. When enabled (default), multiple log payloads are combined into single GCS object uploads (NDJSON format), dramatically reducing API calls. When disabled, sends each log individually as separate GCS objects (legacy behavior). Default is true
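To illustrate how GCS_BATCH_SIZE, GCS_FLUSH_INTERVAL and NDJSON batching could fit together, here is a minimal sketch. The helper names should_flush and to_ndjson are hypothetical and not part of LiteLLM, and the actual logger may decide when to flush differently.

```python
import json
from typing import List

GCS_BATCH_SIZE = 2048      # documented default
GCS_FLUSH_INTERVAL = 20    # documented default, in seconds


def should_flush(queued_logs: int, seconds_since_flush: float,
                 batch_size: int = GCS_BATCH_SIZE,
                 flush_interval: float = GCS_FLUSH_INTERVAL) -> bool:
    """Assumed rule: flush when the batch is full or the interval has elapsed."""
    return queued_logs >= batch_size or seconds_since_flush >= flush_interval


def to_ndjson(payloads: List[dict]) -> str:
    """Serialize log payloads as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(payload) + "\n" for payload in payloads)
```

With batching enabled, one upload would carry the whole NDJSON string instead of one object per log.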
GCS_PUBSUB_TOPIC_ID
PubSub Topic ID to send LiteLLM SpendLogs to.
GCS_PUBSUB_PROJECT_ID
PubSub Project ID to send LiteLLM SpendLogs to.
GENERIC_AUTHORIZATION_ENDPOINT
Authorization endpoint for generic OAuth providers
GENERIC_CLIENT_ID
Client ID for generic OAuth providers
GENERIC_CLIENT_SECRET
Client secret for generic OAuth providers
GENERIC_CLIENT_STATE
State parameter for generic client authentication
GENERIC_CLIENT_USE_PKCE
Enable PKCE (Proof Key for Code Exchange) for generic OAuth providers. Set to "true" when your OAuth provider requires PKCE. Default is false
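For readers unfamiliar with PKCE, the sketch below shows how a client typically derives the code_verifier and its S256 code_challenge as described in RFC 7636. The function names are hypothetical; this is not LiteLLM's own implementation.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def challenge_for(code_verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without '=' padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge)."""
    # 32 random bytes encode to a 43-character URL-safe verifier,
    # the minimum length RFC 7636 allows.
    raw = secrets.token_bytes(32)
    code_verifier = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return code_verifier, challenge_for(code_verifier)
```

The challenge is sent with the authorization request, and the verifier is sent later with the token request so the provider can check that both came from the same client.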
GENERIC_SSO_HEADERS
Comma-separated list of additional headers to add to the request - e.g. Authorization=Bearer <token>, Content-Type=application/json, etc.
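A minimal sketch of how a value in this format could be turned into a header dictionary. The parser below is an assumption for illustration (parse_sso_headers is a hypothetical name); it splits on commas and on the first "=" only, so header values that themselves contain commas would need different handling.

```python
from typing import Dict


def parse_sso_headers(raw: str) -> Dict[str, str]:
    """Parse 'Name=Value, Other=Value' into a dict of headers."""
    headers: Dict[str, str] = {}
    for item in raw.split(","):
        if "=" not in item:
            continue  # skip empty or malformed entries
        name, value = item.split("=", 1)
        headers[name.strip()] = value.strip()
    return headers
```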
GENERIC_INCLUDE_CLIENT_ID
Include client ID in requests for OAuth
GENERIC_SCOPE
Scope settings for generic OAuth providers
GENERIC_TOKEN_ENDPOINT
Token endpoint for generic OAuth providers
GENERIC_USER_DISPLAY_NAME_ATTRIBUTE
Attribute for user's display name in generic auth
GENERIC_USER_EMAIL_ATTRIBUTE
Attribute for user's email in generic auth
GENERIC_USER_EXTRA_ATTRIBUTES
Comma-separated list of additional fields to extract from generic SSO provider response (e.g., "department,employee_id,groups"). Accessible via CustomOpenID.extra_fields in custom SSO handlers. Supports dot notation for nested fields
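Dot notation here means walking nested objects in the provider response key by key. The sketch below shows one way this lookup could work; get_nested and extract_extra_fields are hypothetical helpers, and returning None for a missing field is an assumption.

```python
from typing import Any, Dict


def get_nested(data: Dict[str, Any], dotted_key: str) -> Any:
    """Follow 'a.b.c' through nested dicts; return None if any step is missing."""
    current: Any = data
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def extract_extra_fields(response: Dict[str, Any], raw_attributes: str) -> Dict[str, Any]:
    """Resolve each comma-separated attribute name against the SSO response."""
    names = [name.strip() for name in raw_attributes.split(",") if name.strip()]
    return {name: get_nested(response, name) for name in names}
```

For example, with GENERIC_USER_EXTRA_ATTRIBUTES set to "department,org.team.id", the second entry would read response["org"]["team"]["id"].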
GENERIC_USER_FIRST_NAME_ATTRIBUTE
Attribute for user's first name in generic auth
GENERIC_USER_ID_ATTRIBUTE
Attribute for user ID in generic auth
GENERIC_USER_LAST_NAME_ATTRIBUTE
Attribute for user's last name in generic auth
GENERIC_USER_PROVIDER_ATTRIBUTE
Attribute specifying the user's provider
GENERIC_USER_ROLE_ATTRIBUTE
Attribute specifying the user's role
GENERIC_USERINFO_ENDPOINT
Endpoint to fetch user information in generic OAuth
GENERIC_LOGGER_ENDPOINT
Endpoint URL for the Generic Logger callback to send logs to
GENERIC_LOGGER_HEADERS
JSON string of headers to include in Generic Logger callback requests
GENERIC_ROLE_MAPPINGS_DEFAULT_ROLE
Default LiteLLM role to assign when no role mapping matches in generic SSO. Used with GENERIC_ROLE_MAPPINGS_ROLES
GENERIC_ROLE_MAPPINGS_GROUP_CLAIM
The claim/attribute name in the SSO token that contains the user's groups. Used for role mapping
GENERIC_ROLE_MAPPINGS_ROLES
Python dict string mapping LiteLLM roles to SSO group names. Example: {"proxy_admin": ["admin-group"], "internal_user": ["users"]}
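The sketch below shows how a mapping in this format could be combined with the groups read from GENERIC_ROLE_MAPPINGS_GROUP_CLAIM and the fallback in GENERIC_ROLE_MAPPINGS_DEFAULT_ROLE. resolve_role is a hypothetical helper, and picking the first matching role in dict order is an assumption; the source does not say how overlapping matches are ranked.

```python
import ast
from typing import List, Optional


def resolve_role(roles_env: str, user_groups: List[str],
                 default_role: Optional[str]) -> Optional[str]:
    """Return the first LiteLLM role whose SSO groups include one of the user's groups."""
    # literal_eval parses a Python dict literal without executing code.
    mapping = ast.literal_eval(roles_env)
    for role, groups in mapping.items():
        if any(group in groups for group in user_groups):
            return role
    return default_role
```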
GENERIC_USER_ROLE_MAPPINGS
Alternative to GENERIC_ROLE_MAPPINGS_ROLES for configuring user role mappings from SSO
GEMINI_API_BASE
Base URL for Gemini API. Default is https://generativelanguage.googleapis.com
GITHUB_COPILOT_TOKEN_DIR
Directory to store GitHub Copilot token for github_copilot llm provider
GITHUB_COPILOT_API_KEY_FILE
File to store GitHub Copilot API key for github_copilot llm provider
GITHUB_COPILOT_ACCESS_TOKEN_FILE
File to store GitHub Copilot access token for github_copilot llm provider
GITHUB_COPILOT_API_BASE
Base URL for GitHub Copilot API. For GitHub Enterprise subscriptions with custom host, it is similar to https://copilot-api.my-company.ghe.com. Default is https://api.githubcopilot.com
GITHUB_COPILOT_DEVICE_CODE_URL
URL for GitHub Copilot device code authentication. For GitHub Enterprise subscriptions with custom host, it is similar to https://my-company.ghe.com/login/device/code. Default is https://github.com/login/device/code
GITHUB_COPILOT_ACCESS_TOKEN_URL
URL for GitHub Copilot access token retrieval. For GitHub Enterprise subscriptions with custom host, it is similar to https://my-company.ghe.com/login/oauth/access_token. Default is https://github.com/login/oauth/access_token
GITHUB_COPILOT_API_KEY_URL
URL for GitHub Copilot API key retrieval. For GitHub Enterprise subscriptions with custom host, it is similar to https://my-company.ghe.com/api/v3/copilot_internal/v2/token. Default is https://api.github.com/copilot_internal/v2/token
GITHUB_COPILOT_CLIENT_ID
Client ID for GitHub Copilot device flow authentication. This is used by the github_copilot provider for device code authentication. Default is "Iv1.b507a08c87ecfe98"
GREENSCALE_API_KEY
API key for Greenscale service
GREENSCALE_ENDPOINT
Endpoint URL for Greenscale service
GRAYSWAN_API_BASE
Base URL for GraySwan API. Default is https://api.grayswan.ai
GRAYSWAN_API_KEY
API key for GraySwan Cygnal service
GRAYSWAN_REASONING_MODE
Reasoning mode for GraySwan guardrail
GRAYSWAN_VIOLATION_THRESHOLD
Violation threshold for GraySwan guardrail
GOOGLE_APPLICATION_CREDENTIALS
Path to Google Cloud credentials JSON file
GOOGLE_CLIENT_ID
Client ID for Google OAuth
GOOGLE_CLIENT_SECRET
Client secret for Google OAuth
GOOGLE_KMS_RESOURCE_NAME
Name of the resource in Google KMS
GUARDRAILS_AI_API_BASE
Base URL for Guardrails AI API
HEALTH_CHECK_TIMEOUT_SECONDS
Timeout in seconds for health checks. Default is 60
HEROKU_API_BASE
Base URL for Heroku API
HEROKU_API_KEY
API key for Heroku services
HF_API_BASE
Base URL for Hugging Face API
HCP_VAULT_ADDR
Address for Hashicorp Vault Secret Manager
HCP_VAULT_APPROLE_MOUNT_PATH
Mount path for AppRole authentication in Hashicorp Vault Secret Manager. Default is "approle"
HCP_VAULT_APPROLE_ROLE_ID
Role ID for AppRole authentication in Hashicorp Vault Secret Manager
HCP_VAULT_APPROLE_SECRET_ID
Secret ID for AppRole authentication in Hashicorp Vault Secret Manager
HCP_VAULT_CLIENT_CERT
Path to client certificate for Hashicorp Vault Secret Manager
HCP_VAULT_CLIENT_KEY
Path to client key for Hashicorp Vault Secret Manager
HCP_VAULT_MOUNT_NAME
Mount name for Hashicorp Vault Secret Manager
HCP_VAULT_NAMESPACE
Namespace for Hashicorp Vault Secret Manager
HCP_VAULT_PATH_PREFIX
Path prefix for Hashicorp Vault Secret Manager
HCP_VAULT_TOKEN
Token for Hashicorp Vault Secret Manager
HCP_VAULT_CERT_ROLE
Role for Hashicorp Vault Secret Manager Auth
HELICONE_API_KEY
API key for Helicone service
HELICONE_API_BASE
Base URL for Helicone service. Default is https://api.helicone.ai
HELICONE_MOCK
Enable mock mode for Helicone integration testing. When set to true, intercepts Helicone API calls and returns mock responses without making actual network calls. Default is false
HELICONE_MOCK_LATENCY_MS
Mock latency in milliseconds for Helicone API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms
HOSTNAME
Hostname for the server. This value is emitted to Datadog logs
HOURS_IN_A_DAY
Hours in a day for calculation purposes. Default is 24
HIDDENLAYER_API_BASE
Base URL for HiddenLayer API. Defaults to https://api.hiddenlayer.ai
HIDDENLAYER_AUTH_URL
Authentication URL for HiddenLayer. Defaults to https://auth.hiddenlayer.ai
HIDDENLAYER_CLIENT_ID
Client ID for HiddenLayer SaaS authentication
HIDDENLAYER_CLIENT_SECRET
Client secret for HiddenLayer SaaS authentication
HUGGINGFACE_API_BASE
Base URL for Hugging Face API
HUGGINGFACE_API_KEY
API key for Hugging Face API
HUMANLOOP_PROMPT_CACHE_TTL_SECONDS
Time-to-live in seconds for cached prompts in Humanloop. Default is 60
IAM_TOKEN_DB_AUTH
IAM token for database authentication
IBM_GUARDRAILS_API_BASE
Base URL for IBM Guardrails API
IBM_GUARDRAILS_AUTH_TOKEN
Authorization bearer token for IBM Guardrails API
INITIAL_RETRY_DELAY
Initial delay in seconds for retrying requests. Default is 0.5
JITTER
Jitter factor for retry delay calculations. Default is 0.75
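INITIAL_RETRY_DELAY, JITTER and MAX_RETRY_DELAY (listed further down) are the usual inputs to exponential backoff with jitter. The sketch below shows one common form using the documented defaults; the exact formula LiteLLM applies is not stated here, so the doubling schedule and the additive jitter are assumptions, and retry_delay is a hypothetical name.

```python
import random
from typing import Optional

INITIAL_RETRY_DELAY = 0.5  # documented default, seconds
MAX_RETRY_DELAY = 8.0      # documented default, seconds
JITTER = 0.75              # documented default


def retry_delay(attempt: int, rng: Optional[random.Random] = None) -> float:
    """Delay before retry number `attempt` (0-based)."""
    source = rng if rng is not None else random
    # Double the delay on each attempt, capped at MAX_RETRY_DELAY.
    base = min(INITIAL_RETRY_DELAY * (2 ** attempt), MAX_RETRY_DELAY)
    # Add a random offset in [0, JITTER) so clients do not retry in lockstep.
    return base + JITTER * source.random()
```

Under these assumptions the base delay grows 0.5, 1, 2, 4, 8 and then stays at 8 seconds.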
JSON_LOGS
Enable JSON formatted logging
JWT_AUDIENCE
Expected audience for JWT tokens
JWT_ISSUER
Expected issuer (iss claim) for JWT tokens. When set, PyJWT verifies the iss claim and rejects tokens from other issuers
JWT_PUBLIC_KEY_URL
URL to fetch public key for JWT verification
LAGO_API_BASE
Base URL for Lago API
LAGO_API_CHARGE_BY
Parameter to determine charge basis in Lago
LAGO_API_EVENT_CODE
Event code for Lago API events
LAGO_API_KEY
API key for accessing Lago services
LANGFUSE_BASE_URL
Base URL for Langfuse service
LANGFUSE_DEBUG
Toggle debug mode for Langfuse
LANGFUSE_FLUSH_INTERVAL
Interval for flushing Langfuse logs
LANGFUSE_TRACING_ENVIRONMENT
Environment for Langfuse tracing
LANGFUSE_HOST
Deprecated host URL for Langfuse service
LANGFUSE_MOCK
Enable mock mode for Langfuse integration testing. When set to true, intercepts Langfuse API calls and returns mock responses without making actual network calls. Default is false
LANGFUSE_MOCK_LATENCY_MS
Mock latency in milliseconds for Langfuse API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms
LANGFUSE_PUBLIC_KEY
Public key for Langfuse authentication
LANGFUSE_RELEASE
Release version of Langfuse integration
LANGFUSE_SECRET_KEY
Secret key for Langfuse authentication
LANGFUSE_PROPAGATE_TRACE_ID
Flag to enable propagating trace ID to Langfuse. Default is False
LANGSMITH_API_KEY
API key for Langsmith platform
LANGSMITH_BASE_URL
Base URL for Langsmith service
LANGSMITH_BATCH_SIZE
Batch size for operations in Langsmith
LANGSMITH_DEFAULT_RUN_NAME
Default name for Langsmith run
LANGSMITH_PROJECT
Project name for Langsmith integration
LANGSMITH_SAMPLING_RATE
Sampling rate for Langsmith logging
LANGSMITH_TENANT_ID
Tenant ID for Langsmith multi-tenant deployments
LANGSMITH_MOCK
Enable mock mode for Langsmith integration testing. When set to true, intercepts Langsmith API calls and returns mock responses without making actual network calls. Default is false
LANGSMITH_MOCK_LATENCY_MS
Mock latency in milliseconds for Langsmith API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms
LANGTRACE_API_KEY
API key for Langtrace service
LASSO_API_BASE
Base URL for Lasso API
LASSO_API_KEY
API key for Lasso service
LASSO_USER_ID
User ID for Lasso service
LASSO_CONVERSATION_ID
Conversation ID for Lasso service
LENGTH_OF_LITELLM_GENERATED_KEY
Length of keys generated by LiteLLM. Default is 16
LEGACY_MULTI_INSTANCE_RATE_LIMITING
Flag to enable legacy multi-instance rate limiting. Default is False
LITERAL_API_KEY
API key for Literal integration
LITERAL_API_URL
API URL for Literal service
LITERAL_BATCH_SIZE
Batch size for Literal operations
LITELLM_ANTHROPIC_BETA_HEADERS_URL
Custom URL for fetching Anthropic beta headers configuration. Default is the GitHub main branch URL
LITELLM_ANTHROPIC_DISABLE_URL_SUFFIX
Disable automatic URL suffix appending for Anthropic API base URLs. When set to true, prevents LiteLLM from automatically adding /v1/messages or /v1/complete to custom Anthropic API endpoints
LITELLM_ASSETS_PATH
Path to directory for UI assets and logos. Used when running with read-only filesystem (e.g., Kubernetes). Default is /var/lib/litellm/assets in Docker.
LITELLM_BLOG_POSTS_URL
Custom URL for fetching LiteLLM blog posts JSON. Default is the GitHub main branch URL
LITELLM_CLI_JWT_EXPIRATION_HOURS
Expiration time in hours for CLI-generated JWT tokens. Default is 24 hours
LITELLM_CLI_SSO_CLAIM_MAP
Alias for CLI_SSO_CLAIM_MAP — allowlisted OIDC claims for CLI SSO attribution metadata
LITELLM_CORS_ALLOW_CREDENTIALS
Set to true to explicitly allow credentials in CORS responses. When not set, credentials are disabled automatically if LITELLM_CORS_ORIGINS is * (wildcard) to prevent the browser security misconfiguration of reflecting any origin with credentials
LITELLM_CORS_ORIGINS
Comma-separated list of allowed CORS origins (e.g. https://app.example.com,https://admin.example.com). Defaults to * (all origins) when not set
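The interaction between LITELLM_CORS_ORIGINS and LITELLM_CORS_ALLOW_CREDENTIALS can be sketched as follows. cors_settings is a hypothetical helper; it assumes credentials stay enabled for an explicit origin list when LITELLM_CORS_ALLOW_CREDENTIALS is unset, which the descriptions imply but do not state.

```python
from typing import List, Optional, Tuple


def cors_settings(origins_env: Optional[str],
                  allow_credentials_env: Optional[str]) -> Tuple[List[str], bool]:
    """Return (allowed_origins, allow_credentials) from the two env values."""
    if origins_env:
        origins = [o.strip() for o in origins_env.split(",") if o.strip()]
    else:
        origins = ["*"]  # documented default when unset

    if allow_credentials_env is not None:
        # Explicit opt-in wins, even with a wildcard origin.
        allow_credentials = allow_credentials_env.strip().lower() == "true"
    else:
        # Unset: never combine a wildcard origin with credentials.
        allow_credentials = "*" not in origins
    return origins, allow_credentials
```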
LITELLM_DD_AGENT_HOST
Hostname or IP of DataDog agent for LiteLLM-specific logging. When set, logs are sent to agent instead of direct API
LITELLM_DEPLOYMENT_ENVIRONMENT
Environment name for the deployment (e.g., "production", "staging"). Used as a fallback when OTEL_ENVIRONMENT_NAME is not set. Sets the environment tag in telemetry data
LITELLM_DETAILED_TIMING
When true, adds detailed per-phase timing headers to responses (x-litellm-timing-{pre-processing,llm-api,post-processing,message-copy}-ms). Default is false. See latency overhead docs
LITELLM_DD_AGENT_PORT
Port of DataDog agent for LiteLLM-specific log intake. Default is 10518
LITELLM_DD_LLM_OBS_PORT
Port for Datadog LLM Observability agent. Default is 8126
LITELLM_DEFAULT_EMBEDDING_ENCODING_FORMAT
Default encoding_format for OpenAI-compatible embedding calls when it is not set on the request or in model litellm_params (e.g. float, base64). Fallback is float. See Embeddings.
LITELLM_DEV_ENV_HOT_RELOAD
Internal flag the proxy sets on itself when started with --reload, signalling reloaded workers to re-read .env with override=True so edits to existing keys take effect on reload. Not meant to be set by users
LITELLM_DONT_SHOW_FEEDBACK_BOX
Flag to hide feedback box in LiteLLM UI
LITELLM_DROP_PARAMS
Parameters to drop in LiteLLM requests
LITELLM_MODIFY_PARAMS
Parameters to modify in LiteLLM requests
LITELLM_EMAIL
Email associated with LiteLLM account
LITELLM_FAVICON_URL
Custom URL for the LiteLLM UI favicon. When set, overrides the default favicon
LITELLM_GLOBAL_MAX_PARALLEL_REQUEST_RETRIES
Maximum retries for parallel requests in LiteLLM
LITELLM_GLOBAL_MAX_PARALLEL_REQUEST_RETRY_TIMEOUT
Timeout for retries of parallel requests in LiteLLM
LITELLM_DISABLE_LAZY_LOADING
When set to "1", "true", "yes", or "on", disables lazy loading of attributes (currently only affects encoding/tiktoken). This ensures encoding is initialized before VCR starts recording HTTP requests, fixing VCR cassette creation issues. See issue #18659
LITELLM_DISABLE_REDACT_SECRETS
When set to "true", disables automatic redaction of secrets (API keys, tokens, credentials) from proxy log output. Secret redaction is enabled by default.
LITELLM_MIGRATION_DIR
Custom migrations directory for prisma migrations, used for baselining db in read-only file systems.
LITELLM_HOSTED_UI
URL of the hosted UI for LiteLLM
LITELLM_UI_API_DOC_BASE_URL
Optional override for the API Reference base URL (used in sample code/docs) when the admin UI runs on a different host than the proxy. Defaults to PROXY_BASE_URL when unset.
LITELLM_UI_PATH
Path to directory for Admin UI files. Used when running with read-only filesystem (e.g., Kubernetes). Default is /var/lib/litellm/ui in Docker.
LITELLM_UI_SESSION_DURATION
Duration for UI login session (username/password, SSO, invitation links). Format: "30s", "30m", "24h", "7d". Does not apply to EXPERIMENTAL_UI_LOGIN flow, which uses a fixed 10-minute expiry for security. Default is "24h"
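The duration strings above follow a number-plus-unit pattern. A minimal parsing sketch, assuming only the four units shown in the format examples (s, m, h, d); parse_duration is a hypothetical helper, not LiteLLM's parser.

```python
import re
from datetime import timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(value: str) -> timedelta:
    """Convert strings such as '30s', '30m', '24h' or '7d' to a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```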
LITELLM_EXPIRED_UI_SESSION_KEY_CLEANUP_BATCH_SIZE
Maximum number of expired LiteLLM dashboard session keys to delete per cleanup run. Default is 1000.
LITELLM_EXPIRED_UI_SESSION_KEY_CLEANUP_ENABLED
Set to true to enable the background cleanup job for expired LiteLLM dashboard session keys. Default is false.
LITELLM_EXPIRED_UI_SESSION_KEY_CLEANUP_INTERVAL_SECONDS
Interval in seconds for how often to run the expired LiteLLM dashboard session key cleanup job. Default is 86400 (24 hours).
LITELM_ENVIRONMENT
Environment of LiteLLM Instance, used by logging services. Currently only used by DeepEval.
LITELLM_KEY_ROTATION_ENABLED
Enable auto-key rotation for LiteLLM (boolean). Default is false.
LITELLM_KEY_ROTATION_CHECK_INTERVAL_SECONDS
Interval in seconds for how often to run job that auto-rotates keys. Default is 86400 (24 hours).
LITELLM_KEY_ROTATION_GRACE_PERIOD
Duration to keep old key valid after rotation (e.g. "24h", "2d"). Default is empty (immediate revoke). Used for scheduled rotations and as fallback when not specified in regenerate request.
LITELLM_KEY_ROTATION_LOCK_TTL_SECONDS
TTL in seconds for the distributed lock used by the key rotation job. Default is 600 (10 minutes).
LITELLM_LICENSE
License key for LiteLLM usage
LITELLM_LOCAL_ANTHROPIC_BETA_HEADERS
Set to True to use the local bundled Anthropic beta headers config only, disabling remote fetching. Default is False
LITELLM_OIDC_ALLOWED_CREDENTIAL_DIRS
Comma-separated list of absolute directories from which the oidc/file/ provider is permitted to read token files. Defaults to /var/run/secrets,/run/secrets.
LITELLM_LOCAL_BLOG_POSTS
When set to True, uses the local bundled blog posts only, disabling remote fetching from GitHub. Default is False
LITELLM_LOCAL_MODEL_COST_MAP
Local configuration for model cost mapping in LiteLLM
LITELLM_LOCAL_POLICY_TEMPLATES
When set to "true", uses local backup policy templates instead of fetching from GitHub. Policy templates are fetched from https://raw.githubusercontent.com/BerriAI/litellm/main/policy_templates.json by default, with automatic fallback to local backup on failure
LITELLM_LOG
Enable detailed logging for LiteLLM
LITELLM_MODEL_COST_MAP_URL
URL for fetching model cost map data. Default is https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json
LITELLM_LOG_FILE
File path to write LiteLLM logs to. When set, logs will be written to both console and the specified file
LITELLM_LOGGER_NAME
Name for OTEL logger
LITELLM_METER_NAME
Name for OTEL Meter
LITELLM_OTEL_INTEGRATION_ENABLE_EVENTS
Optionally enable semantic logs (gen_ai.content.prompt/gen_ai.content.completion, or gen_ai.client.inference.operation.details in semconv mode) for OTEL. Default false. See OpenTelemetry
LITELLM_OTEL_INTEGRATION_ENABLE_METRICS
Optionally enable semantic metrics (TTFT, TPOT, response duration, cost, token usage) for OTEL. Default false. See OpenTelemetry
LITELLM_OTEL_BAGGAGE_TEAM_METADATA_KEYS
Comma-separated allowlist of team-metadata sub-keys promoted onto OTEL spans under litellm.team.metadata. Empty by default, so none of a team's free-form metadata is sent to your tracing backend until each sub-key is explicitly allowlisted. Also settable as baggage_team_metadata_keys under callback_settings.otel in config.yaml. See OpenTelemetry.
LITELLM_ENABLE_PYROSCOPE
If true, enables Pyroscope CPU profiling. Profiles are sent to PYROSCOPE_SERVER_ADDRESS. Off by default. See Pyroscope profiling.
LITELLM_ENABLE_TEAM_STALE_ALIAS_BYPASS
When true, if a team's legacy model_aliases entry maps a public model name to an internal model_name_<team_id>_<uuid> deployment, pre-call handling can skip that rewrite when team-scoped sibling deployments exist for the public name—so load balancing / order apply across siblings. Default is false for backwards compatibility. See Team-scoped models and legacy aliases. When stale aliases are detected and this flag is off, the proxy may log a one-time warning.
PYROSCOPE_APP_NAME
Application name reported to Pyroscope. Required when LITELLM_ENABLE_PYROSCOPE is true. No default.
PYROSCOPE_SERVER_ADDRESS
Pyroscope server URL to send profiles to. Required when LITELLM_ENABLE_PYROSCOPE is true. No default.
PYROSCOPE_SAMPLE_RATE
Optional. Sample rate for Pyroscope profiling (integer). No default; when unset, the pyroscope-io library default is used.
PYROSCOPE_GRAFANA_USER
Optional. Grafana Cloud Pyroscope user/tenant ID for basic auth. Required when PYROSCOPE_GRAFANA_API_TOKEN is set.
PYROSCOPE_GRAFANA_API_TOKEN
Optional. Grafana Cloud API/access policy token for Pyroscope basic auth. Required when PYROSCOPE_GRAFANA_USER is set.
LITELLM_MASTER_KEY
Master key for proxy authentication
LITELLM_MAX_BUDGET_PER_SESSION_TTL
TTL in seconds for session budget counters used by the max-budget-per-session limiter. Default is 3600 (1 hour)
LITELLM_MAX_ITERATIONS_TTL
TTL in seconds for session iteration counters used by the max-iterations limiter. Default is 3600 (1 hour)
LITELLM_MAX_STREAMING_DURATION_SECONDS
Maximum duration in seconds allowed for a streaming response. Streams exceeding this duration are terminated with a Timeout error. Default is None (no limit)
LITELLM_STREAM_INACTIVITY_TIMEOUT_SECONDS
Maximum seconds to wait for the next chunk from an async streaming provider before raising a Timeout. Guards against a provider that keeps the connection warm with keepalive bytes but stops sending content. Default is None (disabled)
LITELLM_MODE
Operating mode for LiteLLM (e.g., production, development)
LITELLM_NON_ROOT
Flag to run LiteLLM in non-root mode for enhanced security in Docker containers
LITELLM_RATE_LIMIT_WINDOW_SIZE
Rate limit window size for LiteLLM. Default is 60
LITELLM_REASONING_AUTO_SUMMARY
If set to "true", automatically enables detailed reasoning summaries (summary: "detailed") for reasoning models across all translation paths (Anthropic adapter, Responses API, etc.). Default is "false"
LITELLM_SALT_KEY
Salt key for encryption in LiteLLM
LITELLM_SENSITIVE_ROUTING_TTL
TTL in seconds for sticky sensitive-data routing decisions; controls how long a session stays pinned to the on-premise model selected by a routing guardrail. Default is 3600
LITELLM_SSL_CIPHERS
SSL/TLS cipher configuration for faster handshakes. Controls cipher suite preferences for OpenSSL connections.
LITELLM_SECRET_AWS_KMS_LITELLM_LICENSE
AWS KMS encrypted license for LiteLLM
LITELLM_TOKEN
Access token for LiteLLM integration
LITELLM_TPM_TOKEN_RESERVATION_ENABLED
When false, the v3 rate limiter skips the upfront TPM token reservation and enforces TPM post-call from actual usage. Default is true
LITELLM_USE_CHAT_COMPLETIONS_URL_FOR_ANTHROPIC_MESSAGES
When set to "true", routes OpenAI /v1/messages requests through chat/completions instead of the Responses API for Anthropic models. Can also be set via litellm_settings.use_chat_completions_url_for_anthropic_messages
LITELLM_ROUTE_ALL_CHAT_OPENAI_TO_RESPONSES
When set to "true", routes all OpenAI /chat/completions requests through the Responses API bridge. Recommended for OpenAI models. Can also be set via litellm_settings.route_all_chat_openai_to_responses
LITELLM_GEMINI_LIVE_DEFER_SETUP
When set to "true", defers Gemini/Vertex Live setup until the client sends session.update (required for runtime tool injection). Default is "false" for backwards compatibility, which auto-sends setup on connect. Can also be set via litellm.gemini_live_defer_setup
LITELLM_USE_LEGACY_INTERACTIONS_SCHEMA
When set to "true", uses the legacy Google Interactions API schema (outputs array, 2026-05-07 revision) instead of the new schema (steps array, 2026-05-20 revision). The legacy schema will be sunset on June 8, 2026. Can also be set via litellm_settings.use_legacy_interactions_schema
LITELLM_USER_AGENT
Custom user agent string for LiteLLM API requests. Used for partner telemetry attribution
LITELLM_WORKER_STARTUP_HOOKS
Comma-separated list of module.path:function_name callables to run in each worker process during startup. Runs early in the worker lifecycle (before config/DB loading). Useful for re-initializing per-process state like gflags. See Worker Startup Hooks for details
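A module.path:function_name entry can be resolved with the standard importlib module. The sketch below shows the general technique only; load_hooks is a hypothetical name and the proxy's own loader may validate or invoke hooks differently.

```python
import importlib
from typing import Callable, List


def load_hooks(raw: str) -> List[Callable[[], object]]:
    """Resolve 'pkg.module:func, other.module:func' into callables."""
    hooks: List[Callable[[], object]] = []
    for spec in raw.split(","):
        spec = spec.strip()
        if not spec:
            continue
        module_path, sep, func_name = spec.partition(":")
        if not sep or not func_name:
            raise ValueError(f"expected module.path:function_name, got {spec!r}")
        module = importlib.import_module(module_path)
        hooks.append(getattr(module, func_name))
    return hooks


# Each worker would then call the hooks once, in order, during startup:
# for hook in load_hooks("my_pkg.startup:init_flags"):  # hypothetical module
#     hook()
```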
LITELLM_PRINT_STANDARD_LOGGING_PAYLOAD
If true, prints the standard logging payload to the console - useful for debugging
LITELLM_ASYNCIO_QUEUE_MAXSIZE
Maximum size for asyncio queues (e.g. log queues, spend update queues, and cookbook examples such as realtime audio in nova_sonic_realtime.py). Bounds in-memory growth to prevent OOM. Default is 1000.
LOGFIRE_TOKEN
Token for Logfire logging service
LOGFIRE_BASE_URL
Base URL for Logfire logging service (useful for self hosted deployments)
LOGGING_WORKER_CONCURRENCY
Maximum number of concurrent coroutine slots for the logging worker on the asyncio event loop. Default is 100. Setting this too high floods the event loop with logging tasks, which increases the overall latency of requests.
LOGGING_WORKER_MAX_QUEUE_SIZE
Maximum size of the logging worker queue. When the queue is full, the worker aggressively clears tasks to make room instead of dropping logs. Default is 50,000
LOGGING_WORKER_MAX_TIME_PER_COROUTINE
Maximum time in seconds allowed for each coroutine in the logging worker before timing out. Default is 20.0
LOGGING_WORKER_CLEAR_PERCENTAGE
Percentage of the queue to extract when clearing. Default is 50%
MAX_BASE64_LENGTH_FOR_LOGGING
Maximum number of base64 characters to keep in logging payloads. Data URIs exceeding this are replaced with a size placeholder. Set to 0 to disable truncation. Default is 64
MAX_COMPETITOR_NAMES
Maximum number of competitor names allowed in policy template enrichment. Default is 100
MAX_EXCEPTION_MESSAGE_LENGTH
Maximum length for exception messages. Default is 2000
MAX_ITERATIONS_TO_CLEAR_QUEUE
Maximum number of iterations to attempt when clearing the logging worker queue during shutdown. Default is 200
MAX_TIME_TO_CLEAR_QUEUE
Maximum time in seconds to spend clearing the logging worker queue during shutdown. Default is 5.0
LOGGING_WORKER_AGGRESSIVE_CLEAR_COOLDOWN_SECONDS
Cooldown time in seconds before allowing another aggressive clear operation when the queue is full. Default is 0.5
MAX_STRING_LENGTH_PROMPT_IN_DB
Maximum length for strings in spend logs when sanitizing request bodies. Strings longer than this will be truncated. Default is 1000
MAX_IN_MEMORY_QUEUE_FLUSH_COUNT
Maximum count for in-memory queue flush operations. Default is 1000
MAX_IMAGE_URL_DOWNLOAD_SIZE_MB
Maximum size in MB for downloading images from URLs. Prevents memory issues from downloading very large images. Images exceeding this limit will be rejected before download. Set to 0 to completely disable image URL handling (all image_url requests will be blocked). Default is 50MB (matching OpenAI's limit)
MAX_LONG_SIDE_FOR_IMAGE_HIGH_RES
Maximum length for the long side of high-resolution images. Default is 2000
MAX_REDIS_BUFFER_DEQUEUE_COUNT
Maximum count for Redis buffer dequeue operations. Default is 100
MAX_SHORT_SIDE_FOR_IMAGE_HIGH_RES
Maximum length for the short side of high-resolution images. Default is 768
MAX_SIZE_IN_MEMORY_QUEUE
Maximum size for in-memory queue. Default is 10000
MAX_SIZE_PER_ITEM_IN_MEMORY_CACHE_IN_KB
Maximum size in KB for each item in memory cache. Default is 512 or 1024
MAX_SPENDLOG_ROWS_TO_QUERY
Maximum number of spend log rows to query. Default is 1,000,000
MAX_TEAM_LIST_LIMIT
Maximum number of teams to list. Default is 20
MAX_TILE_HEIGHT
Maximum height for image tiles. Default is 512
MAX_TILE_WIDTH
Maximum width for image tiles. Default is 512
MAX_TOKEN_TRIMMING_ATTEMPTS
Maximum number of attempts to trim a token message. Default is 10
MAXIMUM_TRACEBACK_LINES_TO_LOG
Maximum number of lines to log in traceback in LiteLLM Logs UI. Default is 100
MAX_RETRY_DELAY
Maximum delay in seconds for retrying requests. Default is 8.0
MAX_LANGFUSE_INITIALIZED_CLIENTS
Maximum number of Langfuse clients to initialize on proxy. Default is 50. This is set since Langfuse initializes 1 thread every time a client is initialized. We've had an incident in the past where we reached 100% CPU utilization because Langfuse was initialized several times.
MAX_MCP_SEMANTIC_FILTER_TOOLS_HEADER_LENGTH
Maximum header length for MCP semantic filter tools. Default is 150
MAX_POLICY_ESTIMATE_IMPACT_ROWS
Maximum number of rows returned when estimating the impact of a policy. Default is 1000
MAX_PAYLOAD_SIZE_FOR_DEBUG_LOG
Maximum payload size in bytes for full DEBUG serialization. Payloads exceeding this will be truncated in logs. Default is 102400 (100 KB)
MIN_NON_ZERO_TEMPERATURE
Minimum non-zero temperature value. Default is 0.0001
MINIMUM_PROMPT_CACHE_TOKEN_COUNT
Minimum token count for caching a prompt. Default is 1024
MISTRAL_API_BASE
Base URL for Mistral API. Default is https://api.mistral.ai
MISTRAL_API_KEY
API key for Mistral API
MICROSOFT_AUTHORIZATION_ENDPOINT
Custom authorization endpoint URL for Microsoft SSO (overrides default Microsoft OAuth authorization endpoint)
MICROSOFT_CLIENT_ID
Client ID for Microsoft services
MICROSOFT_CLIENT_SECRET
Client secret for Microsoft services
MICROSOFT_SERVICE_PRINCIPAL_ID
Service Principal ID for Microsoft Enterprise Application. (This is an advanced feature: set it if you want LiteLLM to auto-assign members to LiteLLM Teams based on their Microsoft Entra ID Groups)
MICROSOFT_TENANT
Tenant ID for Microsoft Azure
MICROSOFT_TOKEN_ENDPOINT
Custom token endpoint URL for Microsoft SSO (overrides default Microsoft OAuth token endpoint)
MICROSOFT_USER_DISPLAY_NAME_ATTRIBUTE
Field name for user display name in Microsoft SSO response. Default is displayName
MICROSOFT_USER_EMAIL_ATTRIBUTE
Field name for user email in Microsoft SSO response. Default is userPrincipalName
MICROSOFT_USER_FIRST_NAME_ATTRIBUTE
Field name for user first name in Microsoft SSO response. Default is givenName
MICROSOFT_USER_ID_ATTRIBUTE
Field name for user ID in Microsoft SSO response. Default is id
MICROSOFT_USER_LAST_NAME_ATTRIBUTE
Field name for user last name in Microsoft SSO response. Default is surname
MICROSOFT_USERINFO_ENDPOINT
Custom userinfo endpoint URL for Microsoft SSO (overrides default Microsoft Graph userinfo endpoint)
MODEL_COST_MAP_MAX_SHRINK_RATIO
Maximum allowed shrinkage ratio when validating a fetched model cost map against the local backup. Rejects the fetched map if it is smaller than this fraction of the backup. Default is 0.5
MODEL_COST_MAP_MIN_MODEL_COUNT
Minimum number of models a fetched cost map must contain to be considered valid. Default is 50
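The two MODEL_COST_MAP_* checks can be read as a simple sanity gate on a freshly fetched map. The sketch below is an assumption about how they combine: it measures size by number of entries, and is_valid_cost_map is a hypothetical name.

```python
MODEL_COST_MAP_MIN_MODEL_COUNT = 50     # documented default
MODEL_COST_MAP_MAX_SHRINK_RATIO = 0.5   # documented default


def is_valid_cost_map(fetched: dict, backup: dict,
                      min_model_count: int = MODEL_COST_MAP_MIN_MODEL_COUNT,
                      max_shrink_ratio: float = MODEL_COST_MAP_MAX_SHRINK_RATIO) -> bool:
    """Reject a fetched map that is too small in absolute or relative terms."""
    if len(fetched) < min_model_count:
        return False  # too few models to be plausible
    if len(fetched) < len(backup) * max_shrink_ratio:
        return False  # shrank too much compared with the local backup
    return True
```

A rejected map would leave the local backup in use.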
NEW_RELIC_APP_NAME
Application name for New Relic AI Monitoring integration
NEW_RELIC_LICENSE_KEY
License key for New Relic authentication
NO_DOCS
Flag to disable Swagger UI documentation
NO_OPENAPI
Flag to disable the /openapi.json endpoint
NO_REDOC
Flag to disable Redoc documentation
NO_PROXY
List of addresses to bypass proxy
NON_LLM_CONNECTION_TIMEOUT
Timeout in seconds for non-LLM service connections. Default is 15
OAUTH_TOKEN_INFO_ENDPOINT
Endpoint for OAuth token info retrieval
OPENAI_BASE_URL
Base URL for OpenAI API
OPENAI_API_BASE
Base URL for OpenAI API. Default is https://api.openai.com/
OPENAI_API_KEY
API key for OpenAI services
OPENAI_CHATGPT_API_BASE
Alternative to CHATGPT_API_BASE. Base URL for ChatGPT API
OPENAI_FILE_SEARCH_COST_PER_1K_CALLS
Cost per 1000 calls for OpenAI file search. Default is 0.0025
OPENAI_ORGANIZATION
Organization identifier for OpenAI
OPENAPI_URL
The path to the OpenAPI JSON endpoint. By default this is "/openapi.json"
OPENID_BASE_URL
Base URL for OpenID Connect services
OPENID_CLIENT_ID
Client ID for OpenID Connect authentication
OPENID_CLIENT_SECRET
Client secret for OpenID Connect authentication
OPENMETER_API_ENDPOINT
API endpoint for OpenMeter integration
OPENMETER_API_KEY
API key for OpenMeter services
OPENMETER_EVENT_TYPE
Type of events sent to OpenMeter
OPENMETER_TRUST_REQUEST_USER
If false, ignore the request body user and resolve the OpenMeter subject from the authenticated key's user_id. Defaults to true
ONYX_API_BASE
Base URL for Onyx Security AI Guard service (defaults to https://ai-guard.onyx.security)
ONYX_API_KEY
API key for Onyx Security AI Guard service
ONYX_TIMEOUT
Timeout in seconds for Onyx Guard server requests. Default is 10
OTEL_ENDPOINT
OpenTelemetry endpoint for traces
OTEL_EXPORTER_OTLP_ENDPOINT
OpenTelemetry endpoint for traces
OTEL_ENVIRONMENT_NAME
Environment name for OpenTelemetry
OTEL_EXPORTER
Exporter type for OpenTelemetry
OTEL_EXPORTER_OTLP_PROTOCOL
Exporter type for OpenTelemetry
OTEL_HEADERS
Headers for OpenTelemetry requests
OTEL_MODEL_ID
Model ID for OpenTelemetry tracing
OTEL_EXPORTER_OTLP_HEADERS
Headers for OpenTelemetry requests
OTEL_SERVICE_NAME
Service name identifier for OpenTelemetry
OTEL_TRACER_NAME
Tracer name for OpenTelemetry tracing
OTEL_LOGS_EXPORTER
Exporter type for OpenTelemetry logs (e.g., console)
OTEL_IGNORE_CONTEXT_PROPAGATION
When true, ignore parent span context propagation (inbound traceparent headers and any active span) so every LiteLLM trace is its own root. Default false
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT
Controls whether prompts and completions are captured in OpenTelemetry traces. Accepts NO_CONTENT (default per spec), SPAN_ONLY, EVENT_ONLY, SPAN_AND_EVENT, or the boolean form (true maps to EVENT_ONLY, false to NO_CONTENT)
OTEL_SEMCONV_STABILITY_OPT_IN
Set to gen_ai_latest_experimental to emit spans following the latest OpenTelemetry GenAI semantic conventions. Renames the LLM-call span to {operation} {model}, suppresses raw_gen_ai_request, adds gen_ai.provider.name, and consolidates events. Comma-separable per OTEL spec
USE_OTEL_LITELLM_REQUEST_SPAN
When true, the proxy emits a discrete litellm_request span per LLM call as a child of the Received Proxy Server Request span. Default false (since v1.81.0); LLM-call attributes are set directly on the proxy root span. See "Why don't I see a litellm_request span?"
OTEL_DEBUG
When true, prints exporter and span-creation diagnostics to stderr. Useful when traces aren't reaching your backend. Default false
DEBUG_OTEL
Alias for OTEL_DEBUG
PAGERDUTY_API_KEY
API key for PagerDuty Alerting
PANW_PRISMA_AIRS_API_KEY
API key for PANW Prisma AIRS service
PANW_PRISMA_AIRS_API_BASE
Base URL for PANW Prisma AIRS service
PHOENIX_API_KEY
API key for Arize Phoenix
PHOENIX_COLLECTOR_ENDPOINT
API endpoint for Arize Phoenix
PHOENIX_COLLECTOR_HTTP_ENDPOINT
HTTP API endpoint for Arize Phoenix
PILLAR_API_BASE
Base URL for Pillar API Guardrails
PILLAR_API_KEY
API key for Pillar API Guardrails
PILLAR_ON_FLAGGED_ACTION
Action to take when content is flagged ('block' or 'monitor')
PKCE_STRICT_CACHE_MISS
When set to true, the SSO callback will return a 401 error if the PKCE code_verifier is not found in the cache (e.g. due to a cache miss across pods). When false (default), it logs a warning and continues without the code_verifier.
POD_NAME
Pod name for the server. Emitted to Datadog logs as POD_NAME
POSTHOG_API_KEY
API key for PostHog analytics integration
POSTHOG_API_URL
Base URL for PostHog API (defaults to https://us.i.posthog.com)
POSTHOG_MOCK
Enable mock mode for PostHog integration testing. When set to true, intercepts PostHog API calls and returns mock responses without making actual network calls. Default is false
POSTHOG_MOCK_LATENCY_MS
Mock latency in milliseconds for PostHog API calls when mock mode is enabled. Simulates network round-trip time. Default is 100ms
PRISMA_AUTH_RECONNECT_LOCK_TIMEOUT_SECONDS
Lock timeout in seconds for Prisma auth reconnection. Default is 0.1
PRISMA_AUTH_RECONNECT_TIMEOUT_SECONDS
Timeout in seconds for Prisma auth reconnection attempts. Default is 2.0
PRISMA_HEALTH_WATCHDOG_ENABLED
Enable the Prisma DB health watchdog that monitors and reconnects on connection loss. Default is true
PRISMA_HEALTH_WATCHDOG_INTERVAL_SECONDS
Interval in seconds for Prisma health watchdog probes. Default is 30
PRISMA_HEALTH_WATCHDOG_PROBE_TIMEOUT_SECONDS
Timeout in seconds for each Prisma health probe. Default is 5.0
PRISMA_RECONNECT_COOLDOWN_SECONDS
Cooldown in seconds between Prisma reconnection attempts. Default is 15
PRISMA_RECONNECT_ESCALATION_THRESHOLD
Number of consecutive reconnect failures before escalating the reconnection strategy. Default is 3
PRISMA_WATCHDOG_RECONNECT_TIMEOUT_SECONDS
Timeout in seconds for Prisma watchdog-initiated reconnection. Default is 30.0
PREDIBASE_API_BASE
Base URL for Predibase API
PRESIDIO_ANALYZER_API_BASE
Base URL for Presidio Analyzer service
PRESIDIO_ANONYMIZER_API_BASE
Base URL for Presidio Anonymizer service
PROMETHEUS_BUDGET_METRICS_REFRESH_INTERVAL_MINUTES
Refresh interval in minutes for Prometheus budget metrics. Default is 5
PROMETHEUS_FALLBACK_STATS_SEND_TIME_HOURS
Fallback time in hours for sending stats to Prometheus. Default is 9
PROMETHEUS_URL
URL for Prometheus service
PROMPTLAYER_API_KEY
API key for PromptLayer integration
PROXY_ADMIN_ID
Admin identifier for proxy server
PROXY_BASE_URL
Base URL for proxy service. Also used by the MCP OAuth authorize endpoint as the proxy's public origin when validating browser-supplied redirect_uri values — set this to the exact origin users see in their address bar (e.g. https://llm.example.com) when LiteLLM runs behind a TLS-terminating ingress. Full origin only: scheme + host (+ port if non-default), no trailing slash, no path. When set, it takes precedence over X-Forwarded-* headers (which only apply when use_x_forwarded_for is true AND the request peer is in mcp_trusted_proxy_ranges). See MCP OAuth — Reverse proxy and ingress configuration.
PROXY_BATCH_WRITE_AT
Time in seconds to wait before batch writing spend logs to the database. Default is 10
PROXY_BATCH_POLLING_INTERVAL
Time in seconds to wait before polling a batch to check whether it has completed. Default is 3600 (1 hour)
PROXY_BATCH_POLLING_ENABLED
Set to false to disable the CheckBatchCost and CheckResponsesCost background polling jobs entirely. Useful for emergency mitigation on installs with large numbers of stale managed objects. Default is true
MAX_OBJECTS_PER_POLL_CYCLE
Maximum number of managed objects (batches / responses) fetched per polling cycle. Prevents OOM on installs with many stale rows. Default is 50
MANAGED_OBJECT_STALENESS_CUTOFF_DAYS
Managed objects older than this many days in a non-terminal state are marked stale_expired at the start of each poll cycle and skipped. Default is 7
PROXY_BUDGET_RESCHEDULER_MAX_TIME
Maximum time in seconds to wait before checking database for budget resets. Default is 605
PROXY_BUDGET_RESCHEDULER_MIN_TIME
Minimum time in seconds to wait before checking database for budget resets. Default is 597
PYTHON_GC_THRESHOLD
GC thresholds ('gen0,gen1,gen2', e.g. '1000,50,50'); defaults to Python’s values.
PROXY_LOGOUT_URL
URL for logging out of the proxy service
QDRANT_API_BASE
Base URL for Qdrant API
QDRANT_API_KEY
API key for Qdrant service
QDRANT_SCALAR_QUANTILE
Scalar quantile for Qdrant operations. Default is 0.99
QDRANT_URL
Connection URL for Qdrant database
QDRANT_VECTOR_SIZE
Vector size for Qdrant operations. Default is 1536
REDIS_CONNECTION_POOL_TIMEOUT
Timeout in seconds for Redis connection pool. Default is 5
REDIS_CIRCUIT_BREAKER_ENABLED
When false, the Redis circuit breaker is disabled and never opens. Default is true
REDIS_CIRCUIT_BREAKER_FAILURE_THRESHOLD
Number of consecutive failures before the Redis circuit breaker opens. Default is 5
REDIS_CIRCUIT_BREAKER_RECOVERY_TIMEOUT
Time in seconds before the Redis circuit breaker attempts recovery after opening. Default is 60
REDIS_CLUSTER_NODES
JSON-formatted list of Redis cluster startup nodes for Redis Cluster mode. Example: [{"host": "node1", "port": 6379}]
REDIS_HOST
Hostname for Redis server
REDIS_PASSWORD
Password for Redis service
REDIS_PORT
Port number for Redis server
REDIS_SOCKET_TIMEOUT
Timeout in seconds for Redis socket operations. Default is 0.1
REDIS_GCP_SERVICE_ACCOUNT
GCP service account for IAM authentication with Redis. Format: "projects/-/serviceAccounts/name@project.iam.gserviceaccount.com"
REDIS_GCP_SSL_CA_CERTS
Path to SSL CA certificate file for secure GCP Memorystore Redis connections
REDOC_URL
The path to the Redoc FastAPI documentation. By default this is "/redoc"
REPEATED_STREAMING_CHUNK_LIMIT
Limit for repeated streaming chunks to detect looping. Default is 100
REALTIME_WEBSOCKET_MAX_MESSAGE_SIZE_BYTES
Maximum size in bytes for WebSocket messages in realtime connections. Default is None.
REPLICATE_MODEL_NAME_WITH_ID_LENGTH
Length of Replicate model names with ID. Default is 64
REPLICATE_POLLING_DELAY_SECONDS
Delay in seconds for Replicate polling operations. Default is 0.5
REQUEST_TIMEOUT
Timeout in seconds for requests. Default is 6000
ROOT_REDIRECT_URL
URL to redirect root path (/) to when DOCS_URL is set to something other than "/" (DOCS_URL is "/" by default)
ROUTER_MAX_FALLBACKS
Maximum number of fallbacks for router. Default is 5
RUBRIK_API_KEY
Bearer token for authenticating with the Rubrik webhook service
RUBRIK_BATCH_SIZE
Number of log entries to buffer before flushing to Rubrik. Default is 512
RUBRIK_SAMPLING_RATE
Fraction of requests to log to Rubrik (0.0 to 1.0). Default is 1.0
RUBRIK_WEBHOOK_URL
Base URL of the Rubrik webhook service for tool blocking and batch logging
RUNWAYML_DEFAULT_API_VERSION
Default API version for RunwayML service. Default is "2024-11-06"
RUNWAYML_POLLING_TIMEOUT
Timeout in seconds for RunwayML image generation polling. Default is 600 (10 minutes)
S3_VECTORS_DEFAULT_DIMENSION
Default vector dimension for S3 Vectors RAG ingestion. Default is 1024
S3_VECTORS_DEFAULT_DISTANCE_METRIC
Default distance metric for S3 Vectors RAG ingestion. Options: "cosine", "euclidean". Default is "cosine"
SECRET_MANAGER_REFRESH_INTERVAL
Refresh interval in seconds for secret manager. Default is 86400 (24 hours)
SERVER_ROOT_PATH
Root path for the server application
SEND_USER_API_KEY_ALIAS
Flag to send user API key alias to Zscaler AI Guard. Default is False
SEND_USER_API_KEY_TEAM_ID
Flag to send user API key team ID to Zscaler AI Guard. Default is False
SEND_USER_API_KEY_USER_ID
Flag to send user API key user ID to Zscaler AI Guard. Default is False
SET_VERBOSE
[DEPRECATED] Use LITELLM_LOG instead with values "INFO", "DEBUG", or "ERROR". See debugging docs
SINGLE_DEPLOYMENT_TRAFFIC_FAILURE_THRESHOLD
Minimum number of requests to consider "reasonable traffic" for single-deployment cooldown logic. Default is 1000
SLACK_DAILY_REPORT_FREQUENCY
Frequency of daily Slack reports (e.g., daily, weekly)
SLACK_WEBHOOK_URL
Webhook URL for Slack integration
SMTP_HOST
Hostname for the SMTP server
SMTP_PASSWORD
Password for SMTP authentication (do not set if SMTP does not require auth)
SMTP_PORT
Port number for SMTP server
SMTP_SENDER_EMAIL
Email address used as the sender in SMTP transactions
SMTP_SENDER_LOGO
Logo used in emails sent via SMTP
SMTP_TLS
Flag to enable or disable TLS for SMTP connections
SMTP_USE_SSL
Set to "True" to force implicit SSL (SMTP_SSL) on any port. Not needed for port 465, which uses implicit SSL automatically; other ports use STARTTLS by default (see SMTP_TLS)
SMTP_USERNAME
Username for SMTP authentication (do not set if SMTP does not require auth)
SENDGRID_API_KEY
API key for SendGrid email service
RESEND_API_KEY
API key for Resend email service
SENDGRID_SENDER_EMAIL
Email address used as the sender in SendGrid email transactions
SPEND_LOGS_URL
URL for retrieving spend logs
SPEND_LOG_CLEANUP_BATCH_SIZE
Number of logs deleted per batch during cleanup. Default is 1000
STALE_OBJECT_CLEANUP_BATCH_SIZE
Max number of stale managed objects updated per cleanup cycle. Default is 1000
SSL_CERTIFICATE
Path to the SSL certificate file
SSL_ECDH_CURVE
ECDH curve for SSL/TLS key exchange (e.g., 'X25519' to disable PQC).
SSL_SECURITY_LEVEL
[BETA] Security level for SSL/TLS connections. E.g. DEFAULT@SECLEVEL=1
SSL_VERIFY
Flag to enable or disable SSL certificate verification
SSL_CERT_FILE
Path to the SSL certificate file for custom CA bundle
SUPABASE_KEY
API key for Supabase service
SUPABASE_URL
Base URL for Supabase instance
STORE_MODEL_IN_DB
If true, enables storing model + credential information in the DB.
SYSTEM_MESSAGE_TOKEN_COUNT
Token count for system messages. Default is 4
TEST_EMAIL_ADDRESS
Email address used for testing purposes
TOGETHER_AI_4_B
Size parameter for Together AI 4B model. Default is 4
TOGETHER_AI_8_B
Size parameter for Together AI 8B model. Default is 8
TOGETHER_AI_21_B
Size parameter for Together AI 21B model. Default is 21
TOGETHER_AI_41_B
Size parameter for Together AI 41B model. Default is 41
TOGETHER_AI_80_B
Size parameter for Together AI 80B model. Default is 80
TOGETHER_AI_110_B
Size parameter for Together AI 110B model. Default is 110
TOGETHER_AI_EMBEDDING_150_M
Size parameter for Together AI 150M embedding model. Default is 150
TOGETHER_AI_EMBEDDING_350_M
Size parameter for Together AI 350M embedding model. Default is 350
TOOL_CHOICE_OBJECT_TOKEN_COUNT
Token count for tool choice objects. Default is 4
TOOL_POLICY_CACHE_TTL_SECONDS
TTL in seconds for caching tool policy guardrail results. Default is 60
UI_LOGO_PATH
Path to the logo image used in the UI
UI_PASSWORD
Password for accessing the UI
UI_USERNAME
Username for accessing the UI
UPSTREAM_LANGFUSE_DEBUG
Flag to enable debugging for upstream Langfuse
UPSTREAM_LANGFUSE_HOST
Host URL for upstream Langfuse service
UPSTREAM_LANGFUSE_PUBLIC_KEY
Public key for upstream Langfuse authentication
UPSTREAM_LANGFUSE_RELEASE
Release version identifier for upstream Langfuse
UPSTREAM_LANGFUSE_SECRET_KEY
Secret key for upstream Langfuse authentication
USE_AWS_KMS
Flag to enable AWS Key Management Service for encryption
USE_PRISMA_MIGRATE
Flag to use prisma migrate instead of prisma db push. Recommended for production environments.
VANTAGE_API_KEY
API key for Vantage cost-import integration
VANTAGE_BASE_URL
Base URL for Vantage API. Default is https://api.vantage.sh
VANTAGE_EXPORT_FREQUENCY
Export frequency for Vantage — hourly (default), daily, or interval
VANTAGE_EXPORT_INTERVAL_SECONDS
Interval in seconds when VANTAGE_EXPORT_FREQUENCY is interval
VANTAGE_INTEGRATION_TOKEN
Vantage integration token for the cost-import endpoint
WANDB_API_KEY
API key for Weights & Biases (W&B) logging integration
WANDB_HOST
Host URL for Weights & Biases (W&B) service
WANDB_PROJECT_ID
Project ID for Weights & Biases (W&B) logging integration
WEBHOOK_URL
URL for receiving webhooks from external services
SPEND_LOG_RUN_LOOPS
Number of loops the spend_log_cleanup task runs per invocation, each loop deleting one batch of logs (1000 per batch by default)
SPEND_LOG_PARTITION_INTERVAL
Granularity of LiteLLM_SpendLogs partitions when the table is partitioned: day, week, or month. Default is day
SPEND_LOG_PARTITION_PRECREATE_AHEAD
Number of future spend-log partitions to pre-create on each cleanup run. Default is 7
SPEND_LOG_QUEUE_POLL_INTERVAL
Polling interval in seconds for spend log queue. Default is 2.0
SPEND_LOG_QUEUE_SIZE_THRESHOLD
Threshold for spend log queue size before processing. Default is 100
SPEND_LOG_CLEANUP_MAX_CONSECUTIVE_BATCH_FAILURES
Number of consecutive batch failures tolerated before the spend log cleanup run aborts. Default is 3
SPEND_LOG_CLEANUP_BATCH_FAILURE_BACKOFF_SECONDS
Backoff in seconds between failed spend log cleanup batches. Default is 0.5
SPEND_COUNTER_RESEED_LOCKS_MAX_SIZE
Max size of the per-counter LRU lock dict used to coalesce concurrent spend-counter reseeds from the DB on the enforcement path. Default is 10000.
COROUTINE_CHECKER_MAX_SIZE_IN_MEMORY
Maximum size for CoroutineChecker in-memory cache. Default is 1000
DEFAULT_SHARED_HEALTH_CHECK_TTL
Time-to-live in seconds for cached health check results in shared health check mode. Default is 300 (5 minutes)
DEFAULT_SHARED_HEALTH_CHECK_LOCK_TTL
Time-to-live in seconds for health check lock in shared health check mode. Default is 60 (1 minute)
ZSCALER_AI_GUARD_API_KEY
API key for Zscaler AI Guard service
ZSCALER_AI_GUARD_POLICY_ID
Policy ID for Zscaler AI Guard guardrails
ZSCALER_AI_GUARD_URL
Base URL for Zscaler AI Guard API. Default is https://api.us1.zseclipse.net/v1/detection/execute-policy
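Several variables above expect structured values rather than plain strings: REDIS_CLUSTER_NODES takes a JSON list of host/port objects, PYTHON_GC_THRESHOLD takes three comma-separated integers, and PROXY_BASE_URL must be a bare origin with no path or trailing slash. The following is a minimal sketch of how such values could be checked before starting the proxy. The helper names (`parse_redis_cluster_nodes`, `parse_gc_threshold`, `is_bare_origin`) are hypothetical and are not part of LiteLLM; LiteLLM's own parsing may differ.

```python
import json
from urllib.parse import urlsplit


def parse_redis_cluster_nodes(raw: str) -> list:
    """Parse a REDIS_CLUSTER_NODES-style value: a JSON list of {"host", "port"} objects."""
    nodes = json.loads(raw)
    if not isinstance(nodes, list):
        raise ValueError("expected a JSON list of nodes")
    for node in nodes:
        if not isinstance(node, dict) or "host" not in node or "port" not in node:
            raise ValueError(f"each node needs 'host' and 'port': {node!r}")
    return nodes


def parse_gc_threshold(raw: str) -> tuple:
    """Parse a PYTHON_GC_THRESHOLD-style value: 'gen0,gen1,gen2'."""
    parts = [int(p.strip()) for p in raw.split(",")]
    if len(parts) != 3:
        raise ValueError("expected exactly three comma-separated integers")
    return (parts[0], parts[1], parts[2])


def is_bare_origin(url: str) -> bool:
    """Check a PROXY_BASE_URL-style value: scheme + host (+ port), no path, no trailing slash."""
    parts = urlsplit(url)
    return (
        parts.scheme in ("http", "https")
        and bool(parts.hostname)
        and parts.path == ""      # rejects both "/" and "/some/path"
        and parts.query == ""
        and parts.fragment == ""
    )
```

With these helpers, `parse_redis_cluster_nodes('[{"host": "node1", "port": 6379}]')` yields a one-element list, `parse_gc_threshold("1000,50,50")` yields `(1000, 50, 50)`, and `is_bare_origin("https://llm.example.com/")` is false because of the trailing slash.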