Structured Outputs Guide - Perplexity

Overview

Structured outputs enable you to enforce specific response formats from Perplexity’s models, ensuring consistent, machine-readable data that can be directly integrated into your applications without manual parsing. We currently support two types of structured outputs: JSON Schema and Regex. LLM responses will work to match the specified format, except for the following cases:

To enable structured outputs, add a response_format field to the request:

JSON Schema

Regex (only available for sonar right now)

Examples

1. Financial Analysis with JSON Schema

Best Practices

Generating responses in a JSON Format

For Python users, we recommend using the Pydantic library to generate JSON schema.

Unsupported JSON Schemas

Recursive JSON schema is not supported. As a result, unconstrained objects are not supported either. Here are a few examples of unsupported schemas:

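# UNSUPPORTED!

from typing import Any

from pydantic import BaseModel

# Unconstrained object: values may be of any type, so the shape is unbounded.
class UnconstrainedDict(BaseModel):
    unconstrained: dict[str, Any]

# Recursive schema: the model refers to itself.
class RecursiveJson(BaseModel):
    value: str
    child: list["RecursiveJson"]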
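By contrast, a model with a fixed, non-recursive set of typed fields avoids both problems. The following is a minimal sketch, assuming Pydantic v2 (which provides `model_json_schema`). The `CountryInfo` model and the `build_response_format` helper are hypothetical names chosen for illustration; the wrapper mirrors the `response_format` layout shown in this guide.

```python
from pydantic import BaseModel


class CountryInfo(BaseModel):
    # Hypothetical model: flat, fully typed, and non-recursive.
    country: str
    capital: str
    population: int
    official_language: str


def build_response_format(model: type[BaseModel]) -> dict:
    """Wrap a Pydantic-generated JSON schema in a response_format dict."""
    return {
        "type": "json_schema",
        "json_schema": {"schema": model.model_json_schema()},
    }


response_format = build_response_format(CountryInfo)
# List the field names that ended up in the generated schema.
print(sorted(response_format["json_schema"]["schema"]["properties"]))
```

The resulting dictionary can be passed as the `response_format` field of a request body. Generating the schema from the model keeps the schema and the parsing code in sync, since the same class can later validate the returned JSON.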

Generating responses using a regex

Supported Regex

Unsupported Regex

Perplexity’s JSON Schema Implementation

Perplexity’s structured outputs implementation has several key differences compared to other providers:

Simplified Schema Definition

Other Providers:

{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "financial_data",
      "strict": true,
      "schema": { /* your schema */ }
    }
  }
}

Perplexity:

{
  "response_format": {
    "type": "json_schema", 
    "json_schema": {
      "schema": { /* your schema */ }
    }
  }
}
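If you already have a `response_format` written in the "Other Providers" layout, one way to adapt it is to keep only the `schema` field. The sketch below is illustrative rather than an official migration tool: `to_perplexity_response_format` is a hypothetical helper, and the `ticker` schema is made-up sample data. The guide only shows that `name` and `strict` are omitted in the Perplexity layout; it does not say how the API treats them if they are sent.

```python
import copy


def to_perplexity_response_format(other: dict) -> dict:
    """Keep only the fields that appear in the Perplexity layout.

    Assumes `other` follows the "Other Providers" layout: a json_schema
    object carrying "name", "strict", and "schema".
    """
    if other.get("type") != "json_schema":
        raise ValueError("expected a response_format of type 'json_schema'")
    schema = other["json_schema"]["schema"]
    # Deep-copy so later edits to the result do not mutate the input.
    return {
        "type": "json_schema",
        "json_schema": {"schema": copy.deepcopy(schema)},
    }


# Hypothetical input in the "Other Providers" layout.
other_provider_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "financial_data",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"],
        },
    },
}

print(to_perplexity_response_format(other_provider_format))
```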

Enhanced Error Handling

Schema Compatibility

While Perplexity supports standard JSON Schema syntax, some advanced features may not be available:

This approach prioritizes reliability and performance while maintaining compatibility with most common JSON Schema use cases.

Structured Outputs for Reasoning Models

When using structured outputs with reasoning models like sonar-reasoning-pro, the response will include a <think> section containing reasoning tokens, immediately followed by the structured output. The response_format parameter does not remove these reasoning tokens from the output, so the final response will need to be parsed manually.

Sample Response:

<think>
I need to provide information about France in a structured JSON format with specific fields: country, capital, population, official_language.

For France:
- Country: France
- Capital: Paris
- Population: About 67 million (as of 2023)
- Official Language: French

Let me format this information as required.
</think>
{"country":"France","capital":"Paris","population":67750000,"official_language":"French"}

For a reusable implementation to extract JSON from reasoning model outputs, see our example utility on GitHub.
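Independently of that utility, the manual parsing step can be sketched in a few lines. This is a minimal illustration, not the GitHub implementation: `extract_json_after_think` is a hypothetical helper, and it assumes the structured output is everything after the last closing `</think>` tag, as in the sample response.

```python
import json


def extract_json_after_think(content: str) -> dict:
    """Parse the JSON that follows the closing </think> tag."""
    marker = "</think>"
    idx = content.rfind(marker)
    # If no reasoning section is present, treat the whole string as JSON.
    payload = content if idx == -1 else content[idx + len(marker):]
    return json.loads(payload.strip())


# Shortened version of the sample response shown in this section.
sample = (
    "<think>\nLet me format this information as required.\n</think>\n"
    '{"country":"France","capital":"Paris",'
    '"population":67750000,"official_language":"French"}'
)

data = extract_json_after_think(sample)
print(data["capital"])  # Paris
```

`json.loads` raises `json.JSONDecodeError` if the text after the tag is not valid JSON, so callers may want to catch that exception and retry or log the raw content.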