GLM-4.6 - Overview - Z.AI DEVELOPER DOCUMENT (original) (raw)

GLM-4.6 achieves comprehensive enhancements across multiple domains, including real-world coding, long-context processing, reasoning, searching, writing, and agentic applications. Details are as follows:

Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks.
Superior coding performance: The model achieves higher scores on code benchmarks and demonstrates better real-world performance in applications such as Claude Code、Cline、Roo Code and Kilo Code, including improvements in generating visually polished front-end pages.
Advanced reasoning: GLM-4.6 shows a clear improvement in reasoning performance and supports tool use during inference, leading to stronger overall capability.
More capable agents: GLM-4.6 exhibits stronger performance in tool use and search-based agents, and integrates more effectively within agent frameworks.
Refined writing: Better aligns with human preferences in style and readability, and performs more naturally in role-playing scenarios.

Introducing GLM-4.6

1. Comprehensive Evaluation

In evaluations across 8 authoritative benchmarks for general model capabilities—including AIME 25, GPQA, LCB v6, HLE, and SWE-Bench Verified—GLM-4.6 achieves performance on par with Claude Sonnet 4/Claude Sonnet 4.6 on several leaderboards, solidifying its position as the top model developed in China. Description

2. Real-World Coding Evaluation

To better test the model’s capabilities in practical coding tasks, we conducted 74 real-world coding tests within the Claude Code environment. The results show that GLM-4.6 surpasses Claude Sonnet 4 and other domestic models in these real-world tests. Description In terms of average token consumption, GLM-4.6 is over 30% more efficient than GLM-4.5, achieving the lowest consumption rate among comparable models. To ensure transparency and credibility, Z.ai has publicly released all test questions and agent trajectories for verification and reproduction. (Link: https://huggingface.co/datasets/zai-org/CC-Bench-trajectories).

Usage

Resources

API Documentation: Learn how to call the API.

Quick Start

The following is a full sample code to help you onboard GLM-4.6 with ease.

cURL
Official Python SDK
Official Java SDK
OpenAI Python SDK

Basic Call

curl -X POST "https://api.z.ai/api/paas/v4/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "model": "glm-4.6",
    "messages": [
      {
        "role": "user",
        "content": "As a marketing expert, please create an attractive slogan for my product."
      },
      {
        "role": "assistant",
        "content": "Sure, to craft a compelling slogan, please tell me more about your product."
      },
      {
        "role": "user",
        "content": "Z.AI Open Platform"
      }
    ],
    "thinking": {
      "type": "enabled"
    },
    "max_tokens": 4096,
    "temperature": 1.0
  }'

Streaming Call

curl -X POST "https://api.z.ai/api/paas/v4/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "model": "glm-4.6",
    "messages": [
      {
        "role": "user",
        "content": "As a marketing expert, please create an attractive slogan for my product."
      },
      {
        "role": "assistant",
        "content": "Sure, to craft a compelling slogan, please tell me more about your product."
      },
      {
        "role": "user",
        "content": "Z.AI Open Platform"
      }
    ],
    "thinking": {
      "type": "enabled"
    },
    "stream": true,
    "max_tokens": 4096,
    "temperature": 1.0
  }'

Install SDK

# Install latest version
pip install zai-sdk

# Or specify version
pip install zai-sdk==0.2.3

Verify Installation

import zai
print(zai.__version__)

Basic Call

from zai import ZaiClient

client = ZaiClient(api_key="your-api-key")  # Your API Key

response = client.chat.completions.create(
    model="glm-4.6",
    messages=[
        {"role": "user", "content": "As a marketing expert, please create an attractive slogan for my product."},
        {"role": "assistant", "content": "Sure, to craft a compelling slogan, please tell me more about your product."},
        {"role": "user", "content": "Z.AI Open Platform"}
    ],
    thinking={
        "type": "enabled",
    },
    max_tokens=4096,
    temperature=1.0
)

# Get complete response
print(response.choices[0].message)

Streaming Call

from zai import ZaiClient

client = ZaiClient(api_key="your-api-key")  # Your API Key

response = client.chat.completions.create(
    model="glm-4.6",
    messages=[
        {"role": "user", "content": "As a marketing expert, please create an attractive slogan for my product."},
        {"role": "assistant", "content": "Sure, to craft a compelling slogan, please tell me more about your product."},
        {"role": "user", "content": "Z.AI Open Platform"}
    ],
    thinking={
        "type": "enabled",    # Optional: "disabled" or "enabled", default is "enabled"
    },
    stream=True,
    max_tokens=4096,
    temperature=0.6
)

# Stream response
for chunk in response:
    if chunk.choices[0].delta.reasoning_content:
        print(chunk.choices[0].delta.reasoning_content, end='', flush=True)

    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end='', flush=True)

Install SDK Maven

<dependency>
    <groupId>ai.z.openapi</groupId>
    <artifactId>zai-sdk</artifactId>
    <version>0.3.5</version>
</dependency>

Gradle (Groovy)

implementation 'ai.z.openapi:zai-sdk:0.3.5'

Basic Call

import ai.z.openapi.ZaiClient;
import ai.z.openapi.service.model.ChatCompletionCreateParams;
import ai.z.openapi.service.model.ChatCompletionResponse;
import ai.z.openapi.service.model.ChatMessage;
import ai.z.openapi.service.model.ChatMessageRole;
import ai.z.openapi.service.model.ChatThinking;
import java.util.Arrays;

public class BasicChat {
    public static void main(String[] args) {
        // Initialize client
        ZaiClient client = ZaiClient.builder().ofZAI()
            .apiKey("your-api-key")
            .build();

        // Create chat completion request
        ChatCompletionCreateParams request = ChatCompletionCreateParams.builder()
            .model("glm-4.6")
            .messages(Arrays.asList(
                ChatMessage.builder()
                    .role(ChatMessageRole.USER.value())
                    .content("As a marketing expert, please create an attractive slogan for my product.")
                    .build(),
                ChatMessage.builder()
                    .role(ChatMessageRole.ASSISTANT.value())
                    .content("Sure, to craft a compelling slogan, please tell me more about your product.")
                    .build(),
                ChatMessage.builder()
                    .role(ChatMessageRole.USER.value())
                    .content("Z.AI Open Platform")
                    .build()
            ))
            .thinking(ChatThinking.builder().type("enabled").build())
            .maxTokens(4096)
            .temperature(1.0f)
            .build();

        // Send request
        ChatCompletionResponse response = client.chat().createChatCompletion(request);

        // Get response
        if (response.isSuccess()) {
            Object reply = response.getData().getChoices().get(0).getMessage();
            System.out.println("AI Response: " + reply);
        } else {
            System.err.println("Error: " + response.getMsg());
        }
    }
}

Streaming Call

import ai.z.openapi.ZaiClient;
import ai.z.openapi.service.model.ChatCompletionCreateParams;
import ai.z.openapi.service.model.ChatCompletionResponse;
import ai.z.openapi.service.model.ChatMessage;
import ai.z.openapi.service.model.ChatMessageRole;
import ai.z.openapi.service.model.ChatThinking;
import ai.z.openapi.service.model.Delta;
import java.util.Arrays;

public class StreamingChat {
    public static void main(String[] args) {
        // Initialize client
        ZaiClient client = ZaiClient.builder().ofZAI()
            .apiKey("your-api-key")
            .build();

        // Create streaming chat completion request
        ChatCompletionCreateParams request = ChatCompletionCreateParams.builder()
            .model("glm-4.6")
            .messages(Arrays.asList(
                ChatMessage.builder()
                    .role(ChatMessageRole.USER.value())
                    .content("As a marketing expert, please create an attractive slogan for my product.")
                    .build(),
                ChatMessage.builder()
                    .role(ChatMessageRole.ASSISTANT.value())
                    .content("Sure, to craft a compelling slogan, please tell me more about your product.")
                    .build(),
                ChatMessage.builder()
                    .role(ChatMessageRole.USER.value())
                    .content("Z.AI Open Platform")
                    .build()
            ))
            .thinking(ChatThinking.builder().type("enabled").build())
            .stream(true)  // Enable streaming output
            .maxTokens(4096)
            .temperature(1.0f)
            .build();

        ChatCompletionResponse response = client.chat().createChatCompletion(request);

        if (response.isSuccess()) {
            response.getFlowable().subscribe(
                // Process streaming message data
                data -> {
                    if (data.getChoices() != null && !data.getChoices().isEmpty()) {
                        Delta delta = data.getChoices().get(0).getDelta();
                        System.out.print(delta + "\n");
                    }
                },
                // Process streaming response error
                error -> System.err.println("\nStream error: " + error.getMessage()),
                // Process streaming response completion event
                () -> System.out.println("\nStreaming response completed")
            );
        } else {
            System.err.println("Error: " + response.getMsg());
        }
    }
}

Install SDK

# Install or upgrade to latest version
pip install --upgrade 'openai>=1.0'

Verify Installation

python -c "import openai; print(openai.__version__)"

Usage Example

from openai import OpenAI

client = OpenAI(
    api_key="your-Z.AI-api-key",
    base_url="https://api.z.ai/api/paas/v4/"
)

completion = client.chat.completions.create(
    model="glm-4.6",
    messages=[
        {"role": "system", "content": "You are a smart and creative novelist"},
        {"role": "user", "content": "Please write a short fairy tale story as a fairy tale master"}
    ]
)

print(completion.choices[0].message.content)