Streaming, Tool Use, and Structured Output

AI Navigate Original / 3/24/2026

Key Points

SSE streaming improves perceived speed in chat and long-text UIs. Store the final confirmed text (not fragments) in databases and audit logs, and keep strict structured-output processing out of the streaming path.
Tool Use: decide on the application side which tools the model may call and how far each may go. Require confirmation for side effects, define arguments with a strict JSON Schema, and separate search, reference, and update tools.
JSON structured output: fix the schema, instruct the model to return JSON only, keep temperature at 0–0.2, validate on the application side, and use separate API calls for explanations and for structured data.
Prompt Caching cuts cost for long fixed prompts, and the Batch API suits large asynchronous jobs. Adopt the features in this order: structured output → streaming → tool use → caching/batch.

Mastering "Production-Like" Features With the Claude API

Beyond text generation, the Claude API offers practical features such as streaming responses, Tool Use (function calling), structured JSON output, Prompt Caching, and the Batch API. As of 2025, it is not enough to take a prompt that worked in a chat UI and call it through the API unchanged; what matters is an implementation that deliberately designs for response format, speed, cost, and reproducibility.

This article uses Python-centered code examples to show how to integrate these features so they are actually easy to use. Because SDK details change between releases, always check the official documentation when adopting them.

SSE Streaming: First, Create an Experience That "Doesn't Make Users Wait"

For longer answers and generated summaries, displaying text progressively with Server-Sent Events (SSE) gives a better user experience than waiting for the full response. It is especially effective for chat, review support, and meeting-minutes generation.

Basic Implementation Approach

Receive the Claude API stream on the backend
Relay it to the frontend as-is, or reformat it before sending
Handle completion and error events, not just text fragments

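from anthropic import Anthropic

# Placeholder key; in real code, load it from an environment variable or secret store.
client = Anthropic(api_key="YOUR_API_KEY")

# messages.stream() returns a context manager that closes the connection on exit.
with client.messages.stream(
    model="claude-sonnet-4-5",
    max_tokens=1200,
    temperature=0.2,
    messages=[
        {"role": "user", "content": "Tell me 3 key points for implementing SSE streaming"}
    ],
) as stream:
    # text_stream yields text fragments as they arrive; print them without newlines.
    for text in stream.text_stream:
        print(text, end="", flush=True)

    # After the stream ends, fetch the complete assembled message.
    final_message = stream.get_final_message()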

The key implementation point is not to store fragments as they arrive. Intermediate output can contain rephrasings and unfinished sentences, so it is safer to use the final confirmed text for database storage and audit logs.
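The relay step and the "store only the final text" rule can be sketched together without any network access. This is a minimal sketch under assumptions, not a definitive implementation: the event names `delta`, `done`, and `error`, the helper names `format_sse` and `relay_as_sse`, and the `save_final` callback are all hypothetical and chosen for illustration. Any iterable of text fragments can serve as input.

```python
import json
from typing import Callable, Iterable, Iterator


def format_sse(event: str, data: dict) -> str:
    """Serialize one SSE frame: event name, one JSON data line, blank line."""
    # json.dumps escapes newlines, so the payload always fits on one data line.
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


def relay_as_sse(
    fragments: Iterable[str],
    save_final: Callable[[str], None],
) -> Iterator[str]:
    """Relay text fragments as SSE frames; persist only the completed text."""
    parts: list[str] = []
    try:
        for fragment in fragments:
            parts.append(fragment)
            yield format_sse("delta", {"text": fragment})
    except Exception as exc:
        # Tell the client the stream failed; nothing partial is stored.
        yield format_sse("error", {"message": str(exc)})
        return
    final_text = "".join(parts)
    save_final(final_text)  # hypothetical hook: DB write, audit log, etc.
    yield format_sse("done", {"text": final_text})
```

In a web backend, the frames yielded by `relay_as_sse` would typically be written to a response served with the `text/event-stream` content type. Passing the SDK's fragment iterator as `fragments` is one way to connect it to the code above.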

Where Streaming Fits and Where It Doesn't

Case	Suitability	Reason
Chat UI	Good fit	Perceived speed improves greatly
Long-text summarization	Good fit	Users can start reading before generation finishes
JSON structured output	Use with care	Intermediate fragments are usually incomplete JSON
Batch aggregation	Poor fit	Progressive display adds little value

For processing that depends strictly on structured output, the practical split is this: use streaming only for UI display, and run business logic only after the final response is complete.
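To illustrate that split, here is a small sketch in which fragments are accumulated for display only, and parsing and validation happen once on the completed text. The fragment contents, the `SCHEMA` mapping, and the helper `parse_final_json` are hypothetical. The type check is deliberately simple, and a full JSON Schema validator could take its place.

```python
import json


def parse_final_json(final_text: str, required: dict[str, type]) -> dict:
    """Parse the completed response and check required keys and their types."""
    data = json.loads(final_text)  # raises json.JSONDecodeError on incomplete JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected_type in required.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"wrong type for key: {key}")
    return data


# Hypothetical fragments as they might arrive from a stream.
fragments = ['{"title": "Week', 'ly sync", "action_', 'items": ["send notes"]}']
SCHEMA = {"title": str, "action_items": list}

shown = ""
for fragment in fragments:
    shown += fragment  # UI only: display the partial text as it arrives
    # No business logic here: `shown` is not valid JSON until the last fragment.

result = parse_final_json(shown, SCHEMA)  # business processing after completion
print(result["action_items"])
```

Calling `parse_final_json` on any intermediate value of `shown` would raise `json.JSONDecodeError`, which is why the validation step sits after the loop rather than inside it.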

Tool Use: "Don't Leave External Processing Too Much to the Model"
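The key points at the top of this article call for app-side control over tools: strict JSON Schema arguments, confirmation before side effects, and separate read and update tools. The following is a minimal sketch of that idea under stated assumptions, not the article's own implementation. The tool definitions follow the `name` / `description` / `input_schema` shape of the Messages API `tools` parameter, but the tools themselves (`search_orders`, `cancel_order`), the helpers `check_args` and `handle_tool_call`, and the status values are hypothetical. The argument check is a simplified stand-in for a full JSON Schema validator.

```python
# Hypothetical tools: one read-only, one with a side effect.
TOOLS = [
    {
        "name": "search_orders",
        "description": "Read-only search of orders by customer ID.",
        "input_schema": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
            "additionalProperties": False,
        },
    },
    {
        "name": "cancel_order",
        "description": "Cancel an order. Has side effects.",
        "input_schema": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
            "additionalProperties": False,
        },
    },
]
SIDE_EFFECT_TOOLS = {"cancel_order"}


def check_args(schema: dict, args: dict) -> None:
    """Simplified check: required keys present, no unknown keys, string values."""
    allowed = schema["properties"]
    missing = [k for k in schema["required"] if k not in args]
    unknown = [k for k in args if k not in allowed]
    if missing or unknown:
        raise ValueError(f"missing={missing} unknown={unknown}")
    for key, value in args.items():
        if allowed[key]["type"] == "string" and not isinstance(value, str):
            raise ValueError(f"{key} must be a string")


def handle_tool_call(name: str, args: dict, confirmed: bool = False) -> dict:
    """Decide app-side whether a model-requested tool call may run."""
    schemas = {tool["name"]: tool["input_schema"] for tool in TOOLS}
    if name not in schemas:
        return {"status": "rejected", "reason": "unknown tool"}
    try:
        check_args(schemas[name], args)
    except ValueError as exc:
        return {"status": "rejected", "reason": str(exc)}
    if name in SIDE_EFFECT_TOOLS and not confirmed:
        # Pause here and ask the user before anything is changed.
        return {"status": "confirmation_required", "tool": name}
    return {"status": "approved", "tool": name, "args": args}
```

Keeping the allowlist, validation, and confirmation gate in the application means the model can only request a call, and the application decides whether it runs.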



