Claude API in Practice: Streaming, Tool Use, and Structured Output

AI Navigate Original / 3/24/2026

💬 Opinion / Developer Stack & Infrastructure

Key Points

  • SSE streaming can improve perceived speed for chat and long-form generation, but for saving and business processing it’s safest to use the final, confirmed response
  • Tool Use is convenient, but in real operations it’s important to include a check step for tool splitting, strict argument constraints via JSON Schema, and handling of side effects
  • For stable JSON structured output, combine “return only JSON” instructions with a low temperature and validation on the application side
  • Prompt Caching is effective for reducing cost when you have long, shared prompts, and Batch API is well-suited for large-scale asynchronous processing
  • A practical rollout order is: JSON structured output → Streaming → Tool Use → Caching/Batch, which balances implementation effort and impact

Master “production-like” features with the Claude API

The Claude API is not limited to generating text. It also provides practical, production-oriented features such as streaming responses, Tool Use (function calling), structured JSON output, Prompt Caching, and the Batch API. As of 2025, rather than simply wrapping prompts you tested in a chat UI behind an API call, it’s important to design the implementation around response format, speed, cost, and reproducibility.

In this article, using code examples centered on Python, we’ll organize how to integrate these features so they’re genuinely easy to use in practice. SDK details may change over time, so be sure to read the official documentation alongside this article when you start integrating.

SSE Streaming: Start with an experience that doesn’t keep users waiting

For long answers or summaries, showing output incrementally via Server-Sent Events (SSE) often improves UX compared to waiting for the full completion. This is especially effective for chat, review assistance, and meeting-minutes generation.

Concept for basic implementation

  • Receive the Claude API stream on the backend
  • Relay it directly to the frontend, or format it before sending
  • Handle not only text fragments, but also completion and error events

A minimal example with the official Python SDK:
from anthropic import Anthropic

client = Anthropic(api_key="YOUR_API_KEY")  # or omit api_key to read ANTHROPIC_API_KEY from the environment

with client.messages.stream(
    model="claude-sonnet-4-5",
    max_tokens=1200,
    temperature=0.2,
    messages=[
        {"role": "user", "content": "Tell me the key points of implementing SSE streaming in 3 items"}
    ]
) as stream:
    for text in stream.text_stream:
        # Print each text delta as it arrives
        print(text, end="", flush=True)

    # The confirmed, complete message; use this for storage and business logic
    final_message = stream.get_final_message()

A key point during implementation is not to save partial output as-is. Intermediate output may include rewrites or unfinished sentences, so for DB storage and audit logs it’s safer to use the final, confirmed text.

Cases where streaming is a good fit / not a good fit

Case                     | Suitability   | Reason
Chat UI                  | Suitable      | Perceived speed improves significantly
Long-form summarization  | Suitable      | Users can start reading partway through
JSON structured output   | Use caution   | Intermediate fragments are likely to form incomplete JSON
Batch aggregation        | Not suitable  | The benefit of incremental display is diminished

Especially for processes that require strict structured output, a practical separation is: stream the output for the UI only, and perform business processing only after the final response has been confirmed.
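That separation can be sketched as a small validator that runs only on the confirmed final text. The schema, field names, and extraction heuristic below are illustrative assumptions, not part of the API:

```python
import json

REQUIRED_KEYS = {"title", "summary"}  # illustrative schema


def extract_json(final_text: str) -> dict:
    """Parse the first JSON object in the model's final text and check required keys."""
    # Naive extraction: take the span from the first '{' to the last '}'
    start = final_text.find("{")
    end = final_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in the response")
    data = json.loads(final_text[start:end + 1])
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

Only after `extract_json` succeeds would the result be written to the DB; the streamed fragments are used solely for display.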

Tool Use: Don’t let the model “do too much by itself”
