# Claude API
Agent skill guide for building, debugging, and optimizing Claude API and Anthropic SDK apps with prompt caching, streaming, tool use, and model migration workflows.
Source: Content adapted from anthropics/skills (MIT).
This skill helps you build LLM-powered applications with Claude. Choose the right surface based on your needs, detect the project language, then read the relevant language-specific documentation.
## Before You Start

Scan the target file (or, if no target file, the prompt and project) for non-Anthropic provider markers: `import openai`, `from openai`, `langchain_openai`, `OpenAI(`, `gpt-4`, `gpt-5`, file names like `agent-openai.py` or `*-generic.py`, or any explicit instruction to keep the code provider-neutral. If you find any, stop and tell the user that this skill produces Claude/Anthropic SDK code; ask whether they want to switch the file to Claude or want a non-Claude implementation. Do not edit a non-Anthropic file with Anthropic SDK calls.
## Output Requirement

When the user asks you to add, modify, or implement a Claude feature, your code must call Claude through one of:

- The official Anthropic SDK for the project's language (`anthropic`, `@anthropic-ai/sdk`, `com.anthropic.*`, etc.). This is the default whenever a supported SDK exists for the project.
- Raw HTTP (`curl`, `requests`, `fetch`, `httpx`, etc.) - only when the user explicitly asks for cURL/REST/raw HTTP, the project is a shell/cURL project, or the language has no official SDK.

Never mix the two - don't reach for `requests`/`fetch` in a Python or TypeScript project just because it feels lighter. Never fall back to OpenAI-compatible shims.
Never guess SDK usage. Function names, class names, namespaces, method signatures, and import paths must come from explicit documentation - either the {lang}/ files in this skill or the official SDK repositories or documentation links listed in shared/live-sources.md. If the binding you need is not explicitly documented in the skill files, WebFetch the relevant SDK repo from shared/live-sources.md before writing code. Do not infer Ruby/Java/Go/PHP/C# APIs from cURL shapes or from another language's SDK.
## Defaults

Unless the user requests otherwise:

- Model: Claude Opus 4.7, via the exact model string `claude-opus-4-7`.
- Thinking: default to adaptive thinking (`thinking: {type: "adaptive"}`) for anything remotely complicated.
- Streaming: default to streaming for any request that may involve long input, long output, or high `max_tokens` - it prevents hitting request timeouts. Use the SDK's `.get_final_message()` / `.finalMessage()` helper to get the complete response if you don't need to handle individual stream events.
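In Python, those defaults come together roughly like this (a minimal sketch - the `build_request` helper and the `RUN_CLAUDE_EXAMPLES` opt-in guard are illustrative, not part of the SDK):

```python
import os

# Defaults from this guide: Opus 4.7, adaptive thinking, and streaming
# with a generous max_tokens (streaming avoids request timeouts).
DEFAULT_PARAMS = {
    "model": "claude-opus-4-7",
    "max_tokens": 64000,  # streaming default; ~16000 for non-streaming
    "thinking": {"type": "adaptive"},
}

def build_request(prompt: str) -> dict:
    """Merge the guide's defaults with a single user message."""
    return {**DEFAULT_PARAMS, "messages": [{"role": "user", "content": prompt}]}

if os.environ.get("RUN_CLAUDE_EXAMPLES"):  # illustrative opt-in guard
    import anthropic

    client = anthropic.Anthropic()
    # Stream by default; get_final_message() collects the full response
    # when you don't need to handle individual stream events.
    with client.messages.stream(**build_request("Summarize this README")) as stream:
        print(stream.get_final_message().content)
```

When you do need incremental display, iterate the stream's events instead of calling `get_final_message()` - see the streaming reference for your language.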
## Subcommands
If the User Request at the bottom of this prompt is a bare subcommand string (no prose), search every Subcommands table in this document - including any in sections appended below - and follow the matching Action column directly. This lets users invoke specific flows via /claude-api <subcommand>. If no table in the document matches, treat the request as normal prose.
## Language Detection
Before reading code examples, determine which language the user is working in:
- Look at project files to infer the language:
  - `*.py`, `requirements.txt`, `pyproject.toml`, `setup.py`, `Pipfile` -> Python - read from `python/`
  - `*.ts`, `*.tsx`, `package.json`, `tsconfig.json` -> TypeScript - read from `typescript/`
  - `*.js`, `*.jsx` (no `.ts` files present) -> TypeScript - JS uses the same SDK, read from `typescript/`
  - `*.java`, `pom.xml`, `build.gradle` -> Java - read from `java/`
  - `*.kt`, `*.kts`, `build.gradle.kts` -> Java - Kotlin uses the Java SDK, read from `java/`
  - `*.scala`, `build.sbt` -> Java - Scala uses the Java SDK, read from `java/`
  - `*.go`, `go.mod` -> Go - read from `go/`
  - `*.rb`, `Gemfile` -> Ruby - read from `ruby/`
  - `*.cs`, `*.csproj` -> C# - read from `csharp/`
  - `*.php`, `composer.json` -> PHP - read from `php/`
- If multiple languages detected (e.g., both Python and TypeScript files):
  - Check which language the user's current file or question relates to
  - If still ambiguous, ask: "I detected both Python and TypeScript files. Which language are you using for the Claude API integration?"
- If the language can't be inferred (empty project, no source files, or unsupported language):
  - Use AskUserQuestion with options: Python, TypeScript, Java, Go, Ruby, cURL/raw HTTP, C#, PHP
  - If AskUserQuestion is unavailable, default to Python examples and note: "Showing Python examples. Let me know if you need a different language."
- If an unsupported language is detected (Rust, Swift, C++, Elixir, etc.):
  - Suggest cURL/raw HTTP examples from `curl/` and note that community SDKs may exist
  - Offer to show Python or TypeScript examples as reference implementations
- If the user needs cURL/raw HTTP examples, read from `curl/`.
## Language-Specific Feature Support
| Language | Tool Runner | Managed Agents | Notes |
|---|---|---|---|
| Python | Yes (beta) | Yes (beta) | Full support - @beta_tool decorator |
| TypeScript | Yes (beta) | Yes (beta) | Full support - betaZodTool + Zod |
| Java | Yes (beta) | Yes (beta) | Beta tool use with annotated classes |
| Go | Yes (beta) | Yes (beta) | BetaToolRunner in toolrunner pkg |
| Ruby | Yes (beta) | Yes (beta) | BaseTool + tool_runner in beta |
| C# | No | No | Official SDK |
| PHP | Yes (beta) | Yes (beta) | BetaRunnableTool + toolRunner() |
| cURL | N/A | Yes (beta) | Raw HTTP, no SDK features |
Managed Agents code examples: dedicated language-specific READMEs are provided for Python, TypeScript, Go, Ruby, PHP, Java, and cURL ({lang}/managed-agents/README.md, curl/managed-agents.md). Read your language's README plus the language-agnostic shared/managed-agents-*.md concept files. Agents are persistent - create once, reference by ID. Store the agent ID returned by agents.create and pass it to every subsequent sessions.create; do not call agents.create in the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML - its URL is in shared/live-sources.md. If a binding you need isn't shown in the README, WebFetch the relevant entry from shared/live-sources.md rather than guess. C# does not currently have Managed Agents support; use cURL-style raw HTTP requests against the API.
## Which Surface Should I Use?
Start simple. Default to the simplest tier that meets your needs. Single API calls and workflows handle most use cases - only reach for agents when the task genuinely requires open-ended, model-driven exploration.
| Use Case | Tier | Recommended Surface | Why |
|---|---|---|---|
| Classification, summarization, extraction, Q&A | Single LLM call | Claude API | One request, one response |
| Batch processing or embeddings | Single LLM call | Claude API | Specialized endpoints |
| Multi-step pipelines with code-controlled logic | Workflow | Claude API + tool use | You orchestrate the loop |
| Custom agent with your own tools | Agent | Claude API + tool use | Maximum flexibility |
| Server-managed stateful agent with workspace | Agent | Managed Agents | Anthropic runs the loop and hosts the tool-execution sandbox |
| Persisted, versioned agent configs | Agent | Managed Agents | Agents are stored objects; sessions pin to a version |
| Long-running multi-turn agent with file mounts | Agent | Managed Agents | Per-session containers, SSE event stream, Skills + MCP |
Note: Managed Agents is the right choice when you want Anthropic to run the agent loop and host the container where tools execute - file ops, bash, code execution all run in the per-session workspace. If you want to host the compute yourself or run your own custom tool runtime, Claude API + tool use is the right choice - use the tool runner for automatic loop handling, or the manual loop for fine-grained control (approval gates, custom logging, conditional execution).
Third-party providers (Amazon Bedrock, Google Vertex AI, Microsoft Foundry): Managed Agents is not available on Bedrock, Vertex, or Foundry. If you are deploying through any third-party provider, use Claude API + tool use for all use cases - including ones where Managed Agents would otherwise be the recommended surface.
### Decision Tree

What does your application need?

0. Are you deploying through Amazon Bedrock, Google Vertex AI, or Microsoft Foundry?
   - Yes -> Claude API (+ tool use for agents) - Managed Agents is 1P only.
   - No -> continue.
1. Single LLM call (classification, summarization, extraction, Q&A)?
   -> Claude API - one request, one response.
2. Do you want Anthropic to run the agent loop and host a per-session container where Claude executes tools (bash, file ops, code)?
   -> Managed Agents - server-managed sessions, persisted agent configs, SSE event stream, Skills + MCP, file mounts.
   Examples: "stateful coding agent with a workspace per task", "long-running research agent that streams events to a UI", "agent with persisted, versioned config used across many sessions".
3. Workflow (multi-step, code-orchestrated, with your own tools)?
   -> Claude API with tool use - you control the loop.
4. Open-ended agent (model decides its own trajectory, your own tools, you host the compute)?
   -> Claude API agentic loop (maximum flexibility).

### Should I Build an Agent?
Before choosing the agent tier, check all four criteria:
- Complexity - Is the task multi-step and hard to fully specify in advance? (e.g., "turn this design doc into a PR" vs. "extract the title from this PDF")
- Value - Does the outcome justify higher cost and latency?
- Viability - Is Claude capable at this task type?
- Cost of error - Can errors be caught and recovered from? (tests, review, rollback)
If the answer is "no" to any of these, stay at a simpler tier (single call or workflow).
## Architecture

Everything goes through `POST /v1/messages`. Tools and output constraints are features of this single endpoint - not separate APIs.
- User-defined tools - You define tools (via decorators, Zod schemas, or raw JSON), and the SDK's tool runner handles calling the API, executing your functions, and looping until Claude is done. For full control, you can write the loop manually.
- Server-side tools - Anthropic-hosted tools that run on Anthropic's infrastructure. Code execution is fully server-side (declare it in `tools`, Claude runs code automatically). Computer use can be server-hosted or self-hosted.
- Structured outputs - Constrains the Messages API response format (`output_config.format`) and/or tool parameter validation (`strict: true`). The recommended approach is `client.messages.parse()`, which validates responses against your schema automatically. Note: the old `output_format` parameter is deprecated; use `output_config: {format: {...}}` on `messages.create()`.
- Supporting endpoints - Batches (`POST /v1/messages/batches`), Files (`POST /v1/files`), Token Counting, and Models (`GET /v1/models`, `GET /v1/models/{id}` - live capability/context-window discovery) feed into or support Messages API requests.
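The structured-output shape described above can be sketched as a request payload. The invoice schema below is illustrative, and the `"json_schema"` format discriminator is an assumption - confirm the exact `output_config.format` shape in your language's README:

```python
import json

# Illustrative JSON Schema for an extraction task (not from the docs).
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["vendor", "total"],
}

def build_structured_request(prompt: str) -> dict:
    # Per this guide: the format constraint lives under output_config,
    # never the deprecated top-level output_format parameter.
    return {
        "model": "claude-opus-4-7",
        "max_tokens": 16000,
        "output_config": {
            "format": {"type": "json_schema", "schema": INVOICE_SCHEMA},
        },
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_structured_request("Extract vendor and total from: ...")
assert "output_format" not in req  # deprecated spelling must not appear
print(json.dumps(req["output_config"], indent=2))
```

For validated parsing on the response side, prefer `client.messages.parse()` as noted above rather than hand-rolling `json.loads` on the text block.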
## Current Models (cached: 2026-04-15)
| Model | Model ID | Context | Input $/1M | Output $/1M |
|---|---|---|---|---|
| Claude Opus 4.7 | claude-opus-4-7 | 1M | $5.00 | $25.00 |
| Claude Opus 4.6 | claude-opus-4-6 | 1M | $5.00 | $25.00 |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | 1M | $3.00 | $15.00 |
| Claude Haiku 4.5 | claude-haiku-4-5 | 200K | $1.00 | $5.00 |
ALWAYS use claude-opus-4-7 unless the user explicitly names a different model. This is non-negotiable. Do not use claude-sonnet-4-6, claude-sonnet-4-5, or any other model unless the user literally says "use sonnet" or "use haiku". Never downgrade for cost - that's the user's decision, not yours.
CRITICAL: Use only the exact model ID strings from the table above - they are complete as-is. Do not append date suffixes. For example, use claude-sonnet-4-5, never claude-sonnet-4-5-20250514 or any other date-suffixed variant you might recall from training data. If the user requests an older model not in the table (e.g., "opus 4.5", "sonnet 3.7"), read shared/models.md for the exact ID - do not construct one yourself.
A note: if any of the model strings above look unfamiliar to you, that's to be expected - that just means they were released after your training data cutoff. Rest assured they are real models; we wouldn't mess with you like that.
Live capability lookup: The table above is cached. When the user asks "what's the context window for X", "does X support vision/thinking/effort", or "which models support Y", query the Models API (client.models.retrieve(id) / client.models.list()) - see shared/models.md for the field reference and capability-filter examples.
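As a sketch of live lookup: only `client.models.list()` / `client.models.retrieve(id)` come from the text above; the newest-first selection policy and helper name are mine:

```python
def newest_model_id(client) -> str:
    """Pick the most recently released model from the live Models API.

    Uses client.models.list(); sorting on created_at is an illustrative
    policy - see shared/models.md for the full field reference.
    """
    models = list(client.models.list())
    if not models:
        raise ValueError("Models API returned no models")
    return max(models, key=lambda m: m.created_at).id
```

Pass a real `anthropic.Anthropic()` client; the same listing loop is the place to filter on capability fields when answering "which models support Y".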
## Thinking & Effort (Quick Reference)
Opus 4.7 - Adaptive thinking only: Use thinking: {type: "adaptive"}. thinking: {type: "enabled", budget_tokens: N} returns a 400 on Opus 4.7 - adaptive is the only on-mode. {type: "disabled"} and omitting thinking both work. Sampling parameters (temperature, top_p, top_k) are also removed and will 400. See shared/model-migration.md -> Migrating to Opus 4.7 for the full breaking-change list.
Opus 4.6 - Adaptive thinking (recommended): Use thinking: {type: "adaptive"}. Claude dynamically decides when and how much to think. No budget_tokens needed - budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and should not be used for new code. Adaptive thinking also automatically enables interleaved thinking (no beta header needed). When the user asks for "extended thinking", a "thinking budget", or budget_tokens: always use Opus 4.7 or 4.6 with thinking: {type: "adaptive"}. The concept of a fixed token budget for thinking is deprecated - adaptive thinking replaces it. Do NOT use budget_tokens for new 4.6/4.7 code and do NOT switch to an older model. Gradual-migration carve-out: budget_tokens is still functional on Opus 4.6 and Sonnet 4.6 as a transitional escape hatch - if you're migrating existing code and need a hard token ceiling before you've tuned effort, see shared/model-migration.md -> Transitional escape hatch. Note: this carve-out does not apply to Opus 4.7 - budget_tokens is fully removed there.
Effort parameter (GA, no beta header): Controls thinking depth and overall token spend via output_config: {effort: "low"|"medium"|"high"|"max"} (inside output_config, not top-level). Default is high (equivalent to omitting it). max is Opus-tier only (Opus 4.6 and later - not Sonnet or Haiku). Opus 4.7 adds "xhigh" (between high and max) - the best setting for most coding and agentic use cases on 4.7, and the default in Claude Code; use a minimum of high for most intelligence-sensitive work. Works on Opus 4.5, Opus 4.6, Opus 4.7, and Sonnet 4.6. Will error on Sonnet 4.5 / Haiku 4.5. On Opus 4.7, effort matters more than on any prior Opus - re-tune it when migrating. Combine with adaptive thinking for the best cost-quality tradeoffs. Lower effort means fewer and more-consolidated tool calls, less preamble, and terser confirmations - high is often the sweet spot balancing quality and token efficiency; use max when correctness matters more than cost; use low for subagents or simple tasks.
Opus 4.7 - thinking content omitted by default: thinking blocks still stream but their text is empty unless you opt in with thinking: {type: "adaptive", display: "summarized"} (default is "omitted"). Silent change - no error. If you stream reasoning to users, the default looks like a long pause before output; set "summarized" to restore visible progress.
Task Budgets (beta, Opus 4.7): output_config: {task_budget: {type: "tokens", total: N}} tells the model how many tokens it has for a full agentic loop - it sees a running countdown and self-moderates (minimum 20,000; beta header task-budgets-2026-03-13). Distinct from max_tokens, which is an enforced per-response ceiling the model is not aware of. See shared/model-migration.md -> Task Budgets.
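A request carrying a task budget might be assembled like this (a sketch built from the description above; `extra_headers` is the Python SDK's usual way to pass beta headers, and the helper itself is hypothetical):

```python
def build_task_budget_request(prompt: str, total_tokens: int) -> dict:
    """Sketch of a task-budget request per the description above."""
    if total_tokens < 20_000:
        raise ValueError("documented task-budget minimum is 20,000 tokens")
    return {
        "model": "claude-opus-4-7",
        "max_tokens": 64000,  # enforced ceiling the model is NOT aware of
        "thinking": {"type": "adaptive"},
        "output_config": {
            # Soft budget the model sees counting down and self-moderates.
            "task_budget": {"type": "tokens", "total": total_tokens},
        },
        "extra_headers": {"anthropic-beta": "task-budgets-2026-03-13"},
        "messages": [{"role": "user", "content": prompt}],
    }
```

The contrast in the comments is the key point: `max_tokens` is a hard per-response cap, while the task budget spans the whole agentic loop.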
Sonnet 4.6: Supports adaptive thinking (thinking: {type: "adaptive"}). budget_tokens is deprecated on Sonnet 4.6 - use adaptive thinking instead.
Older models (only if explicitly requested): If the user specifically asks for Sonnet 4.5 or another older model, use thinking: {type: "enabled", budget_tokens: N}. budget_tokens must be less than max_tokens (minimum 1024). Never choose an older model just because the user mentions budget_tokens - use Opus 4.7 with adaptive thinking instead.
## Compaction (Quick Reference)

Beta; supported on Opus 4.7, Opus 4.6, and Sonnet 4.6. For long-running conversations that may exceed the 1M context window, enable server-side compaction. The API automatically summarizes earlier context when the conversation approaches the trigger threshold (default: 150K tokens). Requires the beta header `compact-2026-01-12`.
Critical: Append response.content (not just the text) back to your messages on every turn. Compaction blocks in the response must be preserved - the API uses them to replace the compacted history on the next request. Extracting only the text string and appending that will silently lose the compaction state.
See {lang}/claude-api/README.md (Compaction section) for code examples. Full docs via WebFetch in shared/live-sources.md.
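The critical append rule can be shown in miniature. The `"compaction"` block type below is a stand-in - the exact block shape comes from the API response; the point is that `response.content` is appended wholesale:

```python
# Correct turn handling with compaction: append the full content-block
# list, never just the extracted text.
messages = [{"role": "user", "content": "Start a long task"}]

def append_turn(messages: list, response_content: list) -> None:
    # response.content may include compaction blocks alongside text
    # blocks; all of them must round-trip on the next request.
    messages.append({"role": "assistant", "content": response_content})

# Shape illustration: a text block plus a (hypothetical) compaction block.
content = [
    {"type": "text", "text": "Done with step 1."},
    {"type": "compaction", "content": "…server-managed summary state…"},
]
append_turn(messages, content)
assert messages[-1]["content"] == content  # nothing dropped
```

Extracting only the text string and appending that would pass this loop silently while losing the compaction state - exactly the failure mode described above.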
## Prompt Caching (Quick Reference)
Prefix match. Any byte change anywhere in the prefix invalidates everything after it. Render order is tools -> system -> messages. Keep stable content first (frozen system prompt, deterministic tool list), put volatile content (timestamps, per-request IDs, varying questions) after the last cache_control breakpoint.
Top-level auto-caching (cache_control: {type: "ephemeral"} on messages.create()) is the simplest option when you don't need fine-grained placement. Max 4 breakpoints per request. Minimum cacheable prefix is ~1024 tokens - shorter prefixes silently won't cache.
Verify with usage.cache_read_input_tokens - if it's zero across repeated requests, a silent invalidator is at work (datetime.now() in system prompt, unsorted JSON, varying tool set).
For placement patterns, architectural guidance, and the silent-invalidator audit checklist: read shared/prompt-caching.md. Language-specific syntax: {lang}/claude-api/README.md (Prompt Caching section).
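A placement sketch following the rules above (the analyst prompt and date stamp are illustrative): stable content sits before the breakpoint, volatile content after it.

```python
from datetime import datetime, timezone

# Stable-first ordering: the frozen instructions and large document are
# cacheable; the timestamp and per-request question come after the last
# cache_control breakpoint so they never invalidate the prefix.
def build_cached_request(document: str, question: str) -> dict:
    return {
        "model": "claude-opus-4-7",
        "max_tokens": 16000,
        "system": [
            {"type": "text", "text": "You are a careful document analyst."},
            {
                "type": "text",
                "text": document,  # must be byte-identical across requests
                "cache_control": {"type": "ephemeral"},  # last breakpoint
            },
        ],
        "messages": [{
            "role": "user",
            # Volatile content goes after the breakpoint.
            "content": f"As of {datetime.now(timezone.utc):%Y-%m-%d}: {question}",
        }],
    }

a = build_cached_request("BIG DOC", "What changed?")
b = build_cached_request("BIG DOC", "Who signed it?")
assert a["system"] == b["system"]  # cacheable prefix is byte-stable
```

Had the timestamp been interpolated into the system prompt instead, the two `system` arrays would differ and `usage.cache_read_input_tokens` would stay at zero - the classic silent invalidator.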
## Managed Agents (Beta)
Managed Agents is a third surface: server-managed stateful agents with Anthropic-hosted tool execution. You create a persisted, versioned Agent config (POST /v1/agents), then start Sessions that reference it. Each session provisions a container as the agent's workspace - bash, file ops, and code execution run there; the agent loop itself runs on Anthropic's orchestration layer and acts on the container via tools. The session streams events; you send messages and tool results back.
Managed Agents is first-party only. It is not available on Amazon Bedrock, Google Vertex AI, or Microsoft Foundry. For agents on third-party providers, use Claude API + tool use.
Mandatory flow: Agent (once) -> Session (every run). model/system/tools live on the agent, never the session. See shared/managed-agents-overview.md for the full reading guide, beta headers, and pitfalls.
Beta headers: managed-agents-2026-04-01 - the SDK sets this automatically for all client.beta.{agents,environments,sessions,vaults,memory_stores}.* calls. Skills API uses skills-2025-10-02 and Files API uses files-api-2025-04-14, but you don't need to explicitly pass those in for endpoints other than /v1/skills and /v1/files.
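The create-once pattern might look like this in Python. A sketch only: `client.beta.agents.create` follows the beta namespaces listed above, but check `{lang}/managed-agents/README.md` for real signatures; the JSON file store is a stand-in for your own persistence:

```python
import json
import os

AGENT_ID_FILE = "agent_id.json"  # hypothetical local store for the ID

def get_or_create_agent(client) -> str:
    """Return a persisted agent ID, creating the agent only on first run."""
    if os.path.exists(AGENT_ID_FILE):
        with open(AGENT_ID_FILE) as f:
            return json.load(f)["agent_id"]
    agent = client.beta.agents.create(
        name="support-triage",      # illustrative agent config
        model="claude-opus-4-7",
    )
    with open(AGENT_ID_FILE, "w") as f:
        json.dump({"agent_id": agent.id}, f)
    return agent.id

# Request path: only sessions.create runs per request, never agents.create.
# session = client.beta.sessions.create(agent_id=get_or_create_agent(client))
```

In production the ID would live in config or a database rather than a local JSON file; the point is that agent creation stays out of the request path.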
Subcommands - invoke directly with /claude-api <subcommand>:
| Subcommand | Action |
|---|---|
| `managed-agents-onboard` | Walk the user through setting up a Managed Agent from scratch. Read shared/managed-agents-onboarding.md immediately and follow its interview script: mental model -> know-or-explore branch -> template config -> session setup -> emit code. Do not summarize - run the interview. |
Reading guide: Start with shared/managed-agents-overview.md, then the topical shared/managed-agents-*.md files (core, environments, tools, events, outcomes, multiagent, webhooks, memory, client-patterns, onboarding, api-reference). For Python, TypeScript, Go, Ruby, PHP, and Java, read {lang}/managed-agents/README.md for code examples. For cURL, read curl/managed-agents.md. Agents are persistent - create once, reference by ID. Store the agent ID returned by agents.create and pass it to every subsequent sessions.create; do not call agents.create in the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML (URL in shared/live-sources.md). If a binding you need isn't shown in the language README, WebFetch the relevant entry from shared/live-sources.md rather than guess. C# does not currently have Managed Agents support; use raw HTTP from curl/managed-agents.md as a reference.
When the user wants to set up a Managed Agent from scratch (e.g. "how do I get started", "walk me through creating one", "set up a new agent"): read shared/managed-agents-onboarding.md and run its interview - same flow as the managed-agents-onboard subcommand.
When the user asks "how do I write the client code for X": reach for shared/managed-agents-client-patterns.md - covers lossless stream reconnect, processed_at queued/processed gate, interrupt, tool_confirmation round-trip, the correct idle/terminated break gate, post-idle status race, stream-first ordering, file-mount gotchas, keeping credentials host-side via custom tools, etc.
## Reading Guide
After detecting the language, read the relevant files based on what the user needs:
### Quick Task Reference
Single text classification/summarization/extraction/Q&A:
-> Read only {lang}/claude-api/README.md
Chat UI or real-time response display:
-> Read {lang}/claude-api/README.md + {lang}/claude-api/streaming.md
Long-running conversations (may exceed context window):
-> Read {lang}/claude-api/README.md - see Compaction section
Migrating to a newer model (Opus 4.7 / Opus 4.6 / Sonnet 4.6) or replacing a retired model:
-> Read shared/model-migration.md
Prompt caching / optimize caching / "why is my cache hit rate low":
-> Read shared/prompt-caching.md + {lang}/claude-api/README.md (Prompt Caching section)
Function calling / tool use / agents:
-> Read {lang}/claude-api/README.md + shared/tool-use-concepts.md + {lang}/claude-api/tool-use.md
Agent design (tool surface, context management, caching strategy):
-> Read shared/agent-design.md
Batch processing (non-latency-sensitive):
-> Read {lang}/claude-api/README.md + {lang}/claude-api/batches.md
File uploads across multiple requests:
-> Read {lang}/claude-api/README.md + {lang}/claude-api/files-api.md
Managed Agents (server-managed stateful agents with workspace):
-> Read shared/managed-agents-overview.md + the rest of the shared/managed-agents-*.md files. For Python, TypeScript, Go, Ruby, PHP, and Java, read {lang}/managed-agents/README.md for code examples. For cURL, read curl/managed-agents.md. Agents are persistent - create once, reference by ID. Store the agent ID returned by agents.create and pass it to every subsequent sessions.create; do not call agents.create in the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML (URL in shared/live-sources.md). If a binding you need isn't shown in the language README, WebFetch the relevant entry from shared/live-sources.md rather than guess. C# does not currently support Managed Agents - use raw HTTP from curl/managed-agents.md as a reference.
### Claude API (Full File Reference)

Read the language-specific Claude API folder (`{language}/claude-api/`):

- `{language}/claude-api/README.md` - Read this first. Installation, quick start, common patterns, error handling.
- `shared/tool-use-concepts.md` - Read when the user needs function calling, code execution, memory, or structured outputs. Covers conceptual foundations.
- `shared/agent-design.md` - Read when designing an agent: bash vs. dedicated tools, programmatic tool calling, tool search/skills, context editing vs. compaction vs. memory, caching principles.
- `{language}/claude-api/tool-use.md` - Read for language-specific tool use code examples (tool runner, manual loop, code execution, memory, structured outputs).
- `{language}/claude-api/streaming.md` - Read when building chat UIs or interfaces that display responses incrementally.
- `{language}/claude-api/batches.md` - Read when processing many requests offline (not latency-sensitive). Runs asynchronously at 50% cost.
- `{language}/claude-api/files-api.md` - Read when sending the same file across multiple requests without re-uploading.
- `shared/prompt-caching.md` - Read when adding or optimizing prompt caching. Covers prefix-stability design, breakpoint placement, and anti-patterns that silently invalidate cache.
- `shared/error-codes.md` - Read when debugging HTTP errors or implementing error handling.
- `shared/model-migration.md` - Read when upgrading to newer models, replacing retired models, or translating `budget_tokens` / prefill patterns to the current API.
- `shared/live-sources.md` - WebFetch URLs for fetching the latest official documentation.
Note: For Java, Go, Ruby, C#, PHP, and cURL - these have a single file each covering all basics. Read that file plus shared/tool-use-concepts.md and shared/error-codes.md as needed.
Note: For the Managed Agents file reference, see the ## Managed Agents (Beta) section above - it lists every shared/managed-agents-*.md file and the language-specific READMEs.
## When to Use WebFetch
Use WebFetch to get the latest documentation when:
- User asks for "latest" or "current" information
- Cached data seems incorrect
- User asks about features not covered here
Live documentation URLs are in shared/live-sources.md.
## Common Pitfalls

- Don't truncate inputs when passing files or content to the API. If the content is too long to fit in the context window, notify the user and discuss options (chunking, summarization, etc.) rather than silently truncating.
- Opus 4.7 thinking: Adaptive only. `thinking: {type: "enabled", budget_tokens: N}` returns 400 on Opus 4.7 - `budget_tokens` is fully removed there (along with `temperature`, `top_p`, `top_k`). Use `thinking: {type: "adaptive"}`.
- Opus 4.6 / Sonnet 4.6 thinking: Use `thinking: {type: "adaptive"}` - do NOT use `budget_tokens` for new 4.6 code (deprecated on both Opus 4.6 and Sonnet 4.6; for gradual migration of existing code, see the transitional escape hatch in `shared/model-migration.md` - note this carve-out does not apply to Opus 4.7). For older models, `budget_tokens` must be less than `max_tokens` (minimum 1024). This will throw an error if you get it wrong.
- 4.6/4.7 family prefill removed: Assistant message prefills (last-assistant-turn prefills) return a 400 error on Opus 4.6, Opus 4.7, and Sonnet 4.6. Use structured outputs (`output_config.format`) or system prompt instructions to control response format instead.
- Confirm migration scope before editing: When a user asks to migrate code to a newer Claude model without naming a specific file, directory, or file list, ask which scope to apply first - the entire working directory, a specific subdirectory, or a specific set of files. Do not start editing until the user confirms. Imperative phrasings like "migrate my codebase", "move my project to X", "upgrade to Sonnet 4.6", or bare "migrate to Opus 4.7" are still ambiguous - they tell you what to do but not where, so ask. Proceed without asking only when the prompt names an exact file, a specific directory, or an explicit file list ("migrate `app.py`", "migrate everything under `services/`", "update `a.py` and `b.py`"). See `shared/model-migration.md` Step 0.
- `max_tokens` defaults: Don't lowball `max_tokens` - hitting the cap truncates output mid-thought and requires a retry. For non-streaming requests, default to ~16000 (keeps responses under SDK HTTP timeouts). For streaming requests, default to ~64000 (timeouts aren't a concern, so give the model room). Only go lower when you have a hard reason: classification (~256), cost caps, or deliberately short outputs.
- 128K output tokens: Opus 4.6 and Opus 4.7 support up to 128K `max_tokens`, but the SDKs require streaming for values that large to avoid HTTP timeouts. Use `.stream()` with `.get_final_message()` / `.finalMessage()`.
- Tool call JSON parsing (4.6/4.7 family): Opus 4.6, Opus 4.7, and Sonnet 4.6 may produce different JSON string escaping in tool call `input` fields (e.g., Unicode or forward-slash escaping). Always parse tool inputs with `json.loads()` / `JSON.parse()` - never do raw string matching on the serialized input.
- Structured outputs (all models): Use `output_config: {format: {...}}` instead of the deprecated `output_format` parameter on `messages.create()`. This is a general API change, not 4.6-specific.
- Don't reimplement SDK functionality: The SDK provides high-level helpers - use them instead of building from scratch. Specifically: use `stream.finalMessage()` instead of wrapping `.on()` events in `new Promise()`; use typed exception classes (`Anthropic.RateLimitError`, etc.) instead of string-matching error messages; use SDK types (`Anthropic.MessageParam`, `Anthropic.Tool`, `Anthropic.Message`, etc.) instead of redefining equivalent interfaces.
- Don't define custom types for SDK data structures: The SDK exports types for all API objects. Use `Anthropic.MessageParam` for messages, `Anthropic.Tool` for tool definitions, `Anthropic.ToolUseBlock` / `Anthropic.ToolResultBlockParam` for tool results, `Anthropic.Message` for responses. Defining your own `interface ChatMessage { role: string; content: unknown }` duplicates what the SDK already provides and loses type safety.
- Report and document output: For tasks that produce reports, documents, or visualizations, the code execution sandbox has `python-docx`, `python-pptx`, `matplotlib`, `pillow`, and `pypdf` pre-installed. Claude can generate formatted files (DOCX, PDF, charts) and return them via the Files API - consider this for "report" or "document" type requests instead of plain stdout text.
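The tool-input escaping pitfall above is easy to demonstrate - two byte-different serializations of the same value:

```python
import json

# Different models may escape the same tool input differently; parse it,
# never string-match the serialized form.
raw_a = '{"path": "\\/tmp\\/report.txt"}'   # forward-slash escaped
raw_b = '{"path": "/tmp/report.txt"}'       # unescaped

assert raw_a != raw_b                        # raw string match fails...
assert json.loads(raw_a) == json.loads(raw_b)  # ...parsed values agree
```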
## Resource Files

- `LICENSE.txt`
- `csharp/claude-api.md`
- `curl/examples.md`
- `curl/managed-agents.md`
- `go/claude-api.md`
- `go/managed-agents/README.md`
- `java/claude-api.md`
- `java/managed-agents/README.md`
- `php/claude-api.md`
- `php/managed-agents/README.md`
- `python/claude-api/README.md`
- `python/claude-api/batches.md`
# Message Batches API — Python
The Batches API (`POST /v1/messages/batches`) processes Messages API requests asynchronously at 50% of standard prices.
## Key Facts
- Up to 100,000 requests or 256 MB per batch
- Most batches complete within 1 hour; maximum 24 hours
- Results available for 29 days after creation
- 50% cost reduction on all token usage
- All Messages API features supported (vision, tools, caching, etc.)
---
## Create a Batch
```python
import anthropic
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request
client = anthropic.Anthropic()
message_batch = client.messages.batches.create(
requests=[
Request(
custom_id="request-1",
params=MessageCreateParamsNonStreaming(
model="claude-opus-4-7",
max_tokens=16000,
messages=[{"role": "user", "content": "Summarize climate change impacts"}]
)
),
Request(
custom_id="request-2",
params=MessageCreateParamsNonStreaming(
model="claude-opus-4-7",
max_tokens=16000,
messages=[{"role": "user", "content": "Explain quantum computing basics"}]
)
),
]
)
print(f"Batch ID: {message_batch.id}")
print(f"Status: {message_batch.processing_status}")
```

## Poll for Completion

```python
import time
while True:
batch = client.messages.batches.retrieve(message_batch.id)
if batch.processing_status == "ended":
break
print(f"Status: {batch.processing_status}, processing: {batch.request_counts.processing}")
time.sleep(60)
print("Batch complete!")
print(f"Succeeded: {batch.request_counts.succeeded}")
print(f"Errored: {batch.request_counts.errored}")
```

## Retrieve Results

Note: Examples below use `match`/`case` syntax, requiring Python 3.10+. For earlier versions, use `if`/`elif` chains instead.

```python
for result in client.messages.batches.results(message_batch.id):
match result.result.type:
case "succeeded":
msg = result.result.message
text = next((b.text for b in msg.content if b.type == "text"), "")
print(f"[{result.custom_id}] {text[:100]}")
case "errored":
if result.result.error.type == "invalid_request":
print(f"[{result.custom_id}] Validation error - fix request and retry")
else:
print(f"[{result.custom_id}] Server error - safe to retry")
case "canceled":
print(f"[{result.custom_id}] Canceled")
case "expired":
print(f"[{result.custom_id}] Expired - resubmit")Cancel a Batch
```python
cancelled = client.messages.batches.cancel(message_batch.id)
print(f"Status: {cancelled.processing_status}")  # "canceling"
```

## Batch with Prompt Caching
```python
shared_system = [
    {"type": "text", "text": "You are a literary analyst."},
    {
        "type": "text",
        "text": large_document_text,  # Shared across all requests
        "cache_control": {"type": "ephemeral"},
    },
]

message_batch = client.messages.batches.create(
    requests=[
        Request(
            custom_id=f"analysis-{i}",
            params=MessageCreateParamsNonStreaming(
                model="claude-opus-4-7",
                max_tokens=16000,
                system=shared_system,
                messages=[{"role": "user", "content": question}],
            ),
        )
        for i, question in enumerate(questions)
    ]
)
```

## Full End-to-End Example
```python
import anthropic
import time
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request

client = anthropic.Anthropic()

# 1. Prepare requests
items_to_classify = [
    "The product quality is excellent!",
    "Terrible customer service, never again.",
    "It's okay, nothing special.",
]

requests = [
    Request(
        custom_id=f"classify-{i}",
        params=MessageCreateParamsNonStreaming(
            model="claude-haiku-4-5",
            max_tokens=50,
            messages=[{
                "role": "user",
                "content": f"Classify as positive/negative/neutral (one word): {text}"
            }],
        ),
    )
    for i, text in enumerate(items_to_classify)
]

# 2. Create batch
batch = client.messages.batches.create(requests=requests)
print(f"Created batch: {batch.id}")

# 3. Wait for completion
while True:
    batch = client.messages.batches.retrieve(batch.id)
    if batch.processing_status == "ended":
        break
    time.sleep(10)

# 4. Collect results
results = {}
for result in client.messages.batches.results(batch.id):
    if result.result.type == "succeeded":
        msg = result.result.message
        results[result.custom_id] = next((b.text for b in msg.content if b.type == "text"), "")

for custom_id, classification in sorted(results.items()):
    print(f"{custom_id}: {classification}")
```
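The `match`/`case` dispatch in Retrieve Results needs Python 3.10+. On older interpreters the same routing works as a plain `if`/`elif` helper; `classify_batch_result` below is an illustrative name, not an SDK function:

```python
def classify_batch_result(result_type, error_type=None):
    """Map a batch result type to the action taken in the match/case loop.

    Illustrative helper (not part of the SDK): pass result.result.type and,
    for errored results, result.result.error.type.
    """
    if result_type == "succeeded":
        return "succeeded"
    elif result_type == "errored":
        if error_type == "invalid_request":
            return "validation error - fix request and retry"
        else:
            return "server error - safe to retry"
    elif result_type == "canceled":
        return "canceled"
    elif result_type == "expired":
        return "expired - resubmit"
    return "unknown"
```

Call it inside the results loop in place of the `match` statement, e.g. `print(f"[{result.custom_id}] {classify_batch_result(result.result.type)}")`.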
### python/claude-api/files-api.md
[Download python/claude-api/files-api.md](/skills/claude-api/python/claude-api/files-api.md)
# Files API — Python
The Files API uploads files for use in Messages API requests. Reference files via `file_id` in content blocks, avoiding re-uploads across multiple API calls.
**Beta:** Pass `betas=["files-api-2025-04-14"]` in your API calls (the SDK sets the required header automatically).
## Key Facts
- Maximum file size: 500 MB
- Total storage: 100 GB per organization
- Files persist until deleted
- File operations (upload, list, delete) are free; content used in messages is billed as input tokens
- Not available on Amazon Bedrock or Google Vertex AI
---
## Upload a File
```python
import anthropic

client = anthropic.Anthropic()

uploaded = client.beta.files.upload(
    file=("report.pdf", open("report.pdf", "rb"), "application/pdf"),
)

print(f"File ID: {uploaded.id}")
print(f"Size: {uploaded.size_bytes} bytes")
```

## Use a File in Messages

### PDF / Text Document
```python
response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Summarize the key findings in this report."},
            {
                "type": "document",
                "source": {"type": "file", "file_id": uploaded.id},
                "title": "Q4 Report",  # optional
                "citations": {"enabled": True},  # optional, enables citations
            },
        ],
    }],
    betas=["files-api-2025-04-14"],
)

for block in response.content:
    if block.type == "text":
        print(block.text)
```

### Image
```python
image_file = client.beta.files.upload(
    file=("photo.png", open("photo.png", "rb"), "image/png"),
)

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image",
                "source": {"type": "file", "file_id": image_file.id},
            },
        ],
    }],
    betas=["files-api-2025-04-14"],
)
```

## Manage Files

### List Files
```python
files = client.beta.files.list()
for f in files.data:
    print(f"{f.id}: {f.filename} ({f.size_bytes} bytes)")
```

### Get File Metadata
```python
file_info = client.beta.files.retrieve_metadata("file_011CNha8iCJcU1wXNR6q4V8w")
print(f"Filename: {file_info.filename}")
print(f"MIME type: {file_info.mime_type}")
```

### Delete a File
```python
client.beta.files.delete("file_011CNha8iCJcU1wXNR6q4V8w")
```

### Download a File
Only files created by the code execution tool or skills can be downloaded (not user-uploaded files).
```python
file_content = client.beta.files.download("file_011CNha8iCJcU1wXNR6q4V8w")
file_content.write_to_file("output.txt")
```

## Full End-to-End Example
Upload a document once, ask multiple questions about it:
```python
import anthropic

client = anthropic.Anthropic()

# 1. Upload once
uploaded = client.beta.files.upload(
    file=("contract.pdf", open("contract.pdf", "rb"), "application/pdf"),
)
print(f"Uploaded: {uploaded.id}")

# 2. Ask multiple questions using the same file_id
questions = [
    "What are the key terms and conditions?",
    "What is the termination clause?",
    "Summarize the payment schedule.",
]

for question in questions:
    response = client.beta.messages.create(
        model="claude-opus-4-7",
        max_tokens=16000,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {
                    "type": "document",
                    "source": {"type": "file", "file_id": uploaded.id},
                },
            ],
        }],
        betas=["files-api-2025-04-14"],
    )
    print(f"\nQ: {question}")
    text = next((b.text for b in response.content if b.type == "text"), "")
    print(f"A: {text[:200]}")

# 3. Clean up when done
client.beta.files.delete(uploaded.id)
```
### python/claude-api/streaming.md
[Download python/claude-api/streaming.md](/skills/claude-api/python/claude-api/streaming.md)
```markdown
# Streaming — Python
## Quick Start
```python
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    messages=[{"role": "user", "content": "Write a story"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```

### Async
```python
async with async_client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    messages=[{"role": "user", "content": "Write a story"}],
) as stream:
    async for text in stream.text_stream:
        print(text, end="", flush=True)
```

## Handling Different Content Types
Claude may return text, thinking blocks, or tool use. Handle each appropriately:
**Opus 4.7 / Opus 4.6:** use `thinking={"type": "adaptive"}`. On older models, use `thinking={"type": "enabled", "budget_tokens": N}` instead.
```python
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    thinking={"type": "adaptive"},
    messages=[{"role": "user", "content": "Analyze this problem"}],
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            if event.content_block.type == "thinking":
                print("\n[Thinking...]")
            elif event.content_block.type == "text":
                print("\n[Response:]")
        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)
```

## Streaming with Tool Use
The Python tool runner currently returns complete messages. Use streaming for individual API calls within a manual loop if you need per-token streaming with tools:
```python
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    tools=tools,
    messages=messages,
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    response = stream.get_final_message()

# Continue with tool execution if response.stop_reason == "tool_use"
```

## Getting the Final Message
```python
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    messages=[{"role": "user", "content": "Hello"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

    # Get full message after streaming
    final_message = stream.get_final_message()

print(f"\n\nTokens used: {final_message.usage.output_tokens}")
```

## Streaming with Progress Updates
```python
def stream_with_progress(client, **kwargs):
    """Stream a response with progress updates."""
    total_tokens = 0
    content_parts = []

    with client.messages.stream(**kwargs) as stream:
        for event in stream:
            if event.type == "content_block_delta":
                if event.delta.type == "text_delta":
                    text = event.delta.text
                    content_parts.append(text)
                    print(text, end="", flush=True)
            elif event.type == "message_delta":
                if event.usage and event.usage.output_tokens is not None:
                    total_tokens = event.usage.output_tokens

        final_message = stream.get_final_message()

    print(f"\n\n[Tokens used: {total_tokens}]")
    return "".join(content_parts)
```

## Error Handling in Streams
```python
try:
    with client.messages.stream(
        model="claude-opus-4-7",
        max_tokens=64000,
        messages=[{"role": "user", "content": "Write a story"}],
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)
except anthropic.APIConnectionError:
    print("\nConnection lost. Please retry.")
except anthropic.RateLimitError:
    print("\nRate limited. Please wait and retry.")
except anthropic.APIStatusError as e:
    print(f"\nAPI error: {e.status_code}")
```

## Stream Event Types
| Event Type | Description | When it fires |
|---|---|---|
| `message_start` | Contains message metadata | Once at the beginning |
| `content_block_start` | New content block beginning | When a text/tool_use block starts |
| `content_block_delta` | Incremental content update | For each token/chunk |
| `content_block_stop` | Content block complete | When a block finishes |
| `message_delta` | Message-level updates | Contains `stop_reason`, usage |
| `message_stop` | Message complete | Once at the end |
## Best Practices

- **Always flush output** — use `flush=True` to show tokens immediately
- **Handle partial responses** — if the stream is interrupted, you may have incomplete content
- **Track token usage** — the `message_delta` event contains usage information
- **Use timeouts** — set appropriate timeouts for your application
- **Default to streaming** — use `.get_final_message()` to get the complete response even when streaming, giving you timeout protection without needing to handle individual events
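The "handle partial responses" advice can be made concrete with a wrapper that restarts an interrupted stream from scratch. This is a sketch, not an SDK helper; in real code pass `retry_on=(anthropic.APIConnectionError,)`:

```python
import time


def stream_with_retry(make_stream, retry_on=(ConnectionError,), max_attempts=3, backoff=1.0):
    """Accumulate streamed text, restarting from scratch if the stream dies.

    make_stream: zero-arg callable returning a context-managed stream with a
    .text_stream iterable (the shape client.messages.stream(...) returns).
    Sketch only; adjust the discard-partial-text policy to your needs.
    """
    for attempt in range(1, max_attempts + 1):
        parts = []
        try:
            with make_stream() as stream:
                for text in stream.text_stream:
                    parts.append(text)
            return "".join(parts)
        except retry_on:
            if attempt == max_attempts:
                raise  # partial text in `parts` is discarded here
            time.sleep(backoff * attempt)
```

Usage would be `stream_with_retry(lambda: client.messages.stream(model=..., max_tokens=..., messages=...), retry_on=(anthropic.APIConnectionError,))`.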
### python/claude-api/tool-use.md
[Download python/claude-api/tool-use.md](/skills/claude-api/python/claude-api/tool-use.md)
_Binary resource_
### python/managed-agents/README.md
[Download python/managed-agents/README.md](/skills/claude-api/python/managed-agents/README.md)
_Binary resource_
### ruby/claude-api.md
[Download ruby/claude-api.md](/skills/claude-api/ruby/claude-api.md)
# Claude API — Ruby
> **Note:** The Ruby SDK supports the Claude API. A tool runner is available in beta via `client.beta.messages.tool_runner()`. Agent SDK is not yet available for Ruby.
## Installation
```bash
gem install anthropic
```

## Client Initialization
```ruby
require "anthropic"

# Default (uses ANTHROPIC_API_KEY env var)
client = Anthropic::Client.new

# Explicit API key
client = Anthropic::Client.new(api_key: "your-api-key")
```

## Basic Message Request
```ruby
message = client.messages.create(
  model: :"claude-opus-4-7",
  max_tokens: 16000,
  messages: [
    { role: "user", content: "What is the capital of France?" }
  ]
)

# content is an array of polymorphic block objects (TextBlock, ThinkingBlock,
# ToolUseBlock, ...). .type is a Symbol — compare with :text, not "text".
# .text raises NoMethodError on non-TextBlock entries.
message.content.each do |block|
  puts block.text if block.type == :text
end
```

## Streaming
```ruby
stream = client.messages.stream(
  model: :"claude-opus-4-7",
  max_tokens: 64000,
  messages: [{ role: "user", content: "Write a haiku" }]
)

stream.text.each { |text| print(text) }
```

## Tool Use
The Ruby SDK supports tool use via raw JSON schema definitions and also provides a beta tool runner for automatic tool execution.
### Tool Runner (Beta)
```ruby
class GetWeatherInput < Anthropic::BaseModel
  required :location, String, doc: "City and state, e.g. San Francisco, CA"
end

class GetWeather < Anthropic::BaseTool
  doc "Get the current weather for a location"
  input_schema GetWeatherInput

  def call(input)
    "The weather in #{input.location} is sunny and 72°F."
  end
end

client.beta.messages.tool_runner(
  model: :"claude-opus-4-7",
  max_tokens: 16000,
  tools: [GetWeather.new],
  messages: [{ role: "user", content: "What's the weather in San Francisco?" }]
).each_message do |message|
  puts message.content
end
```

### Manual Loop
See the shared tool use concepts for the tool definition format and agentic loop pattern.
## Prompt Caching
`system_:` (the trailing underscore avoids shadowing `Kernel#system`) takes an array of text blocks; set `cache_control` on the last block. Plain hashes work via the `OrHash` type alias. For placement patterns and the silent-invalidator audit checklist, see `shared/prompt-caching.md`.
```ruby
message = client.messages.create(
  model: :"claude-opus-4-7",
  max_tokens: 16000,
  system_: [
    { type: "text", text: long_system_prompt, cache_control: { type: "ephemeral" } }
  ],
  messages: [{ role: "user", content: "Summarize the key points" }]
)
```

For a 1-hour TTL, use `cache_control: { type: "ephemeral", ttl: "1h" }`. There is also a top-level `cache_control:` on `messages.create` that auto-places the marker on the last cacheable block.

Verify cache hits via `message.usage.cache_creation_input_tokens` / `message.usage.cache_read_input_tokens`.
### ruby/managed-agents/README.md
[Download ruby/managed-agents/README.md](/skills/claude-api/ruby/managed-agents/README.md)
_Binary resource_
### shared/agent-design.md
[Download shared/agent-design.md](/skills/claude-api/shared/agent-design.md)
_Binary resource_
### shared/error-codes.md
[Download shared/error-codes.md](/skills/claude-api/shared/error-codes.md)
_Binary resource_
### shared/live-sources.md
[Download shared/live-sources.md](/skills/claude-api/shared/live-sources.md)
_Binary resource_
### shared/managed-agents-api-reference.md
[Download shared/managed-agents-api-reference.md](/skills/claude-api/shared/managed-agents-api-reference.md)
_Binary resource_
### shared/managed-agents-client-patterns.md
[Download shared/managed-agents-client-patterns.md](/skills/claude-api/shared/managed-agents-client-patterns.md)
_Binary resource_
### shared/managed-agents-core.md
[Download shared/managed-agents-core.md](/skills/claude-api/shared/managed-agents-core.md)
_Binary resource_
### shared/managed-agents-environments.md
[Download shared/managed-agents-environments.md](/skills/claude-api/shared/managed-agents-environments.md)
_Binary resource_
### shared/managed-agents-events.md
[Download shared/managed-agents-events.md](/skills/claude-api/shared/managed-agents-events.md)
_Binary resource_
### shared/managed-agents-memory.md
[Download shared/managed-agents-memory.md](/skills/claude-api/shared/managed-agents-memory.md)
_Binary resource_
### shared/managed-agents-multiagent.md
[Download shared/managed-agents-multiagent.md](/skills/claude-api/shared/managed-agents-multiagent.md)
_Binary resource_
### shared/managed-agents-onboarding.md
[Download shared/managed-agents-onboarding.md](/skills/claude-api/shared/managed-agents-onboarding.md)
_Binary resource_
### shared/managed-agents-outcomes.md
[Download shared/managed-agents-outcomes.md](/skills/claude-api/shared/managed-agents-outcomes.md)
_Binary resource_
### shared/managed-agents-overview.md
[Download shared/managed-agents-overview.md](/skills/claude-api/shared/managed-agents-overview.md)
_Binary resource_
### shared/managed-agents-tools.md
[Download shared/managed-agents-tools.md](/skills/claude-api/shared/managed-agents-tools.md)
_Binary resource_
### shared/managed-agents-webhooks.md
[Download shared/managed-agents-webhooks.md](/skills/claude-api/shared/managed-agents-webhooks.md)
# Managed Agents — Webhooks
Anthropic can POST to your HTTPS endpoint when a Managed Agents resource changes state — an alternative to holding an SSE stream or polling. Payloads are **thin** (event type + resource IDs only); on receipt, fetch the resource for current state. Every delivery is HMAC-signed.
> **Direction matters.** This page covers *Anthropic → you* notifications about session/vault state. It does **not** cover *third-party → you* webhooks that *trigger* a session (e.g. a GitHub push handler that calls `sessions.create()`) — that's ordinary application code on your side with no Anthropic-specific wire format.
---
## Register an endpoint (Console only)
Console → **Manage → Webhooks**. There is no programmatic endpoint-management API yet. Secret rotation is supported from the same page.
| Field | Constraint |
|---|---|
| URL | HTTPS on port 443, publicly resolvable hostname |
| Event types | Subscribe per `data.type` — you only receive subscribed types (plus test events) |
| Signing secret | `whsec_`-prefixed, 32 bytes, **shown once at creation** — store it |
---
## Verify the signature
Every delivery is HMAC-signed. **Use the SDK's `client.beta.webhooks.unwrap()`** — it verifies the signature, rejects payloads more than ~5 minutes old, and returns the parsed event. It reads the `whsec_` secret from `ANTHROPIC_WEBHOOK_SIGNING_KEY`.
```python
import anthropic
from flask import Flask, request

client = anthropic.Anthropic()  # reads ANTHROPIC_WEBHOOK_SIGNING_KEY from env
app = Flask(__name__)

seen_event_ids = set()  # in-memory dedupe; use a shared store in production


@app.route("/webhook", methods=["POST"])
def webhook():
    try:
        event = client.beta.webhooks.unwrap(
            request.get_data(as_text=True),
            headers=dict(request.headers),
        )
    except Exception:
        return "invalid signature", 400

    if event.id in seen_event_ids:  # dedupe retries — id is per-event, not per-delivery
        return "", 204
    seen_event_ids.add(event.id)

    match event.data.type:
        case "session.status_idled":
            session = client.beta.sessions.retrieve(event.data.id)
            notify_user(session)
        case "vault_credential.refresh_failed":
            alert_oncall(event.data.id)

    return "", 204
```

Pass the raw request body to `unwrap()` — frameworks that re-serialize JSON (Express `.json()`, Flask `.get_json()`) change the bytes and break the MAC. For other languages, look up the `beta.webhooks.unwrap` binding in the SDK repo (`shared/live-sources.md`); don't hand-roll verification.
## Payload envelope

```json
{
  "type": "event",
  "id": "event_01ABC...",
  "created_at": "2026-03-18T14:05:22Z",
  "data": {
    "type": "session.status_idled",
    "id": "session_01XYZ...",
    "organization_id": "8a3d2f1e-...",
    "workspace_id": "c7b0e4d9-..."
  }
}
```

Switch on `data.type`, fetch the resource by `data.id`, and return any 2xx to acknowledge. `created_at` is when the state transition happened, not when the webhook fired.
## Supported `data.type` values

| `data.type` | Fires when |
|---|---|
| `session.status_scheduled` | Session created and ready to accept events |
| `session.status_run_started` | Agent execution kicked off (every transition to running) |
| `session.status_idled` | Agent awaiting input (tool approval, custom tool result, or next message) |
| `session.status_terminated` | Session hit a terminal error |
| `session.thread_created` | Multiagent: coordinator opened a new subagent thread |
| `session.thread_idled` | Multiagent: a subagent thread is waiting for input |
| `session.outcome_evaluation_ended` | Outcome grader finished one iteration |
| `vault.archived` | Vault was archived |
| `vault.created` | Vault was created |
| `vault.deleted` | Vault was deleted |
| `vault_credential.archived` | Vault credential was archived |
| `vault_credential.created` | Vault credential was created |
| `vault_credential.deleted` | Vault credential was deleted |
| `vault_credential.refresh_failed` | MCP OAuth vault credential failed to refresh |

These are webhook `data.type` values — a separate namespace from SSE event types (`session.status_idle`, `span.outcome_evaluation_end`, etc. in `shared/managed-agents-events.md`). Don't reuse SSE constants in webhook handlers.
## Delivery behavior & pitfalls

- **No ordering guarantee.** `session.status_idled` may arrive before `session.outcome_evaluation_ended` even if the evaluation finished first. Sort by envelope `created_at` if order matters.
- **Retries carry the same `event.id`.** At least one retry on non-2xx. Dedupe on `event.id`.
- **3xx is failure.** Redirects are not followed — update the URL in Console if your endpoint moves.
- **Auto-disable** after ~20 consecutive failed deliveries, or immediately if the hostname resolves to a private IP or returns a redirect. Re-enable manually in Console.
- **Thin payload is intentional.** Don't expect `stop_reason`, `outcome_evaluations`, credential secrets, etc. on the webhook body — fetch the resource.
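The dedupe rule above needs a store that survives retries without growing forever. A minimal in-process sketch (`SeenEvents` is an illustrative name, not part of any SDK; multi-process deployments should use a shared store such as Redis with an expiring key instead):

```python
import time


class SeenEvents:
    """In-memory event-id dedupe with a TTL window.

    check_and_add(event_id) returns True the first time an id is seen inside
    the TTL window and False for duplicate deliveries. The injectable clock
    exists so the expiry logic is testable.
    """

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._seen = {}

    def check_and_add(self, event_id):
        now = self.clock()
        # Drop entries older than the TTL window
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.ttl}
        if event_id in self._seen:
            return False  # duplicate delivery (retry)
        self._seen[event_id] = now
        return True
```

In the handler, replace the bare `set` with `store = SeenEvents()` and gate on `if not store.check_and_add(event.id): return "", 204`.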
### shared/model-migration.md
[Download shared/model-migration.md](/skills/claude-api/shared/model-migration.md)
_Binary resource_
### shared/models.md
[Download shared/models.md](/skills/claude-api/shared/models.md)
_Binary resource_
### shared/prompt-caching.md
[Download shared/prompt-caching.md](/skills/claude-api/shared/prompt-caching.md)
_Binary resource_
### shared/tool-use-concepts.md
[Download shared/tool-use-concepts.md](/skills/claude-api/shared/tool-use-concepts.md)
_Binary resource_
### typescript/claude-api/README.md
[Download typescript/claude-api/README.md](/skills/claude-api/typescript/claude-api/README.md)
_Binary resource_
### typescript/claude-api/batches.md
[Download typescript/claude-api/batches.md](/skills/claude-api/typescript/claude-api/batches.md)
# Message Batches API — TypeScript
The Batches API (`POST /v1/messages/batches`) processes Messages API requests asynchronously at 50% of standard prices.
## Key Facts
- Up to 100,000 requests or 256 MB per batch
- Most batches complete within 1 hour; maximum 24 hours
- Results available for 29 days after creation
- 50% cost reduction on all token usage
- All Messages API features supported (vision, tools, caching, etc.)
---
## Create a Batch
```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const messageBatch = await client.messages.batches.create({
  requests: [
    {
      custom_id: "request-1",
      params: {
        model: "claude-opus-4-7",
        max_tokens: 16000,
        messages: [
          { role: "user", content: "Summarize climate change impacts" },
        ],
      },
    },
    {
      custom_id: "request-2",
      params: {
        model: "claude-opus-4-7",
        max_tokens: 16000,
        messages: [
          { role: "user", content: "Explain quantum computing basics" },
        ],
      },
    },
  ],
});

console.log(`Batch ID: ${messageBatch.id}`);
console.log(`Status: ${messageBatch.processing_status}`);
```

## Poll for Completion
```typescript
let batch;
while (true) {
  batch = await client.messages.batches.retrieve(messageBatch.id);
  if (batch.processing_status === "ended") break;
  console.log(
    `Status: ${batch.processing_status}, processing: ${batch.request_counts.processing}`,
  );
  await new Promise((resolve) => setTimeout(resolve, 60_000));
}

console.log("Batch complete!");
console.log(`Succeeded: ${batch.request_counts.succeeded}`);
console.log(`Errored: ${batch.request_counts.errored}`);
```

## Retrieve Results
```typescript
for await (const result of await client.messages.batches.results(
  messageBatch.id,
)) {
  switch (result.result.type) {
    case "succeeded":
      console.log(
        `[${result.custom_id}] ${result.result.message.content[0].text.slice(0, 100)}`,
      );
      break;
    case "errored":
      if (result.result.error.type === "invalid_request") {
        console.log(`[${result.custom_id}] Validation error - fix and retry`);
      } else {
        console.log(`[${result.custom_id}] Server error - safe to retry`);
      }
      break;
    case "expired":
      console.log(`[${result.custom_id}] Expired - resubmit`);
      break;
  }
}
```

## Cancel a Batch
```typescript
const cancelled = await client.messages.batches.cancel(messageBatch.id);
console.log(`Status: ${cancelled.processing_status}`); // "canceling"
```
### typescript/claude-api/files-api.md
[Download typescript/claude-api/files-api.md](/skills/claude-api/typescript/claude-api/files-api.md)
# Files API — TypeScript
The Files API uploads files for use in Messages API requests. Reference files via `file_id` in content blocks, avoiding re-uploads across multiple API calls.
**Beta:** Pass `betas: ["files-api-2025-04-14"]` in your API calls (the SDK sets the required header automatically).
## Key Facts
- Maximum file size: 500 MB
- Total storage: 100 GB per organization
- Files persist until deleted
- File operations (upload, list, delete) are free; content used in messages is billed as input tokens
- Not available on Amazon Bedrock or Google Vertex AI
---
## Upload a File
```typescript
import Anthropic, { toFile } from "@anthropic-ai/sdk";
import fs from "fs";

const client = new Anthropic();

const uploaded = await client.beta.files.upload({
  file: await toFile(fs.createReadStream("report.pdf"), undefined, {
    type: "application/pdf",
  }),
  betas: ["files-api-2025-04-14"],
});

console.log(`File ID: ${uploaded.id}`);
console.log(`Size: ${uploaded.size_bytes} bytes`);
```

## Use a File in Messages

### PDF / Text Document
```typescript
const response = await client.beta.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Summarize the key findings in this report." },
        {
          type: "document",
          source: { type: "file", file_id: uploaded.id },
          title: "Q4 Report",
          citations: { enabled: true },
        },
      ],
    },
  ],
  betas: ["files-api-2025-04-14"],
});

console.log(response.content[0].text);
```

## Manage Files

### List Files
```typescript
const files = await client.beta.files.list({
  betas: ["files-api-2025-04-14"],
});

for (const f of files.data) {
  console.log(`${f.id}: ${f.filename} (${f.size_bytes} bytes)`);
}
```

### Delete a File
```typescript
await client.beta.files.delete("file_011CNha8iCJcU1wXNR6q4V8w", {
  betas: ["files-api-2025-04-14"],
});
```

### Download a File
```typescript
const response = await client.beta.files.download(
  "file_011CNha8iCJcU1wXNR6q4V8w",
  { betas: ["files-api-2025-04-14"] },
);

const content = Buffer.from(await response.arrayBuffer());
await fs.promises.writeFile("output.txt", content);
```
### typescript/claude-api/streaming.md
[Download typescript/claude-api/streaming.md](/skills/claude-api/typescript/claude-api/streaming.md)
# Streaming — TypeScript
## Quick Start
```typescript
const stream = client.messages.stream({
  model: "claude-opus-4-7",
  max_tokens: 64000,
  messages: [{ role: "user", content: "Write a story" }],
});

for await (const event of stream) {
  if (
    event.type === "content_block_delta" &&
    event.delta.type === "text_delta"
  ) {
    process.stdout.write(event.delta.text);
  }
}
```

## Handling Different Content Types
**Opus 4.7 / Opus 4.6:** use `thinking: { type: "adaptive" }`. On older models, use `thinking: { type: "enabled", budget_tokens: N }` instead.
```typescript
const stream = client.messages.stream({
  model: "claude-opus-4-7",
  max_tokens: 64000,
  thinking: { type: "adaptive" },
  messages: [{ role: "user", content: "Analyze this problem" }],
});

for await (const event of stream) {
  switch (event.type) {
    case "content_block_start":
      switch (event.content_block.type) {
        case "thinking":
          console.log("\n[Thinking...]");
          break;
        case "text":
          console.log("\n[Response:]");
          break;
      }
      break;
    case "content_block_delta":
      switch (event.delta.type) {
        case "thinking_delta":
          process.stdout.write(event.delta.thinking);
          break;
        case "text_delta":
          process.stdout.write(event.delta.text);
          break;
      }
      break;
  }
}
```

## Streaming with Tool Use (Tool Runner)
Use the tool runner with `stream: true`. The outer loop iterates over tool runner iterations (messages); the inner loop processes stream events:
```typescript
import Anthropic from "@anthropic-ai/sdk";
import { betaZodTool } from "@anthropic-ai/sdk/helpers/beta/zod";
import { z } from "zod";

const client = new Anthropic();

const getWeather = betaZodTool({
  name: "get_weather",
  description: "Get current weather for a location",
  inputSchema: z.object({
    location: z.string().describe("City and state, e.g., San Francisco, CA"),
  }),
  run: async ({ location }) => `72°F and sunny in ${location}`,
});

const runner = client.beta.messages.toolRunner({
  model: "claude-opus-4-7",
  max_tokens: 64000,
  tools: [getWeather],
  messages: [
    { role: "user", content: "What's the weather in Paris and London?" },
  ],
  stream: true,
});

// Outer loop: each tool runner iteration
for await (const messageStream of runner) {
  // Inner loop: stream events for this iteration
  for await (const event of messageStream) {
    switch (event.type) {
      case "content_block_delta":
        switch (event.delta.type) {
          case "text_delta":
            process.stdout.write(event.delta.text);
            break;
          case "input_json_delta":
            // Tool input being streamed
            break;
        }
        break;
    }
  }
}
```

## Getting the Final Message
```typescript
const stream = client.messages.stream({
  model: "claude-opus-4-7",
  max_tokens: 64000,
  messages: [{ role: "user", content: "Hello" }],
});

for await (const event of stream) {
  // Process events...
}

const finalMessage = await stream.finalMessage();
console.log(`Tokens used: ${finalMessage.usage.output_tokens}`);
```

## Stream Event Types
| Event Type | Description | When it fires |
|---|---|---|
| `message_start` | Contains message metadata | Once at the beginning |
| `content_block_start` | New content block beginning | When a text/tool_use block starts |
| `content_block_delta` | Incremental content update | For each token/chunk |
| `content_block_stop` | Content block complete | When a block finishes |
| `message_delta` | Message-level updates | Contains `stop_reason`, usage |
| `message_stop` | Message complete | Once at the end |
## Best Practices

- **Always flush output** — use `process.stdout.write()` for immediate display
- **Handle partial responses** — if the stream is interrupted, you may have incomplete content
- **Track token usage** — the `message_delta` event contains usage information
- **Use `finalMessage()`** — get the complete `Anthropic.Message` object even when streaming. Don't wrap `.on()` events in `new Promise()` — `finalMessage()` handles all completion/error/abort states internally
- **Buffer for web UIs** — consider buffering a few tokens before rendering to avoid excessive DOM updates
- **Use `stream.on("text", ...)` for deltas** — the `text` event provides just the delta string, simpler than manually filtering `content_block_delta` events
- **For agentic loops with streaming** — see the Streaming Manual Loop section in tool-use.md for combining `stream()` + `finalMessage()` with a tool-use loop
## Raw SSE Format

If using raw HTTP (not the SDKs), the stream returns Server-Sent Events:

```
event: message_start
data: {"type":"message_start","message":{"id":"msg_...","type":"message",...}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":12}}

event: message_stop
data: {"type":"message_stop"}
```
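Parsing that wire format without an SDK takes only a few lines. A language-agnostic sketch, shown here in Python with only the standard library (`parse_sse_lines` is our name, and it assumes one `data:` line per event, as the Messages API emits):

```python
import json


def parse_sse_lines(lines):
    """Yield (event_type, payload) tuples from SSE 'event:'/'data:' line pairs.

    Minimal sketch: each 'event:' line names the next 'data:' line's type;
    the JSON payload is decoded for the caller. Blank lines separate events.
    """
    event_type = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("event: "):
            event_type = line[len("event: "):]
        elif line.startswith("data: ") and event_type:
            yield event_type, json.loads(line[len("data: "):])
            event_type = None
```

Feed it the decoded line iterator of a streaming HTTP response (e.g. `requests.post(..., stream=True).iter_lines(decode_unicode=True)`) and accumulate `text_delta` payloads.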
### typescript/claude-api/tool-use.md
[Download typescript/claude-api/tool-use.md](/skills/claude-api/typescript/claude-api/tool-use.md)
_Binary resource_
### typescript/managed-agents/README.md
[Download typescript/managed-agents/README.md](/skills/claude-api/typescript/managed-agents/README.md)
_Binary resource_
## See in GitHub
[See in GitHub](https://github.com/anthropics/skills/tree/main/claude-api)