Claude API Basics

Installation and Auth

Install the Python SDK with pip.
```
pip install anthropic
```
Install the TypeScript/Node SDK with npm.
```
npm install @anthropic-ai/sdk
```
Set your API key as an environment variable — never hard-code it.
```
export ANTHROPIC_API_KEY="sk-ant-..."
```

The SDK reads ANTHROPIC_API_KEY automatically; no manual config needed.

```
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY from the environment
```
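
A missing key may only surface when the client is constructed or the first request is made, depending on the SDK version. The sketch below checks for it at startup instead. `require_api_key` is a hypothetical helper name, not part of the SDK.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return ANTHROPIC_API_KEY, or raise a clear error if it is unset or empty.

    Hypothetical helper for illustration; not part of the anthropic SDK.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; export it before starting the app")
    return key
```

Calling `require_api_key()` once at startup turns a late authentication failure into an immediate, readable error.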

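Pass the key explicitly if you run multiple clients or load keys from a secrets manager.
```
# "sk-ant-..." is a placeholder; load the real value from your secrets manager at runtime
client = anthropic.Anthropic(api_key="sk-ant-...")
```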
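
For the multiple-client case, one possible arrangement is to build one client per named key. This is a sketch only: `build_clients` and the key names are hypothetical, and the client constructor is passed in so the helper does not depend on how the keys were loaded.

```python
from typing import Any, Callable, Dict, Mapping


def build_clients(keys: Mapping[str, str], make_client: Callable[..., Any]) -> Dict[str, Any]:
    """Build one client per named API key.

    `make_client` is expected to accept an `api_key` keyword argument,
    as anthropic.Anthropic does. Hypothetical helper, not part of the SDK.
    """
    return {name: make_client(api_key=key) for name, key in keys.items()}
```

With the SDK installed, usage would look like `clients = build_clients(loaded_keys, anthropic.Anthropic)`, where `loaded_keys` comes from your secrets manager.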

Your First Request

Call messages.create() with a model, max_tokens, and a messages array.

```
message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain async/await in one paragraph."}
    ]
)
```

In TypeScript, the call is identical — same parameters, same shape.

```
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const message = await client.messages.create({
    model: "claude-opus-4-8",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Explain async/await in one paragraph." }],
});
```

Add a system parameter to give Claude a persistent role or persona.

```
message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    system="You are a concise technical writer.",
    messages=[{"role": "user", "content": "Explain async/await."}]
)
```
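
When the system prompt is optional, it can help to assemble the request arguments in one place and leave `system` out entirely when it is not set. A minimal sketch, assuming a hypothetical helper named `build_request`:

```python
from typing import Any, Dict, List, Optional


def build_request(
    prompt: str,
    model: str,
    max_tokens: int = 1024,
    system: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for client.messages.create().

    Hypothetical helper; the `system` key is omitted when no system prompt is given.
    """
    messages: List[Dict[str, str]] = [{"role": "user", "content": prompt}]
    kwargs: Dict[str, Any] = {"model": model, "max_tokens": max_tokens, "messages": messages}
    if system is not None:
        kwargs["system"] = system
    return kwargs
```

The result is passed through with `client.messages.create(**build_request("Explain async/await.", "claude-opus-4-8", system="You are a concise technical writer."))`.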

Build multi-turn conversations by appending previous turns to messages.

```
messages = [
    {"role": "user", "content": "What is a closure?"},
    {"role": "assistant", "content": "A closure captures variables from its enclosing scope."},
    {"role": "user", "content": "Give me a JavaScript example."},
]
```
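
The append-and-resend pattern can be wrapped in a small class that owns the history. This is a sketch under stated assumptions: `Conversation` is a hypothetical helper, and it assumes the first content block of each reply is text.

```python
from typing import Any, Dict, List


class Conversation:
    """Keeps the running message history for a multi-turn chat (hypothetical helper)."""

    def __init__(self, client: Any, model: str, max_tokens: int = 1024) -> None:
        self.client = client
        self.model = model
        self.max_tokens = max_tokens
        self.messages: List[Dict[str, str]] = []

    def ask(self, text: str) -> str:
        """Send one user turn with the full history and record the reply."""
        self.messages.append({"role": "user", "content": text})
        reply = self.client.messages.create(
            model=self.model,
            max_tokens=self.max_tokens,
            messages=self.messages,
        )
        answer = reply.content[0].text  # assumes the first block is a text block
        self.messages.append({"role": "assistant", "content": answer})
        return answer
```

Each call to `ask()` resends every earlier turn, so input token usage grows with the length of the conversation.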

max_tokens caps the response length — set it high enough for your task.

```
# Short answers: 256–512  |  Long docs: 4096–8192  |  Max: 32000
max_tokens=1024
```

Reading the Response

Access the text response via message.content[0].text.
```
print(message.content[0].text)
```
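
Indexing `content[0]` assumes the first block is text, which may not hold when the response contains other block types such as tool calls. A more defensive sketch, using a hypothetical helper `text_of` that joins every text block:

```python
from typing import Any


def text_of(message: Any) -> str:
    """Concatenate the text of every text block in a response.

    Hypothetical helper; blocks whose type is not "text" are skipped.
    """
    return "".join(block.text for block in message.content if block.type == "text")
```

This returns an empty string, rather than raising, when the response carries no text at all.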

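Always check stop_reason before reading content, and handle every value you expect plus a fallback.

```
if message.stop_reason == "end_turn":
    print(message.content[0].text)
elif message.stop_reason == "max_tokens":
    print("Response was cut off; increase max_tokens")
elif message.stop_reason == "tool_use":
    print("Claude requested a tool call; inspect the tool_use blocks in message.content")
elif message.stop_reason == "refusal":
    print("Request declined by safety classifier")
else:
    print(f"Unhandled stop_reason: {message.stop_reason}")
```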

Common stop_reason values: end_turn (normal), max_tokens (truncated), stop_sequence (a custom stop sequence was hit), tool_use (tool called), refusal (safety).
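
When a response stops with max_tokens, one option is to retry with a larger budget. The sketch below doubles the budget up to a ceiling. The function name, the doubling policy, and the default ceiling are all assumptions for illustration, not SDK behaviour.

```python
from typing import Any, Dict, List


def create_with_growing_budget(
    client: Any,
    model: str,
    messages: List[Dict[str, str]],
    max_tokens: int = 1024,
    ceiling: int = 8192,
) -> Any:
    """Retry a truncated request with a doubled max_tokens, up to `ceiling`.

    Hypothetical helper. Returns the last response, which may still be
    truncated if the ceiling was reached.
    """
    while True:
        message = client.messages.create(
            model=model, max_tokens=max_tokens, messages=messages
        )
        if message.stop_reason != "max_tokens" or max_tokens >= ceiling:
            return message
        max_tokens = min(max_tokens * 2, ceiling)
```

Every retry is a full new request and is billed as one, so a generous initial budget is often cheaper than repeated retries.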

Check message.usage for input and output token counts, which determine what the request costs.

```
print(message.usage.input_tokens)   # tokens in your prompt
print(message.usage.output_tokens)  # tokens in the response
```
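
Token counts convert to an estimated cost once per-token prices are known. In the sketch below, `estimate_cost` is a hypothetical helper and the prices are parameters supplied by the caller. No real pricing is assumed.

```python
from typing import Any


def estimate_cost(
    usage: Any,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Estimate the cost of one request from its token counts.

    Prices are per million tokens and must come from the caller;
    this hypothetical helper does not know any real pricing.
    """
    input_cost = usage.input_tokens * input_price_per_mtok / 1_000_000
    output_cost = usage.output_tokens * output_price_per_mtok / 1_000_000
    return input_cost + output_cost
```

Usage would be `estimate_cost(message.usage, input_price, output_price)`, with both prices taken from the current pricing page for the model you call.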

The response model (message.model) confirms which model actually served the request.
```
print(message.model)  # e.g. "claude-opus-4-8"
```

Model Reference

Choose a model based on your speed, cost, and capability needs.

| Model | ID | Context | Best for |
|---|---|---|---|
| Claude Opus 4.8 | `claude-opus-4-8` | 1M tokens | Hard reasoning, agentic tasks |
| Claude Sonnet 4.6 | `claude-sonnet-4-6` | 1M tokens | Balanced speed and quality |
| Claude Haiku 4.5 | `claude-haiku-4-5` | 200K tokens | Fast, low-cost classification |
| Claude Fable 5 | `claude-fable-5` | 1M tokens | Most capable, longest tasks |

Default to claude-opus-4-8 for most new integrations.
Use claude-haiku-4-5 for high-volume, latency-sensitive tasks like tagging or routing.
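
The two rules above can be expressed as a small routing function. This is a sketch only: the task labels and the name `pick_model` are hypothetical, and the model IDs are the ones listed in the table.

```python
DEFAULT_MODEL = "claude-opus-4-8"
FAST_MODEL = "claude-haiku-4-5"

# Hypothetical task labels for high-volume, latency-sensitive work
FAST_TASKS = frozenset({"tagging", "routing"})


def pick_model(task: str) -> str:
    """Return the fast model for latency-sensitive tasks, else the default."""
    return FAST_MODEL if task in FAST_TASKS else DEFAULT_MODEL
```

Keeping the IDs in named constants also gives one place to change them when you move to a different model.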

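Retrieve live model metadata (context window, capabilities) from the Models API.

```
models = client.models.list()
for m in models.data:
    # id and display_name are always present; the context-window field name
    # (assumed here to be max_input_tokens) depends on your SDK version.
    print(m.id, m.display_name, getattr(m, "max_input_tokens", None))
```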
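
The same listing can be used to validate a configured model ID at startup. A minimal sketch, assuming a hypothetical helper `is_known_model`; it reads only the `data` of the first page returned by `client.models.list()`, so a long model list may need pagination.

```python
from typing import Any


def is_known_model(client: Any, model_id: str) -> bool:
    """Return True if `model_id` appears in the first page of the model list.

    Hypothetical helper; it does not follow pagination.
    """
    return any(m.id == model_id for m in client.models.list().data)
```

A failed check at startup is easier to diagnose than a rejected request in the middle of a job.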

Never append date suffixes to model IDs — use the IDs exactly as shown above.

Tip: Pin a specific model ID in production so a model update never silently changes behaviour.

Warning: max_tokens is required. In the Python SDK, omitting it raises a TypeError for the missing argument before any request is sent.