Technology · Claude API

Beginner

Claude API Basics

A quick reference for making your first Claude API call, picking a model, and reading the response.

TL;DR
  1. 01Install the Anthropic SDK and set your ANTHROPIC_API_KEY env var.
  2. 02Send messages to claude-opus-4-8 using client.messages.create().
  3. 03Check stop_reason before reading content to handle refusals safely.

Installation and Auth

  • Install the Python SDK with pip.

    pip install anthropic
    
  • Install the TypeScript/Node SDK with npm.

    npm install @anthropic-ai/sdk
    
  • Set your API key as an environment variable — never hard-code it.

    export ANTHROPIC_API_KEY="sk-ant-..."
    
  • The SDK reads ANTHROPIC_API_KEY automatically; no manual config needed.

    import anthropic
    client = anthropic.Anthropic()  # picks up env var
    
  • Pass the key explicitly if you manage multiple clients or secrets managers.

    client = anthropic.Anthropic(api_key="sk-ant-...")
    

Your First Request

  • Call messages.create() with a model, max_tokens, and a messages array.

    message = client.messages.create(
        model="claude-opus-4-8",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": "Explain async/await in one paragraph."}
        ]
    )
    
  • In TypeScript, the call is identical — same parameters, same shape.

    import Anthropic from "@anthropic-ai/sdk";
    const client = new Anthropic();
    const message = await client.messages.create({
        model: "claude-opus-4-8",
        max_tokens: 1024,
        messages: [{ role: "user", content: "Explain async/await in one paragraph." }],
    });
    
  • Add a system parameter to give Claude a persistent role or persona.

    message = client.messages.create(
        model="claude-opus-4-8",
        max_tokens=1024,
        system="You are a concise technical writer.",
        messages=[{"role": "user", "content": "Explain async/await."}]
    )
    
  • Build multi-turn conversations by appending previous turns to messages.

    messages = [
        {"role": "user", "content": "What is a closure?"},
        {"role": "assistant", "content": "A closure captures variables from its enclosing scope."},
        {"role": "user", "content": "Give me a JavaScript example."},
    ]
    
  • max_tokens caps the response length — set it high enough for your task.

    # Short answers: 256–512  |  Long docs: 4096–8192  |  Max: 32000
    max_tokens=1024
    

Reading the Response

  • Access the text response via message.content[0].text.

    print(message.content[0].text)
    
  • Always check stop_reason before reading content — handle all cases.

    if message.stop_reason == "end_turn":
        print(message.content[0].text)
    elif message.stop_reason == "max_tokens":
        print("Response was cut off — increase max_tokens")
    elif message.stop_reason == "refusal":
        print("Request declined by safety classifier")
    
  • stop_reason values: end_turn (normal), max_tokens (truncated), tool_use (tool called), refusal (safety).

  • Check message.usage to track input and output token costs.

    print(message.usage.input_tokens)   # tokens in your prompt
    print(message.usage.output_tokens)  # tokens in the response
    
  • The response model (message.model) confirms which model actually served the request.

    print(message.model)  # e.g. "claude-opus-4-8"
    

Model Reference

  • Choose a model based on your speed, cost, and capability needs.

    Model ID Context Best for
    Claude Opus 4.8 claude-opus-4-8 1M tokens Hard reasoning, agentic tasks
    Claude Sonnet 4.6 claude-sonnet-4-6 1M tokens Balanced speed and quality
    Claude Haiku 4.5 claude-haiku-4-5 200K tokens Fast, low-cost classification
    Claude Fable 5 claude-fable-5 1M tokens Most capable, longest tasks
  • Default to claude-opus-4-8 for most new integrations.

  • Use claude-haiku-4-5 for high-volume, latency-sensitive tasks like tagging or routing.

  • Retrieve live model metadata — context window, capabilities — from the Models API.

    models = client.models.list()
    for m in models.data:
        print(m.id, m.context_window)
    
  • Never append date suffixes to model IDs — use the IDs exactly as shown above.

Tip: Pin a specific model ID in production so a model update never silently changes behaviour.

Warning: max_tokens is required — omitting it raises a validation error before the request is sent.

Claude Prompt Caching