New in version 2.0.0: MCP servers can request LLM completions from clients. The client handles these requests through a sampling handler callback.

Sampling Handler

Provide a sampling_handler function when creating the client:
from fastmcp import Client
from fastmcp.client.sampling import (
    SamplingMessage,
    SamplingParams,
    RequestContext,
)

async def sampling_handler(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext
) -> str:
    # Your LLM integration logic here
    # Extract text from messages and generate a response
    return "Generated response based on the messages"

client = Client(
    "my_mcp_server.py",
    sampling_handler=sampling_handler,
)
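
For context, here is a minimal sketch of the server side that triggers this callback. A FastMCP tool can request a completion through ctx.sample(), and the connected client fulfills it via its sampling handler (the tool name and prompt here are illustrative):
from fastmcp import FastMCP, Context

mcp = FastMCP("my_mcp_server")

@mcp.tool
async def summarize(text: str, ctx: Context) -> str:
    # This sampling request is forwarded to the connected client,
    # which answers it through its sampling_handler.
    response = await ctx.sample(f"Summarize the following text:\n\n{text}")
    return response.text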

Handler Parameters

The sampling handler receives three parameters:

Sampling Handler Parameters

SamplingMessage
Sampling Message Object
SamplingParams
Sampling Parameters Object
RequestContext
Request Context Object
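
A handler typically inspects a few of these fields before calling its LLM. The sketch below reads commonly used SamplingParams attributes; the field names are assumed to follow the standard MCP CreateMessageRequestParams schema, and the print statement is purely illustrative:
from fastmcp.client.sampling import SamplingMessage, SamplingParams, RequestContext

async def inspecting_handler(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext,
) -> str:
    # Standard MCP sampling parameters (names per the MCP sampling spec)
    system_prompt = params.systemPrompt      # optional system prompt
    temperature = params.temperature         # optional sampling temperature
    max_tokens = params.maxTokens            # token budget requested by the server
    preferences = params.modelPreferences    # optional model hints and priorities

    # The context identifies the originating MCP request
    print(f"Sampling request {context.request_id}: {len(messages)} message(s)")

    return "..."  # call your LLM here using the fields above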

Basic Example

from fastmcp import Client
from fastmcp.client.sampling import SamplingMessage, SamplingParams, RequestContext

async def basic_sampling_handler(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext
) -> str:
    # Extract message content
    conversation = []
    for message in messages:
        content = message.content.text if hasattr(message.content, 'text') else str(message.content)
        conversation.append(f"{message.role}: {content}")

    # Use the system prompt if provided
    system_prompt = params.systemPrompt or "You are a helpful assistant."
    conversation.insert(0, f"system: {system_prompt}")

    # Here you would integrate with your preferred LLM service
    # This is just a placeholder response
    return f"Response based on conversation: {' | '.join(conversation)}"

client = Client(
    "my_mcp_server.py",
    sampling_handler=basic_sampling_handler
)
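
Once connected, sampling is transparent: calling any server tool that requests a completion routes the request back through your handler. A hypothetical usage sketch (the summarize tool is assumed to exist on the server):
import asyncio

async def main():
    async with client:
        # If "summarize" calls ctx.sample() on the server, that request
        # is answered by basic_sampling_handler above.
        result = await client.call_tool("summarize", {"text": "FastMCP in one paragraph."})
        print(result)

asyncio.run(main())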
If the client doesn’t provide a sampling handler, servers can optionally configure a fallback handler. See Server Sampling for details.

Sampling Capabilities

When you provide a sampling_handler, FastMCP automatically advertises full sampling capabilities to the server, including tool support. To disable tool support (for simpler handlers that don’t support tools), pass sampling_capabilities explicitly:
from mcp.types import SamplingCapability

client = Client(
    "my_mcp_server.py",
    sampling_handler=basic_sampling_handler,
    sampling_capabilities=SamplingCapability(),  # No tool support
)

Built-in Handlers

FastMCP provides built-in sampling handlers for the OpenAI and Anthropic APIs. These handlers support the full sampling API, including tool use, and handle message conversion and response formatting automatically.

OpenAI Handler

New in version 2.11.0: The OpenAI handler works with OpenAI’s API and any OpenAI-compatible provider:
from fastmcp import Client
from fastmcp.client.sampling.handlers.openai import OpenAISamplingHandler

client = Client(
    "my_mcp_server.py",
    sampling_handler=OpenAISamplingHandler(default_model="gpt-4o"),
)
For OpenAI-compatible APIs (like local models), pass a custom client:
from openai import AsyncOpenAI

client = Client(
    "my_mcp_server.py",
    sampling_handler=OpenAISamplingHandler(
        default_model="llama-3.1-70b",
        client=AsyncOpenAI(base_url="http://localhost:8000/v1"),
    ),
)
Install the OpenAI handler with pip install fastmcp[openai].

Anthropic Handler

New in version 2.14.1: The Anthropic handler uses Claude models via the Anthropic API:
from fastmcp import Client
from fastmcp.client.sampling.handlers.anthropic import AnthropicSamplingHandler

client = Client(
    "my_mcp_server.py",
    sampling_handler=AnthropicSamplingHandler(default_model="claude-sonnet-4-5"),
)
You can pass a custom client for advanced configuration:
from anthropic import AsyncAnthropic

client = Client(
    "my_mcp_server.py",
    sampling_handler=AnthropicSamplingHandler(
        default_model="claude-sonnet-4-5",
        client=AsyncAnthropic(),  # Uses ANTHROPIC_API_KEY env var
    ),
)
Install the Anthropic handler with pip install fastmcp[anthropic].

Tool Execution

Tool execution happens on the server side. The client’s role is to pass tools to the LLM and return the LLM’s response (which may include tool use requests). The server then executes the tools and may send follow-up sampling requests with tool results.
To implement a custom sampling handler, see the handler source code as a reference.