LLM Sampling
Handle server-initiated LLM sampling requests.
New in version: 2.0.0
MCP servers can request LLM completions from clients. The client handles these requests through a sampling handler callback.
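For reference, a sampling request typically originates from a tool on the server. Below is a minimal sketch of that server side, assuming a FastMCP version where Context.sample is available; the tool name and prompt are illustrative:

from fastmcp import FastMCP, Context

mcp = FastMCP("Summarizer")

@mcp.tool()
async def summarize(text: str, ctx: Context) -> str:
    # Ask the connected client's LLM for a completion; the client's
    # sampling_handler (shown below) services this request.
    result = await ctx.sample(
        f"Summarize the following text:\n\n{text}",
        system_prompt="You are a concise summarizer.",
        max_tokens=200,
    )
    return result.text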
Sampling Handler
Provide a sampling_handler function when creating the client:
from fastmcp import Client
from fastmcp.client.sampling import (
    SamplingMessage,
    SamplingParams,
    RequestContext,
)

async def sampling_handler(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext,
) -> str:
    # Your LLM integration logic here
    # Extract text from messages and generate a response
    return "Generated response based on the messages"

client = Client(
    "my_mcp_server.py",
    sampling_handler=sampling_handler,
)
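Nothing else is required on the client side: once connected, the handler runs automatically whenever the server issues a sampling request, for example while one of its tools is executing. A short usage sketch, where the tool name is illustrative and assumes the server-side example above:

import asyncio

async def main():
    async with client:
        # If the server's tool calls ctx.sample internally, this call
        # invokes sampling_handler behind the scenes.
        result = await client.call_tool("summarize", {"text": "FastMCP makes MCP easy."})
        print(result)

asyncio.run(main())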
Handler Parameters
The sampling handler receives three parameters:
As shown in the signature above, these are messages, params, and context. The params object (SamplingParams) carries the details of the server's request:
- messages: The messages to sample from.
- modelPreferences: The server's preferences for which model to select. The client MAY ignore these preferences.
- systemPrompt: An optional system prompt the server wants to use for sampling.
- includeContext: A request to include context from one or more MCP servers (including the caller), to be attached to the prompt.
- temperature: The sampling temperature.
- maxTokens: The maximum number of tokens to sample.
- stopSequences: The stop sequences to use for sampling.
- metadata: Optional metadata to pass through to the LLM provider.
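These map directly onto attributes of the params object (field names follow the MCP spec's camelCase), so a handler can read them before generating a response. A small sketch:

from fastmcp.client.sampling import SamplingMessage, SamplingParams, RequestContext

async def logging_sampling_handler(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext,
) -> str:
    # Inspect the server's preferences before generating a response
    print("system prompt:", params.systemPrompt)
    print("temperature:", params.temperature)
    print("max tokens:", params.maxTokens)
    print("stop sequences:", params.stopSequences)
    print("model preferences:", params.modelPreferences)
    return "placeholder response"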
Basic Example
from fastmcp import Client
from fastmcp.client.sampling import SamplingMessage, SamplingParams, RequestContext

async def basic_sampling_handler(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext,
) -> str:
    # Extract message content
    conversation = []
    for message in messages:
        content = message.content.text if hasattr(message.content, 'text') else str(message.content)
        conversation.append(f"{message.role}: {content}")

    # Use the system prompt if provided
    system_prompt = params.systemPrompt or "You are a helpful assistant."

    # Here you would integrate with your preferred LLM service
    # This is just a placeholder response
    return f"Response based on conversation: {' | '.join(conversation)}"

client = Client(
    "my_mcp_server.py",
    sampling_handler=basic_sampling_handler
)
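In a real application the placeholder return value would come from an LLM call. A minimal sketch, assuming the openai package is installed and OPENAI_API_KEY is set; any provider with an async client works the same way, and the model name is illustrative:

from openai import AsyncOpenAI

from fastmcp import Client
from fastmcp.client.sampling import SamplingMessage, SamplingParams, RequestContext

openai_client = AsyncOpenAI()

async def openai_sampling_handler(
    messages: list[SamplingMessage],
    params: SamplingParams,
    context: RequestContext,
) -> str:
    # Convert MCP sampling messages into chat-style messages.
    chat_messages = [
        {"role": "system", "content": params.systemPrompt or "You are a helpful assistant."}
    ]
    for message in messages:
        content = message.content.text if hasattr(message.content, "text") else str(message.content)
        chat_messages.append({"role": message.role, "content": content})

    # Forward the server's preferences so they are respected by the provider.
    response = await openai_client.chat.completions.create(
        model="gpt-4o-mini",  # example model; choose one available to your account
        messages=chat_messages,
        temperature=params.temperature,
        max_tokens=params.maxTokens,
        stop=params.stopSequences,
    )
    return response.choices[0].message.content or ""

client = Client("my_mcp_server.py", sampling_handler=openai_sampling_handler)

This sketch passes the server's systemPrompt, temperature, maxTokens, and stopSequences through to the provider, but as noted above the client is free to ignore any of these preferences.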