New in version: 2.0.0
LLM sampling allows MCP tools to request LLM text generation based on provided messages. By default, sampling requests are sent to the client’s LLM, but you can also configure a fallback handler or always use a specific LLM provider. This is useful when tools need to leverage LLM capabilities to process data, generate responses, or perform text-based analysis.
Why Use LLM Sampling?
LLM sampling enables tools to:
- Leverage AI capabilities: Use the client’s LLM for text generation and analysis
- Offload complex reasoning: Let the LLM handle tasks requiring natural language understanding
- Generate dynamic content: Create responses, summaries, or transformations based on data
- Maintain context: Use the same LLM instance that the user is already interacting with
Basic Usage
Use `ctx.sample()` to request text generation from the client’s LLM:
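Here’s a minimal sketch (the server name, tool, and prompt are illustrative):

```python
from fastmcp import FastMCP, Context

mcp = FastMCP("SamplingDemo")

@mcp.tool
async def summarize(text: str, ctx: Context) -> str:
    """Summarize text using the client's LLM."""
    response = await ctx.sample(f"Summarize in one sentence:\n\n{text}")
    # The result is a content block; for text generation, read .text
    return response.text
```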
Method Signature
Context Sampling Method
Request text generation from the client’s LLM
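As a sketch, the method accepts roughly the following parameters; exact types and defaults may vary between releases:

```python
async def sample(
    self,
    messages: str | list[str | SamplingMessage],  # prompt(s) to send
    system_prompt: str | None = None,             # optional system instruction
    temperature: float | None = None,             # sampling temperature
    max_tokens: int | None = None,                # cap on generated tokens
    model_preferences: ModelPreferences | str | list[str] | None = None,
) -> TextContent | ImageContent:
    ...
```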
Simple Text Generation
Basic Prompting
Generate text with simple string prompts:
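For example (reusing the `mcp` server and `Context` import from the sketch above):

```python
@mcp.tool
async def explain(topic: str, ctx: Context) -> str:
    # A plain string is sent as a single user message
    response = await ctx.sample(f"Explain {topic} in one short paragraph.")
    return response.text
```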
System Prompt
Use system prompts to guide the LLM’s behavior:
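A sketch (the persona and temperature are illustrative choices):

```python
@mcp.tool
async def review_code(code: str, ctx: Context) -> str:
    response = await ctx.sample(
        f"Review this code and point out bugs:\n\n{code}",
        system_prompt="You are a senior engineer. Be concise and specific.",
        temperature=0.2,  # lower temperature for a more deterministic review
    )
    return response.text
```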
Model Preferences
Specify model preferences for different use cases:
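For example, passing a model hint (the hint string is illustrative, and clients are free to ignore hints):

```python
@mcp.tool
async def classify_sentiment(text: str, ctx: Context) -> str:
    response = await ctx.sample(
        f"Classify the sentiment as positive, negative, or neutral: {text}",
        model_preferences="claude-3-haiku",  # hint toward a fast, cheap model
        max_tokens=10,
    )
    return response.text
```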
Complex Message Structures
Use structured messages for more complex interactions:
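A sketch of a multi-turn request, assuming the `SamplingMessage` and `TextContent` types from the MCP SDK’s `mcp.types` module:

```python
from mcp.types import SamplingMessage, TextContent

@mcp.tool
async def follow_up(ctx: Context) -> str:
    # Prior conversation turns are passed explicitly, with roles
    messages = [
        SamplingMessage(
            role="user",
            content=TextContent(type="text", text="What is the capital of France?"),
        ),
        SamplingMessage(
            role="assistant",
            content=TextContent(type="text", text="The capital of France is Paris."),
        ),
        SamplingMessage(
            role="user",
            content=TextContent(type="text", text="What is its population?"),
        ),
    ]
    response = await ctx.sample(messages)
    return response.text
```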
Sampling Fallback Handler
Client support for sampling is optional. If the client does not support sampling, the server reports an error by default. However, you can provide a `sampling_handler` to the FastMCP server, which sends sampling requests directly to an LLM provider instead of routing them through the client. The `sampling_handler_behavior` parameter controls when this handler is used:
"fallback"(default): Uses the handler only when the client doesn’t support sampling. Requests go to the client first, falling back to the handler if needed."always": Always uses the handler, bypassing the client entirely. Useful when you want full control over the LLM used for sampling.
Fallback Mode (Default)
Uses the handler only when the client doesn’t support sampling:
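A sketch using an OpenAI-backed handler; the `OpenAISamplingHandler` import path reflects recent FastMCP releases and should be treated as an assumption, and `AsyncOpenAI()` reads `OPENAI_API_KEY` from the environment:

```python
from fastmcp import FastMCP
from fastmcp.experimental.sampling.handlers.openai import OpenAISamplingHandler
from openai import AsyncOpenAI

mcp = FastMCP(
    name="FallbackServer",
    sampling_handler=OpenAISamplingHandler(
        default_model="gpt-4o-mini",
        client=AsyncOpenAI(),
    ),
    sampling_handler_behavior="fallback",  # the default; shown for clarity
)
```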
Always Mode
Always uses the handler, bypassing the client:
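Continuing the sketch above, only the behavior flag changes:

```python
mcp = FastMCP(
    name="AlwaysServer",
    sampling_handler=OpenAISamplingHandler(
        default_model="gpt-4o-mini",
        client=AsyncOpenAI(),
    ),
    sampling_handler_behavior="always",  # never routes sampling to the client
)
```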
Client Requirements
By default, LLM sampling requires client support:
- Clients must implement sampling handlers to process requests (see Client Sampling)
- If the client doesn’t support sampling and no fallback handler is configured, `ctx.sample()` will raise an error
- Configure a `sampling_handler` with `sampling_handler_behavior="fallback"` to automatically handle clients that don’t support sampling
- Use `sampling_handler_behavior="always"` to completely bypass the client and control which LLM is used

