Request
This endpoint expects an object.
background: boolean or null (Optional)
cache_control: object (Optional)
Enable automatic prompt caching. When set at the top level, the system automatically applies cache breakpoints to the last cacheable block in the request. Currently supported for Anthropic Claude models.
frequency_penalty: double or null (Optional)
image_config: string or double or list of any (Optional)
include: list of enums or null (Optional)
input: string or list of objects (Optional)
Input for a response request. Can be a string or an array of items.
instructions: string or null (Optional)
max_output_tokens: integer or null (Optional)
max_tool_calls: integer or null (Optional)
metadata: map from strings to strings (Optional)
Metadata key-value pairs for the request. Keys must be ≤64 characters and cannot contain brackets. Values must be ≤512 characters. Maximum 16 pairs allowed.
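The metadata limits above can be checked on the client before sending a request. The following is a minimal sketch, not the API's own validation: the function name validate_metadata is hypothetical, "brackets" is assumed to mean square brackets, and lengths are counted in Python characters.

```python
from typing import Dict, List


def validate_metadata(metadata: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the map passes these checks."""
    problems = []
    # At most 16 key-value pairs are allowed.
    if len(metadata) > 16:
        problems.append(f"too many pairs: {len(metadata)} > 16")
    for key, value in metadata.items():
        # Keys must be at most 64 characters.
        if len(key) > 64:
            problems.append(f"key longer than 64 characters: {key[:16]}...")
        # Assumption: "brackets" means the characters "[" and "]".
        if "[" in key or "]" in key:
            problems.append(f"key contains brackets: {key}")
        # Values must be at most 512 characters.
        if len(value) > 512:
            problems.append(f"value longer than 512 characters for key: {key[:16]}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violation at once.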
modalities: list of enums (Optional)
Output modalities for the response. Supported values are "text" and "image".
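As a small illustration, a client could filter a requested modalities list against the two supported values before sending. This is a sketch only; the helper name unsupported_modalities is hypothetical and not part of the API.

```python
from typing import Iterable, List

# The two values the reference lists as supported.
SUPPORTED_MODALITIES = ("text", "image")


def unsupported_modalities(modalities: Iterable[str]) -> List[str]:
    """Return the requested modalities that are not in the supported set."""
    return [m for m in modalities if m not in SUPPORTED_MODALITIES]
```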
models: list of strings (Optional)
parallel_tool_calls: boolean or null (Optional)
plugins: list of objects (Optional)
Plugins you want to enable for this request, including their settings.
presence_penalty: double or null (Optional)
previous_response_id: string or null (Optional)
prompt_cache_key: string or null (Optional)
provider: object (Optional)
When multiple model providers are available, optionally indicate your routing preference.
reasoning: object (Optional)
Configuration for reasoning mode in the response.
safety_identifier: string or null (Optional)
service_tier: enum or null (Optional, defaults to auto)
session_id: string (Optional, <=256 characters)
A unique identifier for grouping related requests (e.g., a conversation or agent workflow). When provided, OpenRouter uses it as the sticky routing key, routing all requests in the session to the same provider to maximize prompt cache hits. Also used for observability grouping. If provided in both the request body and the x-session-id header, the body value takes precedence. Maximum of 256 characters.
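The precedence rule above (body value wins over the x-session-id header) can be mirrored on the client to predict which identifier will be used. This is a minimal sketch of the documented rule, not the server's implementation. The function name resolve_session_id is hypothetical, and the case-insensitive header lookup is an assumption based on general HTTP behavior.

```python
from typing import Mapping, Optional


def resolve_session_id(body: Mapping[str, object],
                       headers: Mapping[str, str]) -> Optional[str]:
    """Pick the effective session ID: the request body wins over the header."""
    chosen = body.get("session_id")
    if chosen is None:
        # Assumption: header names are matched case-insensitively.
        lowered = {name.lower(): value for name, value in headers.items()}
        chosen = lowered.get("x-session-id")
    if chosen is None:
        return None
    chosen = str(chosen)
    # The reference caps session_id at 256 characters.
    if len(chosen) > 256:
        raise ValueError("session_id exceeds 256 characters")
    return chosen
```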
stop_server_tools_when: list of objects (Optional)
Stop conditions for the server-tool agent loop. Any condition firing halts the loop (OR logic). When set, this overrides max_tool_calls.
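The OR logic and the override of max_tool_calls can be sketched as below. The reference does not show the shape of a stop-condition object, so this models each condition as a hypothetical predicate over a hypothetical LoopState. It illustrates the described semantics only, not the server's code, and it assumes that "set" means the list is not None.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class LoopState:
    """Hypothetical snapshot of the agent loop, for illustration only."""
    tool_calls: int
    last_tool: Optional[str] = None


# Hypothetical stand-in for one stop-condition object.
Condition = Callable[[LoopState], bool]


def should_stop(state: LoopState,
                stop_conditions: Optional[Sequence[Condition]],
                max_tool_calls: Optional[int]) -> bool:
    """Decide whether the loop halts after the current step."""
    if stop_conditions is not None:
        # When stop conditions are set, max_tool_calls is ignored.
        # Any single condition firing halts the loop (OR logic).
        return any(condition(state) for condition in stop_conditions)
    if max_tool_calls is not None:
        return state.tool_calls >= max_tool_calls
    return False
```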
stream: boolean (Optional, defaults to false)
temperature: double or null (Optional)
text: object (Optional)
Text output configuration, including format and verbosity.
tool_choice: enum or object (Optional)
tools: list of objects (Optional)
top_logprobs: integer or null (Optional)
top_p: double or null (Optional)
trace: object (Optional)
Metadata for observability and tracing. Known keys (trace_id, trace_name, span_name, generation_name, parent_span_id) have special handling. Additional keys are passed through as custom metadata to configured broadcast destinations.
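To see which entries of a trace object fall under the known keys and which would be passed through as custom metadata, the split can be sketched as follows. The helper name split_trace is hypothetical; only the five key names come from the reference.

```python
from typing import Any, Dict, Tuple

# Keys the reference lists as having special handling.
KNOWN_TRACE_KEYS = frozenset(
    {"trace_id", "trace_name", "span_name", "generation_name", "parent_span_id"}
)


def split_trace(trace: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate known trace keys from custom pass-through metadata."""
    known = {k: v for k, v in trace.items() if k in KNOWN_TRACE_KEYS}
    custom = {k: v for k, v in trace.items() if k not in KNOWN_TRACE_KEYS}
    return known, custom
```

For example, split_trace({"trace_id": "t-1", "tenant": "acme"}) places trace_id in the first dictionary and tenant in the second.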
user: string (Optional, <=256 characters)
A unique identifier representing your end-user, which helps distinguish between different users of your app. This allows your app to identify specific users in case of abuse reports, preventing your entire app from being affected by the actions of individual users. Maximum of 256 characters.
Response
Successful response
completed_at: integer or null
error: object
Error information returned from the API.
frequency_penalty: double or null
instructions: string or list of objects or any
metadata: map from strings to strings
Metadata key-value pairs for the request. Keys must be ≤64 characters and cannot contain brackets. Values must be ≤512 characters. Maximum 16 pairs allowed.
parallel_tool_calls: boolean
presence_penalty: double or null
temperature: double or null
tool_choice: enum or object
background: boolean or null
max_output_tokens: integer or null
max_tool_calls: integer or null
previous_response_id: string or null
prompt_cache_key: string or null
safety_identifier: string or null
text: object
Text output configuration, including format and verbosity.
usage: object
Token usage information for the response.
openrouter_metadata: object