How to Use LLMs from Different LLM Providers#
Agent Spec supports several LLM providers. LlmConfig can be used directly with the api_provider
field to describe any provider, or you can use a dedicated subclass for provider-specific configuration.
The available LLM configurations are:
- LlmConfig (generic, provider-agnostic)
- OciGenAiConfig
- OpenAiConfig
- GeminiConfig
- OpenAiCompatibleConfig
- VllmConfig
- OllamaConfig
Each configuration is specified directly through its class constructor. This guide shows how to configure LLMs from different providers, with examples and usage notes.
Configure retry behavior for remote LLM calls#
All LlmConfig subclasses accept an optional retry_policy parameter.
Use it to configure retry attempts, per-request timeouts, and backoff behavior
for transient failures when calling remote LLM endpoints.
For example, you can attach a retry policy directly to a VllmConfig:
from pyagentspec import RetryPolicy
from pyagentspec.llms import LlmGenerationConfig, VllmConfig

retry_policy = RetryPolicy(
    max_attempts=4,
    request_timeout=30.0,
    initial_retry_delay=0.5,
    max_retry_delay=8.0,
)

llm_config_with_retry_policy = VllmConfig(
    name="vllm-llama-4-maverick-with-retries",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=LlmGenerationConfig(
        max_tokens=512, temperature=1.0, top_p=1.0
    ),
    retry_policy=retry_policy,
)
API Reference: LlmConfig, RetryPolicy
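To get a feel for how the retry values above interact, the following sketch computes a possible wait schedule. It assumes plain exponential backoff that doubles after each failed attempt and is capped at max_retry_delay; the doubling factor, the absence of jitter, and the helper names backoff_delays and worst_case_seconds are assumptions made for illustration, not the documented behavior of RetryPolicy or of any runtime.

```python
def backoff_delays(max_attempts, initial_retry_delay, max_retry_delay, multiplier=2.0):
    """Return the wait (in seconds) before each retry, capped at max_retry_delay."""
    delays = []
    delay = initial_retry_delay
    # The first attempt is not a retry, so there are max_attempts - 1 waits.
    for _ in range(max(max_attempts - 1, 0)):
        delays.append(min(delay, max_retry_delay))
        delay *= multiplier  # assumed growth factor, not taken from the library
    return delays


def worst_case_seconds(max_attempts, request_timeout, delays):
    """Upper bound if every attempt runs into the per-request timeout."""
    return max_attempts * request_timeout + sum(delays)


delays = backoff_delays(max_attempts=4, initial_retry_delay=0.5, max_retry_delay=8.0)
print(delays)                                # [0.5, 1.0, 2.0]
print(worst_case_seconds(4, 30.0, delays))   # 123.5
```

Under these assumptions, a call configured as above could block for roughly two minutes before failing, which is worth keeping in mind when choosing request_timeout.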
LlmConfig (Generic)#
LlmConfig can be used directly to describe any LLM without requiring a provider-specific subclass.
This is useful when you want to describe an LLM from a provider that does not have a dedicated configuration class,
or when you want a simple, portable configuration.
Parameters
- model_id: str#
Identifier of the model to use, as expected by the selected API provider.
- provider: str, null#
The model provider, i.e. who made the model (e.g. "openai", "meta", "anthropic", "cohere").
- api_provider: str, null#
The API provider, i.e. who serves the API (e.g. "openai", "oci", "vllm", "ollama", "aws_bedrock", "vertex_ai").
- api_type: str, null#
The API format to use to interact with the LLM (e.g. "chat_completions", "responses").
- url: str, null#
URL of the API endpoint (e.g. "https://api.openai.com/v1").
- api_key: str, null#
An optional API key for the remote LLM. When exported, the value is replaced by a reference.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
Examples
from pyagentspec.llms import LlmConfig
from pyagentspec.llms import LlmGenerationConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7)

llm = LlmConfig(
    name="openai-gpt4o",
    model_id="gpt-4o",
    provider="openai",
    api_provider="openai",
    api_type="chat_completions",
    default_generation_parameters=generation_config,
)
OciGenAiConfig#
OciGenAiConfig describes a model served by OCI Generative AI.
Parameters
- model_id: str#
Name of the model to use. A list of the available models is given in Oracle OCI Documentation under the Model Retirement Dates (On-Demand Mode) section.
- compartment_id: str#
The OCID (Oracle Cloud Identifier) of a compartment within your tenancy.
- serving_mode: str#
How the specified model is served:
ON_DEMAND: the model is hosted in a shared environment; DEDICATED: the model is deployed in a customer-dedicated environment.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
Example:
default_generation_parameters = LlmGenerationConfig(max_tokens=256, temperature=0.8)
- client_config: OciClientConfig, null#
OCI client configuration used to authenticate with the OCI service. See the examples below for usage and more information.
OCI Client Configuration#
OCI GenAI models require a client configuration that contains all the settings needed to perform
the authentication to use OCI services. The OciClientConfig holds these settings.
Parameters
- service_endpoint: str#
The endpoint URL for the OCI GenAI service. Make sure the region is set correctly: the region in which your private key was created must match the region in the service_endpoint.
- auth_type: str#
The authentication type to use, e.g., API_KEY, SECURITY_TOKEN, INSTANCE_PRINCIPAL (which means that the code must be executed from a compartment enabled for OCI GenAI), or RESOURCE_PRINCIPAL.
Depending on the type of authentication you want to adopt, different specializations of OciClientConfig are defined. The OciClientConfig component itself is abstract and should not be used directly.
The following sections show which client extensions are available and their specific parameters.
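As a rough illustration of keeping the region and the service endpoint aligned, the sketch below reads the region from OCI config file text and builds the inference endpoint from it. The helper names (region_from_oci_config, genai_endpoint), the sample config text, and the auth-type lookup table are hypothetical; only the class names and the endpoint pattern come from this guide.

```python
import configparser

# Class names are the ones listed in this guide; the lookup table itself is illustrative.
CLIENT_CONFIG_BY_AUTH_TYPE = {
    "API_KEY": "OciClientConfigWithApiKey",
    "SECURITY_TOKEN": "OciClientConfigWithSecurityToken",
    "INSTANCE_PRINCIPAL": "OciClientConfigWithInstancePrincipal",
    "RESOURCE_PRINCIPAL": "OciClientConfigWithResourcePrincipal",
}


def region_from_oci_config(config_text, profile="DEFAULT"):
    """Read the region of a profile from the text of an OCI config file (INI format)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return parser[profile]["region"]


def genai_endpoint(region):
    """Build the OCI GenAI inference endpoint for a region."""
    return f"https://inference.generativeai.{region}.oci.oraclecloud.com"


# Made-up sample content; a real file usually lives at ~/.oci/config.
SAMPLE_CONFIG = """
[DEFAULT]
user=ocid1.user.oc1..example
region=us-chicago-1
"""

region = region_from_oci_config(SAMPLE_CONFIG)
print(genai_endpoint(region))
print(CLIENT_CONFIG_BY_AUTH_TYPE["API_KEY"])
```

Deriving the endpoint from the same profile that holds the key material is one way to avoid a region mismatch between the two.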
Examples
from pyagentspec.llms import OciGenAiConfig
from pyagentspec.llms import LlmGenerationConfig
from pyagentspec.llms.ociclientconfig import OciClientConfigWithApiKey

# Get the list of available models from:
# https://docs.oracle.com/en-us/iaas/Content/generative-ai/deprecating.htm#
# under the "Model Retirement Dates (On-Demand Mode)" section.
OCIGENAI_MODEL_ID = "xai.grok-3"
# Typical service endpoint for OCI GenAI service inference
# <oci region> can be "us-chicago-1" and can also be found in your ~/.oci/config file
OCIGENAI_ENDPOINT = "https://inference.generativeai.<oci region>.oci.oraclecloud.com"
# <compartment_id> can be obtained from your personal OCI account (not the key config file).
# Please find it under "Identity > Compartments" on the OCI console website after logging in to your user account.
COMPARTMENT_ID = "ocid1.compartment.oc1..<compartment_id>"

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.8)

llm = OciGenAiConfig(
    name="oci-genai-grok3",
    model_id=OCIGENAI_MODEL_ID,
    compartment_id=COMPARTMENT_ID,
    client_config=OciClientConfigWithApiKey(
        name="client_config",
        service_endpoint=OCIGENAI_ENDPOINT,
        auth_file_location="~/.oci/config",
        auth_profile="DEFAULT",
    ),
    default_generation_parameters=generation_config,
)
OciClientConfigWithSecurityToken#
Client configuration to use for authentication through a security token.
Parameters
- auth_file_location: str#
The location of the authentication file from which the authentication information should be retrieved. The default location is ~/.oci/config.
- auth_profile: str#
The name of the profile to use, among the ones defined in the authentication file. The default profile name is DEFAULT.
OciClientConfigWithApiKey#
Client configuration to use for authentication with an API key.
The required parameters are the same as those defined for OciClientConfigWithSecurityToken.
OciClientConfigWithInstancePrincipal#
Client configuration to use for instance principal authentication. No additional parameters are required.
OciClientConfigWithResourcePrincipal#
Client configuration to use for resource principal authentication. No additional parameters are required.
OpenAiConfig#
OpenAI Models are powered by OpenAI.
You can refer to one of those models by using the OpenAiConfig Component.
Parameters
- model_id: str#
Name of the model to use.
- api_type: str#
The API type that should be used. Can be either chat_completions or responses.
- api_key: str, null#
An optional API key for authentication with the OpenAI endpoint.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
Important
Ensure that the OPENAI_API_KEY environment variable is set beforehand
to access this model. A list of available OpenAI models can be found at
the following link: OpenAI Models.
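A small pre-flight check can surface a missing key before any request is sent. The sketch below uses only the standard library; the helper name require_env and its error message are hypothetical and not part of pyagentspec.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or fail with a clear message."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before creating the LLM config")
    return value


try:
    require_env("OPENAI_API_KEY")
    print("OPENAI_API_KEY is available")
except RuntimeError as error:
    print(error)
```

The same check could be reused for other keys mentioned in this guide, such as GEMINI_API_KEY.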
Examples
from pyagentspec.llms import LlmGenerationConfig, OpenAiConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7, top_p=0.9)

llm = OpenAiConfig(
    name="openai-gpt-5",
    model_id="gpt-5",
    default_generation_parameters=generation_config,
    api_key="optional_api_key",
)
GeminiConfig#
Gemini models can be configured through GeminiConfig.
Agent Spec supports both Google AI Studio and Google Vertex AI authentication modes.
Gemini authentication is modeled as a nested auth component, similar to OCI client_config.
The auth component itself remains inline during serialization. When api_key or
credentials is provided explicitly, only that sensitive field is externalized and must
be supplied through components_registry when loading the configuration back.
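The idea of keeping a component inline while externalizing only its sensitive field can be pictured with plain dictionaries. The sketch below is a conceptual illustration only: the "$ref" placeholder, the dotted registry keys, and both helper functions are hypothetical and do not reproduce the serialization format or the components_registry API of pyagentspec.

```python
SENSITIVE_FIELDS = {"api_key", "credentials"}


def externalize_sensitive(component, registry=None, prefix=""):
    """Return (exported_copy, registry); secrets are replaced by reference placeholders."""
    registry = {} if registry is None else registry
    exported = {}
    for key, value in component.items():
        path = f"{prefix}{key}"
        if key in SENSITIVE_FIELDS and value is not None:
            registry[path] = value
            exported[key] = {"$ref": path}  # hypothetical reference format
        elif isinstance(value, dict):
            # Nested components stay inline; only their secrets are pulled out.
            exported[key], _ = externalize_sensitive(value, registry, f"{path}.")
        else:
            exported[key] = value
    return exported, registry


def restore_sensitive(exported, registry):
    """Rebuild the original dictionary by resolving placeholders from the registry."""
    restored = {}
    for key, value in exported.items():
        if isinstance(value, dict) and set(value) == {"$ref"}:
            restored[key] = registry[value["$ref"]]
        elif isinstance(value, dict):
            restored[key] = restore_sensitive(value, registry)
        else:
            restored[key] = value
    return restored


config = {
    "name": "gemini-aistudio-flash",
    "model_id": "gemini-2.5-flash",
    "auth": {"name": "gemini-aistudio-auth", "api_key": "secret-value"},
}
exported, registry = externalize_sensitive(config)
print(exported["auth"])
print(restore_sensitive(exported, registry) == config)
```

In this picture the exported dictionary is safe to store or share, while the registry has to be provided separately when loading.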
Parameters
- model_id: str#
Name of the model to use, for example gemini-2.5-flash or gemini-2.0-flash-lite.
- auth: GeminiAuthConfig#
Required authentication component for Gemini. As with other Agent Spec components, auth configs need a name. Use GeminiAIStudioAuthConfig(name="gemini-aistudio-auth") if you want runtimes to load GEMINI_API_KEY from the environment, or GeminiVertexAIAuthConfig(name="gemini-vertex-auth", ...) for Vertex AI. The auth component remains inline when serialized. If api_key or credentials is set explicitly, only that sensitive field is serialized as a reference.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
Google AI Studio authentication#
Use GeminiAIStudioAuthConfig when connecting through Google AI Studio.
Parameters
- api_key: str, null#
Optional Gemini API key. If omitted, runtimes may load it from GEMINI_API_KEY. If provided explicitly, only the api_key field is externalized during serialization and must be supplied separately when deserializing.
Example
from pyagentspec.llms import GeminiConfig, LlmGenerationConfig
from pyagentspec.llms.geminiauthconfig import GeminiAIStudioAuthConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7, top_p=0.9)

llm = GeminiConfig(
    name="gemini-aistudio-flash",
    model_id="gemini-2.5-flash",
    auth=GeminiAIStudioAuthConfig(
        name="gemini-aistudio-auth"
        # Optional: if api_key is omitted, runtimes may load GEMINI_API_KEY from the environment.
    ),
    default_generation_parameters=generation_config,
)
Vertex AI authentication#
Use GeminiVertexAIAuthConfig when connecting through Google Vertex AI.
Parameters
- project_id: str, null#
Optional Google Cloud project identifier. In practice, you may still need to set this explicitly when ADC provides credentials but does not expose a default project.
- location: str#
Vertex AI location or region. Defaults to global.
- credentials: str | dict, null#
Optional local file path (str) to a Google Cloud JSON credential file, such as a service-account key file, or an inline dict containing the parsed JSON contents of that file. When omitted, runtimes may rely on Google Application Default Credentials (ADC), such as GOOGLE_APPLICATION_CREDENTIALS, credentials made available through the local Google Cloud environment, or an attached service account. See Google Cloud authentication docs for details. This does not guarantee that project_id can also be inferred automatically. If provided explicitly, only the credentials field is externalized during serialization. Non-secret auth settings such as project_id and location remain inline in the main config.
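Because credentials accepts either a file path or an already parsed dictionary, code that consumes it has to handle both forms. The sketch below shows one way a runtime could normalize the value; load_credentials is a hypothetical helper written for illustration and is not part of pyagentspec or of any Google library.

```python
import json
from pathlib import Path


def load_credentials(credentials):
    """Normalize a credentials value (None, dict, or path string) to a dict or None."""
    if credentials is None:
        # Nothing explicit was given: the caller would fall back to ADC.
        return None
    if isinstance(credentials, dict):
        # Inline form: already the parsed JSON contents of the credential file.
        return dict(credentials)
    if isinstance(credentials, str):
        # Path form: read and parse the JSON credential file.
        path = Path(credentials).expanduser()
        with path.open(encoding="utf-8") as handle:
            return json.load(handle)
    raise TypeError(f"Unsupported credentials type: {type(credentials).__name__}")


print(load_credentials(None))
print(load_credentials({"type": "service_account", "project_id": "my-gcp-project"}))
```

Returning None for the omitted case keeps the ADC fallback explicit instead of hiding it behind an empty dictionary.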
Example
from pyagentspec.llms import GeminiConfig, LlmGenerationConfig
from pyagentspec.llms.geminiauthconfig import GeminiVertexAIAuthConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.4, top_p=0.95)

llm = GeminiConfig(
    name="gemini-vertex-flash",
    model_id="gemini-2.0-flash-lite",
    auth=GeminiVertexAIAuthConfig(
        name="gemini-vertex-auth",
        # Often still required even when ADC supplies the credentials.
        project_id="my-gcp-project",
        location="global",
        # Optional: explicit credentials can be provided when ADC is not available.
    ),
    default_generation_parameters=generation_config,
)
OpenAiCompatibleConfig#
OpenAI-compatible LLMs are models served through an OpenAI-compatible API, either responses or chat completions.
The OpenAiCompatibleConfig allows users to use this type of model in their agents and flows.
Parameters
- model_id: str#
Name of the model to use.
- url: str#
Hostname and port of the server exposing the OpenAI-compatible endpoint.
- api_type: str#
The API type that should be used. Can be either chat_completions or responses.
- api_key: str, null#
An optional API key if the remote server requires it.
- key_file: str, null#
Path to an optional client private key file in PEM format.
- cert_file: str, null#
Path to an optional client certificate chain file in PEM format.
- ca_file: str, null#
Path to an optional trusted CA certificate file in PEM format, used to verify the server.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
The certificate fields are useful when the remote endpoint is exposed over HTTPS with a private CA
or when it requires mutual TLS (mTLS). Like api_key, these values are treated as sensitive
fields during serialization.
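To make the role of the three certificate fields concrete, the sketch below shows how such paths typically map onto a TLS context built with Python's standard ssl module. The function build_tls_context is a hypothetical helper; how a given Agent Spec runtime actually consumes key_file, cert_file, and ca_file is not described in this guide.

```python
import ssl


def build_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Create a client-side TLS context from optional PEM file paths."""
    # ca_file: trusted CA used to verify the server (system defaults when None).
    context = ssl.create_default_context(cafile=ca_file)
    if cert_file is not None:
        # cert_file and key_file: client identity presented for mutual TLS.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


context = build_tls_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

With a private CA only ca_file is needed; cert_file and key_file come into play when the server also asks the client to authenticate.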
Examples
from pyagentspec.llms import LlmGenerationConfig, OpenAiCompatibleConfig
from pyagentspec.llms.openaicompatibleconfig import OpenAIAPIType

generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)

llm = OpenAiCompatibleConfig(
    name="openai-compatible-llama-4-maverick",
    model_id="llama-4-maverick",
    url="https://url.to.my.openai.compatible.server/llama4mav",
    api_type=OpenAIAPIType.RESPONSES,
    api_key="optional_api_key",
    key_file="/path/to/client.key",
    cert_file="/path/to/client.pem",
    ca_file="/path/to/ca.pem",
    default_generation_parameters=generation_config,
)
VllmConfig#
vLLM Models are models hosted with a vLLM server.
The VllmConfig allows users to use this type of models in their agents and flows.
Parameters
- model_id: str#
Name of the model to use.
- url: str#
Hostname and port of the vLLM server where the model is hosted.
- api_type: str#
The API type that should be used. Can be either chat_completions or responses.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
- api_key: str, null#
An optional API key if the remote vLLM server requires it.
The VllmConfig inherits from OpenAiCompatibleConfig, so it also supports the optional
key_file, cert_file, and ca_file parameters for HTTPS and mTLS connections.
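The two api_type values correspond to two different OpenAI-style routes. As a rough sketch, the code below resolves a request URL from a base URL and an API type, assuming the server exposes the conventional /v1/chat/completions and /v1/responses paths. The ApiType enum, the ROUTES table, and request_url are hypothetical stand-ins for illustration; a real runtime may build its URLs differently.

```python
from enum import Enum


class ApiType(str, Enum):
    """Hypothetical stand-in for the two api_type values named in this guide."""

    CHAT_COMPLETIONS = "chat_completions"
    RESPONSES = "responses"


# Assumed route layout of an OpenAI-compatible server.
ROUTES = {
    ApiType.CHAT_COMPLETIONS: "/v1/chat/completions",
    ApiType.RESPONSES: "/v1/responses",
}


def request_url(base_url, api_type):
    """Join a server base URL with the route for the given API type."""
    # ApiType(...) raises ValueError for anything other than the two known values.
    return base_url.rstrip("/") + ROUTES[ApiType(api_type)]


print(request_url("http://localhost:8000/", "chat_completions"))
print(request_url("http://localhost:8000", ApiType.RESPONSES))
```

Validating the value through an enum means a typo such as "chat_completion" fails immediately rather than producing a request to an unknown route.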
Examples
from pyagentspec.llms import LlmGenerationConfig, VllmConfig

generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)

llm = VllmConfig(
    name="vllm-llama-4-maverick",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=generation_config,
    api_key="optional_api_key",
)
OllamaConfig#
Ollama Models are powered by a locally hosted Ollama server.
The OllamaConfig allows users to use this type of models in their agents and flows.
Parameters
- model_id: str#
Name of the model to use.
- url: str#
Hostname and port of the Ollama server where the model is hosted.
- api_type: str#
The API type that should be used. Can be either chat_completions or responses.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
- api_key: str, null#
An optional API key if the Ollama server requires it.
Examples
from pyagentspec.llms import LlmGenerationConfig, OllamaConfig

generation_config = LlmGenerationConfig(max_tokens=512, temperature=0.9, top_p=0.9)

llm = OllamaConfig(
    name="ollama-llama-4",
    model_id="llama-4-maverick",
    url="http://url.to.my.ollama.server/llama4mav",
    default_generation_parameters=generation_config,
    api_key="optional_api_key",
)
Recap#
This guide provides detailed descriptions of each model type supported by Agent Spec, demonstrating how to declare them using PyAgentSpec syntax.
Below is the complete code from this guide.
from pyagentspec.llms import LlmConfig
from pyagentspec.llms import LlmGenerationConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7)

llm = LlmConfig(
    name="openai-gpt4o",
    model_id="gpt-4o",
    provider="openai",
    api_provider="openai",
    api_type="chat_completions",
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import OciGenAiConfig
from pyagentspec.llms import LlmGenerationConfig
from pyagentspec.llms.ociclientconfig import OciClientConfigWithApiKey

# Get the list of available models from:
# https://docs.oracle.com/en-us/iaas/Content/generative-ai/deprecating.htm#
# under the "Model Retirement Dates (On-Demand Mode)" section.
OCIGENAI_MODEL_ID = "xai.grok-3"
# Typical service endpoint for OCI GenAI service inference
# <oci region> can be "us-chicago-1" and can also be found in your ~/.oci/config file
OCIGENAI_ENDPOINT = "https://inference.generativeai.<oci region>.oci.oraclecloud.com"
# <compartment_id> can be obtained from your personal OCI account (not the key config file).
# Please find it under "Identity > Compartments" on the OCI console website after logging in to your user account.
COMPARTMENT_ID = "ocid1.compartment.oc1..<compartment_id>"

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.8)

llm = OciGenAiConfig(
    name="oci-genai-grok3",
    model_id=OCIGENAI_MODEL_ID,
    compartment_id=COMPARTMENT_ID,
    client_config=OciClientConfigWithApiKey(
        name="client_config",
        service_endpoint=OCIGENAI_ENDPOINT,
        auth_file_location="~/.oci/config",
        auth_profile="DEFAULT",
    ),
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import OpenAiCompatibleConfig
from pyagentspec.llms.openaicompatibleconfig import OpenAIAPIType

generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)

llm = OpenAiCompatibleConfig(
    name="openai-compatible-llama-4-maverick",
    model_id="llama-4-maverick",
    url="https://url.to.my.openai.compatible.server/llama4mav",
    api_type=OpenAIAPIType.RESPONSES,
    api_key="optional_api_key",
    key_file="/path/to/client.key",
    cert_file="/path/to/client.pem",
    ca_file="/path/to/ca.pem",
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import VllmConfig

generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)

llm = VllmConfig(
    name="vllm-llama-4-maverick",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import OpenAiConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7, top_p=0.9)

llm = OpenAiConfig(
    name="openai-gpt-5",
    model_id="gpt-5",
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import GeminiConfig
from pyagentspec.llms.geminiauthconfig import GeminiAIStudioAuthConfig, GeminiVertexAIAuthConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7, top_p=0.9)

llm = GeminiConfig(
    name="gemini-aistudio-flash",
    model_id="gemini-2.5-flash",
    auth=GeminiAIStudioAuthConfig(
        name="gemini-aistudio-auth"
        # Optional: if api_key is omitted, runtimes may load GEMINI_API_KEY from the environment.
    ),
    default_generation_parameters=generation_config,
)

llm = GeminiConfig(
    name="gemini-vertex-flash",
    model_id="gemini-2.0-flash-lite",
    auth=GeminiVertexAIAuthConfig(
        name="gemini-vertex-auth",
        # Often still required even when ADC supplies the credentials.
        project_id="my-gcp-project",
        location="global",
        # Optional: explicit credentials can be provided when ADC is not available.
    ),
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import OllamaConfig

generation_config = LlmGenerationConfig(max_tokens=512, temperature=0.9, top_p=0.9)

llm = OllamaConfig(
    name="ollama-llama-4",
    model_id="llama-4-maverick",
    url="http://url.to.my.ollama.server/llama4mav",
    default_generation_parameters=generation_config
)
Next steps#
Having learned how to configure LLMs from different providers, you may now proceed to: