How to Use LLMs from Different LLM Providers#
Agent Spec supports several LLM providers, each with its own LlmConfig component. The available LLM configs are:
- OciGenAiConfig
- OpenAiConfig
- GeminiConfig
- OpenAiCompatibleConfig
- VllmConfig
- OllamaConfig
Their configuration is specified directly in their respective class constructors. This guide shows you how to configure LLMs from different LLM providers, with examples and notes on usage.
Configure retry behavior for remote LLM calls#
All LlmConfig subclasses accept an optional retry_policy parameter.
Use it to configure retry attempts, per-request timeouts, and backoff behavior
for transient failures when calling remote LLM endpoints.
For example, you can attach a retry policy directly to a VllmConfig:
from pyagentspec import RetryPolicy
from pyagentspec.llms import LlmGenerationConfig, VllmConfig
retry_policy = RetryPolicy(
max_attempts=4,
request_timeout=30.0,
initial_retry_delay=0.5,
max_retry_delay=8.0,
)
llm_config_with_retry_policy = VllmConfig(
name="vllm-llama-4-maverick-with-retries",
model_id="llama-4-maverick",
url="http://url.to.my.vllm.server/llama4mav",
default_generation_parameters=LlmGenerationConfig(
max_tokens=512, temperature=1.0, top_p=1.0
),
retry_policy=retry_policy,
)
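With the settings above, a request is attempted at most 4 times and each attempt is given 30 seconds to complete. Assuming the common exponential-backoff scheme in which the delay doubles after each failed attempt, the waits between attempts would be roughly 0.5, 1, and 2 seconds, never exceeding the 8-second cap.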
API Reference: LlmConfig, RetryPolicy
OciGenAiConfig#
OciGenAiConfig refers to models served by OCI Generative AI.
Parameters
- model_id: str#
Name of the model to use. A list of the available models is given in the Oracle OCI documentation, under the "Model Retirement Dates (On-Demand Mode)" section.
- compartment_id: str#
The OCID (Oracle Cloud Identifier) of a compartment within your tenancy.
- serving_mode: str#
The mode in which the specified model is served:
ON_DEMAND: the model is hosted in a shared environment;
DEDICATED: the model is deployed in a customer-dedicated environment.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
Example:
default_generation_parameters = LlmGenerationConfig(max_tokens=256, temperature=0.8)
- client_config: OciClientConfig, null#
OCI client config used to authenticate with the OCI service. See the examples below for usage and more information.
OCI Client Configuration#
OCI GenAI models require a client configuration that contains all the settings needed to authenticate with OCI services. The OciClientConfig component holds these settings.
Parameters
- service_endpoint: str#
The endpoint URL for the OCI GenAI service. Make sure the region is set correctly: the region in which your private key was created must match the region mentioned in the
service_endpoint.
- auth_type: str#
The authentication type to use, e.g.,
API_KEY, SECURITY_TOKEN, INSTANCE_PRINCIPAL (this requires executing the code from a compartment enabled for OCI GenAI), or RESOURCE_PRINCIPAL.
Depending on the type of authentication the user wants to adopt, different specializations of OciClientConfig are defined. The OciClientConfig component itself is abstract and should not be used directly.
The following sections show which client extensions are available and their specific parameters.
Examples
from pyagentspec.llms import OciGenAiConfig
from pyagentspec.llms import LlmGenerationConfig
from pyagentspec.llms.ociclientconfig import OciClientConfigWithApiKey
# Get the list of available models from:
# https://docs.oracle.com/en-us/iaas/Content/generative-ai/deprecating.htm#
# under the "Model Retirement Dates (On-Demand Mode)" section.
OCIGENAI_MODEL_ID = "xai.grok-3"
# Typical service endpoint for OCI GenAI service inference
# <oci region> can be "us-chicago-1" and can also be found in your ~/.oci/config file
OCIGENAI_ENDPOINT = "https://inference.generativeai.<oci region>.oci.oraclecloud.com"
# <compartment_id> can be obtained from your personal OCI account (not the key config file).
# Please find it under "Identity > Compartments" on the OCI console website after logging in to your user account.
COMPARTMENT_ID = "ocid1.compartment.oc1..<compartment_id>"
generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.8)
llm = OciGenAiConfig(
name="oci-genai-grok3",
model_id=OCIGENAI_MODEL_ID,
compartment_id=COMPARTMENT_ID,
client_config=OciClientConfigWithApiKey(
name="client_config",
service_endpoint=OCIGENAI_ENDPOINT,
auth_file_location="~/.oci/config",
auth_profile="DEFAULT",
),
default_generation_parameters=generation_config,
)
OciClientConfigWithSecurityToken#
Client configuration to use for authentication with a security token.
Parameters
- auth_file_location: str#
The location of the authentication file from which the authentication information should be retrieved. The default location is
~/.oci/config.
- auth_profile: str#
The name of the profile to use, among the ones defined in the authentication file. The default profile name is
DEFAULT.
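For example, assuming the class lives in the same ociclientconfig module as the API-key variant shown earlier, a security-token client configuration can be built as follows (the service endpoint is a placeholder):
from pyagentspec.llms.ociclientconfig import OciClientConfigWithSecurityToken

# Authenticate through a security token described in the OCI config file.
client_config = OciClientConfigWithSecurityToken(
    name="client_config",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    auth_file_location="~/.oci/config",  # default location
    auth_profile="DEFAULT",  # default profile name
)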
OciClientConfigWithApiKey#
Client configuration to use for authentication with an API key.
The required parameters are the same as those defined for OciClientConfigWithSecurityToken.
OciClientConfigWithInstancePrincipal#
Client configuration to use for instance principal authentication. No additional parameters are required.
OciClientConfigWithResourcePrincipal#
Client configuration to use for resource principal authentication. No additional parameters are required.
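As a minimal sketch, again assuming both classes live in the same ociclientconfig module as the other variants, constructing them only requires the base parameters:
from pyagentspec.llms.ociclientconfig import (
    OciClientConfigWithInstancePrincipal,
    OciClientConfigWithResourcePrincipal,
)

# Instance principal: the code must run from an instance in a compartment
# enabled for OCI GenAI.
instance_client_config = OciClientConfigWithInstancePrincipal(
    name="client_config",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
)

# Resource principal: the calling resource authenticates with its own identity.
resource_client_config = OciClientConfigWithResourcePrincipal(
    name="client_config",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
)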
OpenAiConfig#
OpenAI models are hosted by OpenAI.
You can refer to one of these models by using the OpenAiConfig component.
Parameters
- model_id: str#
Name of the model to use.
- api_type: str#
The API type that should be used. Can be either
chat_completions or responses.
- api_key: str, null#
An optional API key for authentication with the OpenAI endpoint.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
Important
Ensure that the OPENAI_API_KEY environment variable is set beforehand
to access this model. A list of available OpenAI models can be found at
the following link: OpenAI Models.
Examples
from pyagentspec.llms import LlmGenerationConfig, OpenAiConfig
generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7, top_p=0.9)
llm = OpenAiConfig(
name="openai-gpt-5",
model_id="gpt-5",
default_generation_parameters=generation_config,
api_key="optional_api_key",
)
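If you want to pin the API type explicitly, you can also pass api_type. A minimal sketch, assuming the string values listed above (chat_completions, responses) are accepted directly:
from pyagentspec.llms import LlmGenerationConfig, OpenAiConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7, top_p=0.9)

llm = OpenAiConfig(
    name="openai-gpt-5-responses",
    model_id="gpt-5",
    api_type="responses",  # assumed string form; see the api_type parameter above
    default_generation_parameters=generation_config,
)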
GeminiConfig#
Gemini models can be configured through GeminiConfig.
Agent Spec supports both Google AI Studio and Google Vertex AI authentication modes.
Gemini authentication is modeled as a nested auth component, similar to OCI client_config.
The auth component itself remains inline during serialization. When api_key or
credentials is provided explicitly, only that sensitive field is externalized and must
be supplied through components_registry when loading the configuration back.
Parameters
- model_id: str#
Name of the model to use, for example
gemini-2.5-flash or gemini-2.0-flash-lite.
- auth: GeminiAuthConfig#
Required authentication component for Gemini. As with other Agent Spec components, auth configs need a
name. Use GeminiAIStudioAuthConfig(name="gemini-aistudio-auth") if you want runtimes to load GEMINI_API_KEY from the environment, or GeminiVertexAIAuthConfig(name="gemini-vertex-auth", ...) for Vertex AI. The auth component remains inline when serialized. If api_key or credentials is set explicitly, only that sensitive field is serialized as a reference.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
Google AI Studio authentication#
Use GeminiAIStudioAuthConfig when connecting through Google AI Studio.
Parameters
- api_key: str, null#
Optional Gemini API key. If omitted, runtimes may load it from
GEMINI_API_KEY. If provided explicitly, only the api_key field is externalized during serialization and must be supplied separately when deserializing.
Example
from pyagentspec.llms import GeminiConfig, LlmGenerationConfig
from pyagentspec.llms.geminiauthconfig import GeminiAIStudioAuthConfig
generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7, top_p=0.9)
llm = GeminiConfig(
name="gemini-aistudio-flash",
model_id="gemini-2.5-flash",
auth=GeminiAIStudioAuthConfig(
name="gemini-aistudio-auth"
# Optional: if api_key is omitted, runtimes may load GEMINI_API_KEY from the environment.
),
default_generation_parameters=generation_config,
)
Vertex AI authentication#
Use GeminiVertexAIAuthConfig when connecting through Google Vertex AI.
Parameters
- project_id: str, null#
Optional Google Cloud project identifier. In practice, you may still need to set this explicitly when ADC provides credentials but does not expose a default project.
- location: str#
Vertex AI location or region. Defaults to
global.
- credentials: str | dict, null#
Optional local file path (
str) to a Google Cloud JSON credential file, such as a service-account key file, or an inline dict containing the parsed JSON contents of that file. When omitted, runtimes may rely on Google Application Default Credentials (ADC), such as GOOGLE_APPLICATION_CREDENTIALS, credentials made available through the local Google Cloud environment, or an attached service account. See the Google Cloud authentication docs for details. This does not guarantee that project_id can also be inferred automatically. If provided explicitly, only the credentials field is externalized during serialization. Non-secret auth settings such as project_id and location remain inline in the main config.
Example
from pyagentspec.llms import GeminiConfig, LlmGenerationConfig
from pyagentspec.llms.geminiauthconfig import GeminiVertexAIAuthConfig
generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.4, top_p=0.95)
llm = GeminiConfig(
name="gemini-vertex-flash",
model_id="gemini-2.0-flash-lite",
auth=GeminiVertexAIAuthConfig(
name="gemini-vertex-auth",
# Often still required even when ADC supplies the credentials.
project_id="my-gcp-project",
location="global",
# Optional: explicit credentials can be provided when ADC is not available.
),
default_generation_parameters=generation_config,
)
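If ADC is not available, credentials can be passed explicitly, either as a path to a Google Cloud JSON credential file or as a dict with its parsed contents. A sketch with a placeholder key-file path:
from pyagentspec.llms import GeminiConfig, LlmGenerationConfig
from pyagentspec.llms.geminiauthconfig import GeminiVertexAIAuthConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.4, top_p=0.95)

llm = GeminiConfig(
    name="gemini-vertex-flash-explicit-creds",
    model_id="gemini-2.0-flash-lite",
    auth=GeminiVertexAIAuthConfig(
        name="gemini-vertex-auth",
        project_id="my-gcp-project",
        location="global",
        # Path to a service-account key file; an inline dict with the parsed
        # JSON contents also works. Only this field is externalized on serialization.
        credentials="/path/to/service-account.json",
    ),
    default_generation_parameters=generation_config,
)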
OpenAiCompatibleConfig#
OpenAI-compatible LLMs are models served through OpenAI-style APIs, either Chat Completions or Responses.
The OpenAiCompatibleConfig allows users to use these models in their agents and flows.
Parameters
- model_id: str#
Name of the model to use.
- url: str#
Hostname and port of the server exposing the OpenAI-compatible endpoint.
- api_type: str#
The API type that should be used. Can be either
chat_completions or responses.
- api_key: str, null#
An optional API key if the remote server requires it.
- key_file: str, null#
Path to an optional client private key file in PEM format.
- cert_file: str, null#
Path to an optional client certificate chain file in PEM format.
- ca_file: str, null#
Path to an optional trusted CA certificate file in PEM format, used to verify the server.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
The certificate fields are useful when the remote endpoint is exposed over HTTPS with a private CA
or when it requires mutual TLS (mTLS). Like api_key, these values are treated as sensitive
fields during serialization.
Examples
from pyagentspec.llms import LlmGenerationConfig, OpenAiCompatibleConfig
from pyagentspec.llms.openaicompatibleconfig import OpenAIAPIType
generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)
llm = OpenAiCompatibleConfig(
name="openai-compatible-llama-4-maverick",
model_id="llama-4-maverick",
url="https://url.to.my.openai.compatible.server/llama4mav",
api_type=OpenAIAPIType.RESPONSES,
api_key="optional_api_key",
key_file="/path/to/client.key",
cert_file="/path/to/client.pem",
ca_file="/path/to/ca.pem",
default_generation_parameters=generation_config,
)
VllmConfig#
vLLM models are models hosted on a vLLM server.
The VllmConfig allows users to use these models in their agents and flows.
Parameters
- model_id: str#
Name of the model to use.
- url: str#
Hostname and port of the vLLM server where the model is hosted.
- api_type: str#
The API type that should be used. Can be either
chat_completions or responses.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
- api_key: str, null#
An optional API key if the remote vLLM server requires it.
The VllmConfig inherits from OpenAiCompatibleConfig, so it also supports the optional
key_file, cert_file, and ca_file parameters for HTTPS and mTLS connections.
Examples
from pyagentspec.llms import LlmGenerationConfig, VllmConfig
generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)
llm = VllmConfig(
name="vllm-llama-4-maverick",
model_id="llama-4-maverick",
url="http://url.to.my.vllm.server/llama4mav",
default_generation_parameters=generation_config,
api_key="optional_api_key",
)
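Because VllmConfig inherits from OpenAiCompatibleConfig, HTTPS and mTLS connections can be configured the same way; the certificate paths below are placeholders:
from pyagentspec.llms import LlmGenerationConfig, VllmConfig

generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)

llm = VllmConfig(
    name="vllm-llama-4-maverick-mtls",
    model_id="llama-4-maverick",
    url="https://url.to.my.vllm.server/llama4mav",
    key_file="/path/to/client.key",  # client private key (PEM)
    cert_file="/path/to/client.pem",  # client certificate chain (PEM)
    ca_file="/path/to/ca.pem",  # trusted CA used to verify the server
    default_generation_parameters=generation_config,
)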
OllamaConfig#
Ollama models are served by a locally hosted Ollama server.
The OllamaConfig allows users to use these models in their agents and flows.
Parameters
- model_id: str#
Name of the model to use.
- url: str#
Hostname and port of the Ollama server where the model is hosted.
- api_type: str#
The API type that should be used. Can be either
chat_completions or responses.
- default_generation_parameters: dict, null#
Default parameters for text generation with this model.
- api_key: str, null#
An optional API key if the Ollama server requires it.
Examples
from pyagentspec.llms import LlmGenerationConfig, OllamaConfig
generation_config = LlmGenerationConfig(max_tokens=512, temperature=0.9, top_p=0.9)
llm = OllamaConfig(
name="ollama-llama-4",
model_id="llama-4-maverick",
url="http://url.to.my.ollama.server/llama4mav",
default_generation_parameters=generation_config,
api_key="optional_api_key",
)
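Note that a locally hosted Ollama server listens on http://localhost:11434 by default, so for a local setup the url would typically be "http://localhost:11434".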
Recap#
This guide provides detailed descriptions of each model type supported by Agent Spec, demonstrating how to declare them using PyAgentSpec syntax.
Below is the complete code from this guide.
from pyagentspec.llms import OciGenAiConfig
from pyagentspec.llms import LlmGenerationConfig
from pyagentspec.llms.ociclientconfig import OciClientConfigWithApiKey

# Get the list of available models from:
# https://docs.oracle.com/en-us/iaas/Content/generative-ai/deprecating.htm#
# under the "Model Retirement Dates (On-Demand Mode)" section.
OCIGENAI_MODEL_ID = "xai.grok-3"
# Typical service endpoint for OCI GenAI service inference
# <oci region> can be "us-chicago-1" and can also be found in your ~/.oci/config file
OCIGENAI_ENDPOINT = "https://inference.generativeai.<oci region>.oci.oraclecloud.com"
# <compartment_id> can be obtained from your personal OCI account (not the key config file).
# Please find it under "Identity > Compartments" on the OCI console website after logging in to your user account.
COMPARTMENT_ID = "ocid1.compartment.oc1..<compartment_id>"

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.8)

llm = OciGenAiConfig(
    name="oci-genai-grok3",
    model_id=OCIGENAI_MODEL_ID,
    compartment_id=COMPARTMENT_ID,
    client_config=OciClientConfigWithApiKey(
        name="client_config",
        service_endpoint=OCIGENAI_ENDPOINT,
        auth_file_location="~/.oci/config",
        auth_profile="DEFAULT",
    ),
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import OpenAiCompatibleConfig
from pyagentspec.llms.openaicompatibleconfig import OpenAIAPIType

generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)

llm = OpenAiCompatibleConfig(
    name="openai-compatible-llama-4-maverick",
    model_id="llama-4-maverick",
    url="https://url.to.my.openai.compatible.server/llama4mav",
    api_type=OpenAIAPIType.RESPONSES,
    api_key="optional_api_key",
    key_file="/path/to/client.key",
    cert_file="/path/to/client.pem",
    ca_file="/path/to/ca.pem",
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import VllmConfig

generation_config = LlmGenerationConfig(max_tokens=512, temperature=1.0, top_p=1.0)

llm = VllmConfig(
    name="vllm-llama-4-maverick",
    model_id="llama-4-maverick",
    url="http://url.to.my.vllm.server/llama4mav",
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import OpenAiConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7, top_p=0.9)

llm = OpenAiConfig(
    name="openai-gpt-5",
    model_id="gpt-5",
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import GeminiConfig
from pyagentspec.llms.geminiauthconfig import GeminiAIStudioAuthConfig, GeminiVertexAIAuthConfig

generation_config = LlmGenerationConfig(max_tokens=256, temperature=0.7, top_p=0.9)

llm = GeminiConfig(
    name="gemini-aistudio-flash",
    model_id="gemini-2.5-flash",
    auth=GeminiAIStudioAuthConfig(
        name="gemini-aistudio-auth"
        # Optional: if api_key is omitted, runtimes may load GEMINI_API_KEY from the environment.
    ),
    default_generation_parameters=generation_config,
)

llm = GeminiConfig(
    name="gemini-vertex-flash",
    model_id="gemini-2.0-flash-lite",
    auth=GeminiVertexAIAuthConfig(
        name="gemini-vertex-auth",
        # Often still required even when ADC supplies the credentials.
        project_id="my-gcp-project",
        location="global",
        # Optional: explicit credentials can be provided when ADC is not available.
    ),
    default_generation_parameters=generation_config,
)

from pyagentspec.llms import OllamaConfig

generation_config = LlmGenerationConfig(max_tokens=512, temperature=0.9, top_p=0.9)

llm = OllamaConfig(
    name="ollama-llama-4",
    model_id="llama-4-maverick",
    url="http://url.to.my.ollama.server/llama4mav",
    default_generation_parameters=generation_config,
)
Next steps#
Having learned how to configure LLMs from different providers, you may now proceed to: