Generators are the core of generating messages and text in Rigging, and you typically create them with the `get_generator` function. The base interface is flexible, and designed to support optimizations should the underlying mechanisms support it (async batching, K/V caching, etc.).
Generators are commonly built from string identifiers of the form `<provider>!<model>,<**kwargs>`:

- `provider` maps to a particular subclass of `Generator` (optional) and defaults to `litellm`/`LiteLLMGenerator`.
- `model` is any `str` value, typically used by the provider to indicate a specific LLM to target.
- `kwargs` are used to carry:
  - API keys (`,api_key=...`) or the base URL (`,api_base=...`) for the model provider.
  - `GenerateParams` fields like temperature, stop tokens, etc.
  - the `LiteLLMGenerator.max_connections` property, by passing `,max_connections=` in the identifier string.
You can view the LiteLLM docs for more information about supported model providers and parameters.
Building generators from string identifiers is optional, but a convenient way to represent complex LLM configurations. You can convert any existing generator back into its identifier string with `to_identifier` or `get_identifier`:
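For example, a quick round trip (the model name and values here are illustrative):

```python
import rigging as rg

# Build a generator from an identifier: <provider>!<model>,<**kwargs>
generator = rg.get_generator("litellm!gpt-4o-mini,temperature=0.5,max_connections=5")

# Convert it back at any time
print(generator.to_identifier())
```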
Generators carry an `.api_key` attribute which can be set directly, or by passing `,api_key=` as part of an identifier string. Not all generators will require one, but they are common enough that we include the attribute as part of the base class.
Typically you will be using a library like LiteLLM underneath, and can simply use environment variables. A minimal sketch using the standard OpenAI variable:
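```python
import os

import rigging as rg

# LiteLLM reads provider credentials from standard environment variables,
# so no api_key is needed in the identifier itself.
os.environ["OPENAI_API_KEY"] = "sk-example"  # normally set in your shell, not in code

generator = rg.get_generator("gpt-4o-mini")  # model name is illustrative
```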
To manage connection limits and request pacing, you can tune the `LiteLLMGenerator.max_connections` and `LiteLLMGenerator.min_delay_between_requests` properties. You can also use `ChatPipeline.wrap()` with a library like backoff to catch many, or specific, errors like rate limits or general connection issues:
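For example, a sketch using backoff to retry on connection errors (the retry policy here is arbitrary):

```python
import backoff

import rigging as rg

pipeline = rg.get_generator("litellm!gpt-4o-mini,max_connections=2").chat("Hello!")

# Wrap the underlying generation calls with exponential backoff
chat = await pipeline.wrap(
    backoff.on_exception(backoff.expo, ConnectionError, max_time=60)
).run()  # inside an async context
```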
Rigging includes `vLLM` and `transformers` generators for loading and running models directly in the same Python process. In general, vLLM is more consistent with Rigging’s preferred API, but its dependency requirements are heavier.
Where needed, you can wrap an existing model into a rigging generator by using the `VLLMGenerator.from_obj()` or `TransformersGenerator.from_obj()` methods. These are helpful for any picky model construction that might not play well with our rigging constructors.
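For example, a sketch with transformers (assuming `from_obj` accepts the loaded model and tokenizer; the import path and model name are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from rigging.generator import TransformersGenerator  # import path assumed

model_id = "microsoft/Phi-3-mini-4k-instruct"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

generator = TransformersGenerator.from_obj(model, tokenizer)
```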
These generators require the `vllm` and `transformers` packages to be installed. You can use `rigging[all]` to install them all at once, or pick your preferred package individually. They also expose `Generator.load` and `Generator.unload` methods to better control memory usage; local providers typically are lazy and load the model into memory only when first needed.

Many self-hosted inference servers expose OpenAI-compatible APIs, which you can target with the `openai/` LiteLLM prefix (usually just `openai/<model>,api_base=http://...,api_key=...`).
For example, you can pull the `qwen3:0.6b` model with Ollama, and the ollama server will host the model on `http://localhost:11434` by default. Connect to it using the `ollama/` or `ollama_chat/` prefixes:
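A sketch assuming a default local Ollama install (the remote address is illustrative):

```python
import rigging as rg

# LiteLLM assumes the default endpoint (http://localhost:11434)
generator = rg.get_generator("ollama_chat/qwen3:0.6b")

# Or point at another host by passing api_base
remote = rg.get_generator("ollama_chat/qwen3:0.6b,api_base=http://192.168.1.50:11434")

chat = await generator.chat("Say hello!").run()  # inside an async context
```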
If your Ollama server is hosted elsewhere, pass `api_base` to the generator as shown above. You can also serve models with the `vllm serve` command. LiteLLM uses the `hosted_vllm/` prefix to connect there (otherwise you can use the `openai/` prefix noted below):
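A sketch connecting to a `vllm serve` instance with the `hosted_vllm/` prefix (host, port, and model name are illustrative):

```python
import rigging as rg

# After: vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --port 8000
generator = rg.get_generator(
    "hosted_vllm/meta-llama/Meta-Llama-3.1-8B-Instruct,api_base=http://localhost:8000/v1"
)
```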
Other OpenAI-compatible servers, like llamafile, can use the generic `openai/` prefix for LiteLLM as noted in their docs, or a dedicated provider prefix like `hosted_vllm` or `llamafile`. For example, use the `openai/` prefix along with `,api_key=` and `,api_base=` for vLLM:
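For example (the key is a placeholder; vLLM’s OpenAI-compatible server listens on port 8000 by default):

```python
import rigging as rg

generator = rg.get_generator(
    "openai/meta-llama/Meta-Llama-3.1-8B-Instruct,"
    "api_base=http://localhost:8000/v1,api_key=sk-empty"
)
```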
On both `CompletionPipeline` and `ChatPipeline`, you can overload and update any generation params by using the associated `.with_()` function:
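For example (the parameter values are arbitrary):

```python
import rigging as rg

pipeline = rg.get_generator("gpt-4o-mini").chat("Write a haiku about the sea.")

# Override generation params for just this pipeline
chat = await pipeline.with_(temperature=0.9, max_tokens=64).run()  # inside an async context
```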
The `HTTPGenerator` allows you to wrap any HTTP endpoint as a generator, making it easy to integrate external LLMs or AI services into your Rigging pipelines. It works by defining a specification that maps message content into HTTP requests and parses responses back into messages.

The specification is assigned to the `.spec` field on the generator, and can be applied as a Python dictionary, JSON string, YAML string, or base64-encoded JSON/YAML string. This flexibility allows you to easily share and reuse specifications across different parts of your application.
You can use the `HTTPGenerator.for_json_endpoint()` or `HTTPGenerator.for_text_endpoint()` constructors to build an `HTTPSpec` for common cases, or construct the `HTTPSpec` object yourself. This is useful for complex scenarios involving multi-step transformations or other non-standard requirements.
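As a rough sketch, a specification applied as a Python dictionary might look like the following. The schema shown here (`request`/`response` sections containing `transforms` entries with `type` and `pattern` keys) is an assumption for illustration; consult the `HTTPSpec` model for the real field names:

```python
import rigging as rg

spec = {
    "request": {
        "url": "https://{{ model }}.crucible.dreadnode.io/score",  # endpoint is illustrative
        "headers": {"X-Api-Key": "{{ api_key }}"},
        "transforms": [{"type": "json", "pattern": {"data": "$content"}}],
    },
    "response": {
        "transforms": [{"type": "jsonpath", "pattern": "$.output"}],
    },
}

generator = rg.get_generator("http!pieceofcake")  # "http" provider id assumed
generator.api_key = "sk-example"
generator.spec = spec
```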
We use the `.model` field on the generator to carry our crucible challenge name, and the specification travels with the generator when serialized with `to_identifier`. This also means that when we save our chats to storage, they maintain their HTTP specification.

You can make the `HTTPGenerator` stateful by using the `.state` dictionary. This is a mutable dictionary that you can use to store any dynamic information, like session IDs or temporary credentials, that needs to be accessed by your templates.
You can pair the `state` dictionary with an async `hook` function. The hook is called after every HTTP request, allowing it to inspect the response and dynamically update the state before automatically retrying:
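A sketch of the idea (the hook name, signature, and attachment point below are illustrative assumptions, not the exact rigging API):

```python
# Hypothetical hook: refresh a session token when the server rejects a request
async def refresh_session(generator, response) -> bool:
    if response.status_code == 401:
        generator.state["session_id"] = "renewed"  # e.g. re-authenticate here
        return True  # signal that the request should be retried
    return False

generator.hook = refresh_session  # attribute name assumed
```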
Inside your `spec`, the following `RequestTransformContext` variables are available in your Jinja templates (e.g., `{{ variable }}`) and JSON value substitutions (e.g., `"$variable"`):

- `content`: Content of the last message.
- `messages`: List of all message objects in the history.
- `api_key`: The API key from the generator’s configuration.
- `model`: The model identifier from the generator’s configuration.
- `state`: The generator’s mutable `state` dictionary (see Stateful Generators).
- `params`: Generation parameters from the current call (e.g., `temperature`).
- `all_content`: Concatenated content of all messages.
- `role`: Role of the last message (user/assistant/system).

In `spec` files with multiple transformation steps, the output of the previous step is also available to the next as `result`, `data`, or `output`.
The `jinja` transform type provides full Jinja2 template syntax. Access context variables directly and use Jinja2 filters and control structures:
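For instance (transform entries here follow the `type`/`pattern` shape assumed in the sketch above):

```python
jinja_transform = {
    "type": "jinja",
    # Build a request body with direct variable access and filters
    "pattern": '{"model": "{{ model }}", "prompt": {{ content | tojson }}}',
}
```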
The `json` transform type lets you build JSON request bodies using a template object. Use the `$` prefix to reference context variables, with dot notation for nested access:
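For instance (same assumed `type`/`pattern` shape):

```python
json_transform = {
    "type": "json",
    "pattern": {
        "model": "$model",                     # context variable
        "prompt": "$content",
        "temperature": "$params.temperature",  # dot notation for nested access
    },
}
```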
The `jsonpath` transform type uses JSONPath expressions to extract data from JSON responses:
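For instance, pulling the completion text out of an OpenAI-style response body (same assumed shape):

```python
jsonpath_transform = {
    "type": "jsonpath",
    "pattern": "$.choices[0].message.content",
}
```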
The `regex` transform type uses regular expressions to extract content from text responses:
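For instance, capturing everything between a pair of tags (same assumed shape):

```python
regex_transform = {
    "type": "regex",
    "pattern": r"<response>(.*?)</response>",
}
```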
To write your own generator, subclass the `Generator` base class and elect to implement handlers for messages and/or texts:

- `async def generate_messages(...)` - used for `ChatPipeline.run` variants.
- `async def generate_texts(...)` - used for `CompletionPipeline.run` variants.

If you don’t implement one of these handlers, the base class will raise `NotImplementedError` for you. It’s currently undecided whether generators should prefer to provide weak overloads for compatibility, or whether they should ignore methods which can’t be used optimally to help provide clarity to the user about capability. You’ll find we’ve opted for the former strategy in our generators.

Properties like `api_key`, `model`, and generation params are common enough that they are included in the base class. Once defined, use the `register_generator` method to add your generator class under a custom provider id so it can be used with `get_generator`:
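As a sketch, a trivial echo generator might look like the following (the batched signature and `GeneratedMessage` construction are assumptions from memory; check the `Generator` base class for the exact types):

```python
import typing as t

import rigging as rg

class EchoGenerator(rg.Generator):
    async def generate_messages(
        self,
        messages: t.Sequence[t.Sequence[rg.Message]],
        params: t.Sequence[rg.GenerateParams],
    ) -> t.Sequence[rg.GeneratedMessage]:
        # Echo the last message of each conversation back as the assistant
        return [
            rg.GeneratedMessage(
                message=rg.Message(role="assistant", content=conversation[-1].content)
            )
            for conversation in messages
        ]

rg.register_generator("echo", EchoGenerator)
generator = rg.get_generator("echo!any-model")
```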