Chats and Messages
Chats and Messages are the primary way to interact with LLMs in Rigging.
Chat objects hold a sequence of Message objects before and after generation. This is the most common way to interact with LLMs. Both Chat and ChatPipeline are flexible objects that let you tune the generation process, gather structured outputs, validate parsing, perform text replacements, serialize and deserialize, fork conversations, and more.
Basic Usage
Templating (apply)
You can use both ChatPipeline.apply() and ChatPipeline.apply_to_all() to swap values prefixed with $ characters inside message contents for fast templating support. This functionality uses string.Template.safe_substitute underneath.
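Because the substitution is delegated to string.Template.safe_substitute, its behavior can be previewed with the standard library alone. The sketch below does not call Rigging; the variable names and message strings are illustrative assumptions.

```python
from string import Template

# Message contents containing $-prefixed placeholders (illustrative strings).
contents = [
    "You are a helpful assistant for $company.",
    "Summarize this text in $count words: $text",
]

# Values to substitute; "text" is deliberately left out.
values = {"company": "Acme", "count": 20}

# safe_substitute fills in known placeholders and leaves unknown ones
# untouched, where substitute() would raise KeyError.
rendered = [Template(content).safe_substitute(values) for content in contents]

for line in rendered:
    print(line)
```

A placeholder with no matching value, such as $text here, stays in the content as-is, so a missing value does not raise an error during templating.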
Parsed Parts
Message objects hold all of their ParsedMessagePart objects in their .parts property. Each part keeps both the instance of the parsed Rigging model object and a .slice_ property that defines exactly where in the message content it is located.
Every time parsing occurs, these parts are re-synced: Rigging calls .to_pretty_xml() on the model, stitches the clean content back into the message, fixes any other slices that might have been affected by the operation, and orders the .parts property based on where the parts occur in the message content.
- Notice how our message content got updated to reflect fixing the extra whitespace in our start tag and our string stripping annotation.
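The re-sync step described above can be approximated in plain Python. This is a minimal sketch, not Rigging's implementation: the Part class and parse_and_sync function are hypothetical, and a simple f-string rendering stands in for .to_pretty_xml().

```python
import re
from dataclasses import dataclass


@dataclass
class Part:
    """Hypothetical stand-in for a parsed part: a tag, its text, and its location."""
    tag: str
    text: str
    slice_: slice


def parse_and_sync(content: str, tags: list[str]) -> tuple[str, list[Part]]:
    """Find each tag, rewrite it in a clean form, and keep all slices aligned."""
    parts: list[Part] = []
    for tag in tags:
        # Tolerate stray whitespace inside the start tag, e.g. "<city >".
        match = re.search(rf"<{tag}\s*>(.*?)</{tag}>", content, re.DOTALL)
        if match is None:
            continue
        text = match.group(1).strip()
        clean = f"<{tag}>{text}</{tag}>"
        # Stitch the clean rendering back into the content.
        content = content[: match.start()] + clean + content[match.end():]
        delta = len(clean) - (match.end() - match.start())
        # Shift any previously recorded slice that sits after the edit.
        for other in parts:
            if other.slice_.start >= match.end():
                other.slice_ = slice(other.slice_.start + delta, other.slice_.stop + delta)
        parts.append(Part(tag, text, slice(match.start(), match.start() + len(clean))))
    # Order parts by where they occur in the content.
    parts.sort(key=lambda part: part.slice_.start)
    return content, parts


content, parts = parse_and_sync(
    "Sure! <city >  Paris </city> is in <country>France</country>.",
    tags=["country", "city"],
)
print(content)
for part in parts:
    print(part.tag, part.slice_, repr(content[part.slice_]))
```

Even though the tags are parsed in the order country then city, the parts come back ordered by position, and the country slice is shifted left after the city tag is cleaned up.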
Stripping Parts
Because we track exactly where a parsed model is inside a message, we can cleanly remove just that portion from the content and re-sync the other parts to align with the new content. This is helpful for removing context from a conversation that you might not want there for future generations. It is a powerful primitive that lets you operate on messages more like a collection of structured models than raw text.
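To make the idea concrete, here is a small sketch of removing one tracked span and re-aligning the rest. It assumes nothing about Rigging's internals: Part, locate, and strip_part are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """Hypothetical stand-in for a parsed part and its location in the content."""
    name: str
    slice_: slice


def locate(content: str, fragment: str) -> slice:
    """Return the slice covering the first occurrence of fragment in content."""
    start = content.index(fragment)
    return slice(start, start + len(fragment))


def strip_part(content: str, parts: list[Part], name: str) -> tuple[str, list[Part]]:
    """Remove one part's text from the content and re-align the remaining slices."""
    target = next(part for part in parts if part.name == name)
    start, stop = target.slice_.start, target.slice_.stop
    removed = stop - start
    new_content = content[:start] + content[stop:]
    remaining: list[Part] = []
    for part in parts:
        if part is target:
            continue
        if part.slice_.start >= stop:
            # Parts after the removed span move left by its length.
            part = Part(part.name, slice(part.slice_.start - removed, part.slice_.stop - removed))
        remaining.append(part)
    return new_content, remaining


content = "<reasoning>2 + 2 is 4</reasoning><answer>4</answer>"
parts = [
    Part("reasoning", locate(content, "<reasoning>2 + 2 is 4</reasoning>")),
    Part("answer", locate(content, "<answer>4</answer>")),
]

stripped, remaining = strip_part(content, parts, "reasoning")
print(stripped)
print(remaining)
```

After the reasoning block is removed, the answer part still points at the right text because its slice was shifted by the length of the removed span.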
Metadata
Both Chats and ChatPipelines support the concept of arbitrary metadata that you can use to store things like tags, metrics, and supporting data for storage, sorting, and filtering.
- ChatPipeline.meta() adds to ChatPipeline.metadata
- Chat.meta() adds to Chat.metadata
Metadata will carry forward from a ChatPipeline to a Chat object when generation completes. This metadata is also maintained in the serialization process.
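The carry-forward behavior can be modelled with two small stand-in classes. This sketch does not use Rigging: SketchPipeline and SketchChat are hypothetical, generation is skipped entirely, and the JSON round trip only illustrates that plain metadata survives serialization, not the format Rigging uses.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchChat:
    """Hypothetical stand-in for a finished chat."""
    metadata: dict[str, Any] = field(default_factory=dict)

    def meta(self, **kwargs: Any) -> "SketchChat":
        # Merge new keys into the existing metadata and allow chaining.
        self.metadata.update(kwargs)
        return self


@dataclass
class SketchPipeline:
    """Hypothetical stand-in for a pipeline that has not run yet."""
    metadata: dict[str, Any] = field(default_factory=dict)

    def meta(self, **kwargs: Any) -> "SketchPipeline":
        self.metadata.update(kwargs)
        return self

    def run(self) -> SketchChat:
        # Generation is skipped here; only the metadata hand-off is modelled.
        return SketchChat(metadata=dict(self.metadata))


chat = SketchPipeline().meta(prompt_version=1).run().meta(user="alice")

# Round-trip through JSON to show the metadata survives serialization.
restored = json.loads(json.dumps(chat.metadata))
print(restored)
```

Keys set before generation and keys set afterwards end up in the same dictionary, which is what makes metadata useful for later sorting and filtering.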
Generation Context and Additional Data
Chats maintain some additional data to understand more about the generation process:
- Chat.stop_reason
- Chat.usage
- Chat.extra
It’s the responsibility of the generator to populate these fields, and their content will vary depending on the underlying implementation. For instance, the transformers generator doesn’t provide any usage information, and the vllm generator adds metrics information to the extra field.
We intentionally keep these fields as generic as possible to allow for future expansion. You’ll often find deep information about the generation process in the Chat.extra field.