Tasks
Execution flows and work inside Runs
Tasks are a fundamental building block in Strikes that help structure and track your code execution. They exist inside runs and let you scope inputs, outputs, and metrics to a smaller unit of work. Tasks keep track of when and where they are called, both within each other and within the run. You can write your code the way you’d like and Strikes will track the flow.
We’ll cover some advanced use cases, but using tasks works just like calling functions, and should feel familiar if you’ve used any workflow framework. You might use tasks to represent one of your agents, data-loading code, a tool call, or the processing of a sample batch from a dataset.
What is a Task?
In Strikes, a task is a unit of work with:
- A well-defined input/output contract
- Tracing with execution time and relationships to other tasks
- The ability to scope and report metrics
- Storage for input and output objects
Creating Tasks
The most common way to create a task is by decorating a function:
Once decorated, your function will automatically:
- Track its execution time
- Store its input arguments
- Store its return value
- Create spans in the OpenTelemetry trace
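To make the behaviors listed above concrete, here is a minimal standard-library sketch of a decorator that records execution time, input arguments, and the return value. It is an illustration of the idea only, not the Strikes API: the names `task` and `RECORDS` are hypothetical, and the sketch does not emit OpenTelemetry spans.

```python
import functools
import time

# Hypothetical in-memory store used only for this sketch.
RECORDS = []


def task(func):
    """Wrap func so each call records its inputs, output, and duration."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        RECORDS.append(
            {
                "name": func.__name__,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
                "duration_s": time.perf_counter() - start,
            }
        )
        return result

    return wrapper


@task
def add(a, b):
    return a + b


total = add(2, 3)
```

The decorated function is still called like a normal function; the bookkeeping happens as a side effect of the call.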
When you need more flexible task boundaries or don’t want to refactor existing code, you can use the task span context manager:
This approach gives you more control over when the task starts and ends, and lets you manually log inputs and outputs.
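The context-manager style can be sketched with the standard library as follows. The `Span` class, the `task_span` function, and the `log_input` / `log_output` method names are assumptions made for this illustration and may not match the actual Strikes signatures.

```python
import time
from contextlib import contextmanager


class Span:
    """Hypothetical span object that holds manually logged data."""

    def __init__(self, name):
        self.name = name
        self.inputs = {}
        self.outputs = {}
        self.duration_s = None

    def log_input(self, key, value):
        self.inputs[key] = value

    def log_output(self, key, value):
        self.outputs[key] = value


@contextmanager
def task_span(name):
    """Open a span on entry and record its duration on exit."""
    span = Span(name)
    start = time.perf_counter()
    try:
        yield span
    finally:
        span.duration_s = time.perf_counter() - start


with task_span("load-data") as span:
    rows = [1, 2, 3]
    span.log_input("source", "in-memory list")
    span.log_output("row_count", len(rows))
```

The span starts when the `with` block is entered and ends when it exits, so the task boundary can wrap any region of existing code without turning it into a function.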
Task Configuration
Tasks can be configured with several options:
Working with Task Results
When you call a task, you either get the result of the task (task()) or, by calling .run() on the task, a raw TaskSpan object (task.run()) that provides access to the task’s context, metrics, and output.
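The difference between the two call styles can be sketched like this. The classes below are simplified, synchronous stand-ins written for this example; only the names `TaskSpan` and `.run()` come from the text above, and the fields on the span are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TaskSpan:
    """Hypothetical stand-in holding a task's output and metrics."""

    name: str
    output: Any = None
    metrics: dict = field(default_factory=dict)


class Task:
    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def run(self, *args, **kwargs):
        # Return the span object so callers can inspect output and metrics.
        span = TaskSpan(name=self.name)
        span.output = self.func(*args, **kwargs)
        return span

    def __call__(self, *args, **kwargs):
        # Calling the task directly returns only the result.
        return self.run(*args, **kwargs).output


@Task
def square(x):
    return x * x


value = square(4)     # plain result
span = square.run(4)  # span object wrapping the result
```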
Logging Data within Tasks
Within tasks, you can explicitly log data using several methods:
Data logged within a task is automatically associated with that task’s span, making it easy to track the flow of data through your system.
Task Execution Patterns
Tasks support several execution patterns to handle different workflows:
Error Handling
Any task that raises an exception will be marked as failed in the UI. You can handle errors using the try_() and try_map() methods:
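The sketch below illustrates one plausible behavior for these methods: `try_()` returns `None` when the task fails, and `try_map()` drops failed items from its results. Both semantics are assumptions made for this example, and the `Task` class is a simplified synchronous stand-in rather than the Strikes implementation.

```python
class Task:
    """Hypothetical task wrapper with failure-tolerant call variants."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args):
        return self.func(*args)

    def try_(self, *args):
        # Assumed semantics: swallow the exception and return None.
        try:
            return self.func(*args)
        except Exception:
            return None

    def try_map(self, items):
        # Assumed semantics: keep successful results, skip failures.
        results = []
        for item in items:
            try:
                results.append(self.func(item))
            except Exception:
                continue
        return results


@Task
def parse_int(text):
    return int(text)


single = parse_int.try_("not a number")
batch = parse_int.try_map(["1", "x", "3"])
```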
Measuring Task Performance
One of the most powerful features of tasks is their ability to measure and track performance:
When the task runs, the scorer will automatically evaluate the output and log a metric with the score.
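As a rough illustration of the scorer idea, the following sketch runs each scorer on a task's output and collects the scores as metrics keyed by scorer name. The `task` decorator, its `scorers` parameter, and the returned `(output, metrics)` pair are hypothetical and exist only for this example.

```python
import functools


def task(scorers=()):
    """Hypothetical decorator that scores the output of the wrapped function."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            output = func(*args, **kwargs)
            # One metric per scorer, named after the scorer function.
            metrics = {s.__name__: float(s(output)) for s in scorers}
            return output, metrics

        return wrapper

    return decorator


def length_score(text):
    return len(text)


@task(scorers=[length_score])
def summarize(text):
    return text[:10]


output, metrics = summarize("Tasks scope inputs and outputs")
```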
Finding the Best Results
Tasks also provide methods to filter and sort results based on metrics:
This pattern is particularly useful for generative tasks where you want to generate multiple candidates and pick the best ones.
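The generate-then-select pattern reduces to sorting candidates by a score and keeping the first few. A minimal sketch, using a hypothetical `top_n` helper and string length as a placeholder score:

```python
def top_n(candidates, score, n):
    """Return the n highest-scoring candidates, best first."""
    return sorted(candidates, key=score, reverse=True)[:n]


candidates = ["ok", "a longer answer", "medium one"]
best = top_n(candidates, score=len, n=2)
```

In practice the score would come from a metric logged by the task rather than from `len`.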
Understanding Labels
Every task in Strikes has both a name and a label:
How Labels Work
Labels play an important role in organizing and identifying metrics within your tasks, as outlined below:
- Default Derivation: If you don’t specify a label, it’s automatically derived from the function name by converting it to lowercase and replacing spaces with underscores.
- Label Usage: Labels are used internally to:
  - Prefix metrics logged within the task
  - Create namespaces for data organization
  - Enable filtering in the UI and exports
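The default derivation rule described above (lowercase, spaces replaced with underscores) can be written as a one-line helper. `derive_label` is a hypothetical name used for illustration; the real implementation may normalize other characters as well.

```python
def derive_label(name: str) -> str:
    """Lowercase the name and replace spaces with underscores."""
    return name.lower().replace(" ", "_")


label = derive_label("Tokenize Text")
```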
Label Impact on Data
The most important thing to understand about labels is how they affect metrics:
When this task logs a metric named token_count, that metric is:
- Stored with the task span as token_count
- Mirrored at the run level with the label as a prefix: tokenize.token_count
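The two storage locations can be sketched as follows. The `Run` and `TaskSpan` classes here are simplified, hypothetical stand-ins; only the `tokenize` label and the `token_count` metric name come from the example above.

```python
class Run:
    """Hypothetical run-level metric store."""

    def __init__(self):
        self.metrics = {}


class TaskSpan:
    """Hypothetical task span that mirrors its metrics to the run."""

    def __init__(self, label, run):
        self.label = label
        self.run = run
        self.metrics = {}

    def log_metric(self, name, value):
        # Stored on the span under the plain metric name.
        self.metrics[name] = value
        # Mirrored on the run under "<label>.<name>".
        self.run.metrics[f"{self.label}.{name}"] = value


run = Run()
span = TaskSpan("tokenize", run)
span.log_metric("token_count", 42)
```

Because the run-level key carries the label, metrics with the same name from different tasks do not collide.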
Best Practices
- Keep tasks focused: Each task should do one thing well, making it easier to trace and debug.
- Use meaningful names: Task names appear in the UI, so make them human-readable.
- Log relevant data: Be intentional about what you log as inputs, outputs, and metrics.
- Handle errors appropriately: Use try_run() and similar methods to handle task failures gracefully.
- Use tasks to structure your code: Tasks help create natural boundaries in your application.
- Combine with Rigging tools: Tasks work seamlessly with Rigging tools for LLM agents.