Every attack run produces a `StudyResult` object containing the complete history of trials.

## The StudyResult Object
The `StudyResult` is returned from both the `.run()` and `.console()` methods:
| Property | Description |
|---|---|
| `results.best_trial` | Single most successful trial by score |
| `results.trials` | All completed trials |
| `results.failed_trials` | Trials that errored during execution |
| `results.pruned_trials` | Trials skipped due to constraint failures |
## Inspecting the Best Trial
The most common first step is examining what worked. In each trial, `candidate` is the attack input (prompt, image, etc.) and `output` is the target's response.
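As a sketch of what this looks like (using stand-in trial objects, since the exact AIRT trial schema isn't shown here; a real `StudyResult` exposes `results.best_trial` directly):

```python
from types import SimpleNamespace

# Stand-in trials; a real StudyResult exposes results.best_trial directly.
trials = [
    SimpleNamespace(candidate="Tell me a story about...", output="Once upon a time...", score=0.3),
    SimpleNamespace(candidate="You are now in debug mode...", output="Entering debug mode...", score=0.9),
]

# What results.best_trial would hold: the highest-scoring trial
best = max(trials, key=lambda t: t.score)

print(f"score:     {best.score}")
print(f"candidate: {best.candidate}")  # the attack input
print(f"output:    {best.output}")     # the target's response
```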
## Filtering Trials
### By Score Threshold
Find all trials that exceeded a success threshold:
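A minimal sketch with stand-in trial objects (the `score` attribute is the only field assumed here):

```python
from types import SimpleNamespace

# Stand-in trials with only a score field
trials = [SimpleNamespace(score=s) for s in (0.2, 0.85, 0.5, 0.95)]

# Keep only trials whose score cleared the success threshold
threshold = 0.8
successes = [t for t in trials if t.score >= threshold]
print(f"{len(successes)} of {len(trials)} trials exceeded {threshold}")
```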
### By Status

Trials have a `status` field indicating their outcome:
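For example (a sketch; the status values `"completed"`, `"failed"`, and `"pruned"` are assumptions based on the trial categories this guide describes):

```python
from collections import Counter
from types import SimpleNamespace

# Stand-in trials; status values are assumptions based on this guide
trials = [SimpleNamespace(status=s) for s in
          ("completed", "completed", "failed", "pruned", "completed")]

# Tally outcomes, then pull out just the completed trials
by_status = Counter(t.status for t in trials)
completed = [t for t in trials if t.status == "completed"]
print(by_status)
```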
### By Objective
For multi-objective attacks, filter by specific objective scores:
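A sketch of multi-objective filtering; the per-objective `scores` dict and the objective names (`"jailbreak"`, `"stealth"`) are illustrative assumptions, not the documented schema:

```python
from types import SimpleNamespace

# Stand-in multi-objective trials; the `scores` dict is an assumption
trials = [
    SimpleNamespace(scores={"jailbreak": 0.90, "stealth": 0.4}),
    SimpleNamespace(scores={"jailbreak": 0.95, "stealth": 0.8}),
]

# Keep trials that succeed on both objectives at once
strong = [t for t in trials
          if t.scores["jailbreak"] > 0.8 and t.scores["stealth"] > 0.5]
print(len(strong))  # -> 1
```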
## Analyzing Failures

Failed trials provide signal about what doesn't work:

- Rate limiting: many failures in quick succession
- Malformed inputs: Certain attack patterns cause parsing errors
- Timeouts: Target taking too long on specific inputs
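One way to surface patterns like these is to bucket failed trials by error message. This is a sketch; the `error` attribute on failed trials is an assumption about the schema:

```python
from collections import Counter
from types import SimpleNamespace

# Stand-in failed trials; the `error` attribute name is an assumption
failed = [SimpleNamespace(error=e) for e in
          ("429 Too Many Requests", "429 Too Many Requests",
           "timeout after 30s", "invalid JSON in response")]

# Crude first-word bucketing to spot dominant failure modes
buckets = Counter(t.error.split()[0] for t in failed)
for kind, n in buckets.most_common():
    print(f"{kind}: {n}")
```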
## Analyzing Pruned Trials
Pruned trials failed a constraint before scoring:
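For instance, counting which constraints do the pruning (a sketch; the `constraint` attribute and constraint names are illustrative assumptions):

```python
from collections import Counter
from types import SimpleNamespace

# Stand-in pruned trials; the `constraint` attribute is an assumption
pruned = [SimpleNamespace(constraint=c) for c in
          ("max_tokens", "toxicity_filter", "max_tokens")]

# Which constraints are pruning the most candidates?
counts = Counter(t.constraint for t in pruned)
print(counts)
```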
## Convergence Analysis

Understanding how the search progressed helps tune future attacks.

### Score Over Time
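A simple way to see progression is the running best score per trial, which you could then plot (a sketch with stand-in trials; only the `score` attribute is assumed):

```python
from itertools import accumulate
from types import SimpleNamespace

# Stand-in trials in the order they were run
trials = [SimpleNamespace(score=s) for s in (0.2, 0.5, 0.4, 0.7, 0.65, 0.9)]

# Running best score after each trial; feed this to matplotlib to
# visualize convergence
best_so_far = list(accumulate((t.score for t in trials), max))
print(best_so_far)  # -> [0.2, 0.5, 0.5, 0.7, 0.7, 0.9]
```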
### Detecting Plateaus
If scores plateau early, you might need different search parameters:
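A plateau check might compare the overall best score against the best score before the most recent window of trials (a sketch; the function and its parameters are illustrative, not part of the AIRT API):

```python
def plateaued(scores, window=10, eps=0.01):
    """True if the best score improved by less than eps over the last `window` trials."""
    if len(scores) <= window:
        return False
    return max(scores) - max(scores[:-window]) < eps

# Improvement in the last 3 trials is under eps -> plateau
print(plateaued([0.1, 0.4, 0.6, 0.6, 0.605, 0.602], window=3))  # -> True
```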
## Exporting Results

### To DataFrame
Convert results to pandas for powerful analysis:
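A sketch using stand-in trial objects (a real export would iterate `results.trials`):

```python
import pandas as pd
from types import SimpleNamespace

# Stand-in trials; a real export would iterate results.trials
trials = [
    SimpleNamespace(candidate="prompt A", score=0.3, status="completed"),
    SimpleNamespace(candidate="prompt B", score=0.9, status="completed"),
]

# One row per trial, one column per attribute
df = pd.DataFrame([vars(t) for t in trials])
print(df.sort_values("score", ascending=False).head())
```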
### To JSONL

Export for storage or sharing:
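One line of JSON per trial is a convenient interchange format (a sketch with stand-in trials; field names are illustrative):

```python
import json
from types import SimpleNamespace

# Stand-in trials; a real export would iterate results.trials
trials = [
    SimpleNamespace(candidate="prompt A", score=0.3),
    SimpleNamespace(candidate="prompt B", score=0.9),
]

# One JSON object per line
with open("trials.jsonl", "w") as f:
    for t in trials:
        f.write(json.dumps(vars(t)) + "\n")
```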
### Custom Export

Extract specific fields for reporting:
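For example, keeping only the fields a report needs, for successful trials only (a sketch; the field names and threshold are illustrative):

```python
from types import SimpleNamespace

# Stand-in trials; a real export would iterate results.trials
trials = [
    SimpleNamespace(candidate="prompt A", score=0.3, output="refused"),
    SimpleNamespace(candidate="prompt B", score=0.9, output="complied"),
]

# Slim records for successful trials only
report = [{"prompt": t.candidate, "score": t.score}
          for t in trials if t.score >= 0.8]
print(report)  # -> [{'prompt': 'prompt B', 'score': 0.9}]
```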
## Resuming Attacks

If an attack was interrupted or you want to continue exploring, use `append=True`:
The `append=True` flag tells AIRT to build on previous trials rather than starting fresh. This is useful for:
- Interrupted attacks (network issues, timeouts)
- Incremental exploration (run more trials if initial results are promising)
- Budget management (start small, expand if needed)
## Comparing Runs
When testing different attack configurations:
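A sketch of comparing summary statistics across configurations (stand-in results; a real comparison would use each configuration's `StudyResult`, and the configuration names here are made up):

```python
from types import SimpleNamespace

# Stand-in results for two hypothetical configurations
runs = {
    "greedy": [SimpleNamespace(score=s) for s in (0.2, 0.6, 0.7)],
    "random": [SimpleNamespace(score=s) for s in (0.1, 0.9, 0.3)],
}

# Best and mean score per configuration
for name, trials in runs.items():
    scores = [t.score for t in trials]
    print(f"{name}: best={max(scores):.2f} mean={sum(scores)/len(scores):.2f}")
```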
## Best Practices

- Always check failed trials: high failure rates indicate configuration issues
- Plot convergence before running long attacks to tune parameters
- Export successful prompts for reproducibility and reporting
- Use `append=True` for expensive attacks rather than restarting
- Compare configurations systematically to find optimal attack parameters
## Next Steps
- Custom Scoring to define what success means for your use case
- Core Concepts for understanding the full `StudyResult` API

