SP Agent Team Token Report — Week of 2026-05-24

---
title: "Weekly Token Optimization Report: The Opus Plateau"
date: "2026-05-24"
category: "Engineering"
---

## This Week's Numbers

Our token consumption metrics for the week of May 18 to May 24 show model usage shifting from Opus to Sonnet as the week progressed. We processed **397 new turns** across **24 sessions**, with activity heavily concentrated in the first half of the week.

| Metric | Value | Status |
| :--- | :--- | :--- |
| **Weekly Avg Opus%** | 51% | 🟡 Yellow |
| **Target Opus%** | < 50% | - |
| **New Project Files** | 29 | - |
| **Hermes Sessions** | 78 | ✅ Active |

## What Changed

The most striking trend this week is the sharp decline in Claude 3 Opus utilization. On May 18, Opus handled 69% of turns (148 turns). From May 20 through May 24, **Opus usage dropped to 0%**, with every request routed to Sonnet.

Despite this late-week efficiency, the heavy load from the first two days kept our weekly average at **51%**, just above the 50% target.

## Wins

- **Successful Model Migration**: The complete shift to Sonnet from May 20 to May 24 shows that our agentic workflows are becoming less dependent on the "heavy" Opus model for daily operations.
- **Hermes Stability**: The Hermes Agent (running `qwen3-235b`) has taken over all `cron` platform tasks, holding a steady rhythm of 11-12 sessions per day with zero failures.

## Challenges

The "Opus Plateau" is our primary hurdle. With a weekly average of 51%, a few high-intensity days of architectural planning or complex debugging (like May 18) are enough to skew the whole week's efficiency metrics. We need to decompose those complex "Opus-tier" tasks into smaller, Sonnet-capable sub-tasks.

## Next Week's Target

- **Primary Goal**: Bring Weekly Avg Opus% down to **< 45%**.
- **Secondary Goal**: Increase Hermes utilization for non-cron research tasks.
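
The < 45% goal can be restated as a concrete turn budget. The sketch below is illustrative only: it assumes Opus% is turn-weighted (Opus turns divided by all turns) and that next week's volume matches this week's 397 turns. `opus_turn_budget` is a hypothetical helper name, not an existing tool.

```python
import math


def opus_turn_budget(total_turns: int, target_share: float) -> int:
    """Largest number of Opus turns that keeps Opus% strictly below target_share.

    Assumes a turn-weighted metric: opus_turns / total_turns.
    """
    if total_turns <= 0:
        return 0
    # Strictly below the target means opus_turns < total_turns * target_share.
    budget = math.ceil(total_turns * target_share) - 1
    return max(budget, 0)


# Assumption: next week has the same 397 turns as this week.
print(opus_turn_budget(397, 0.45))  # 178
```

Under those assumptions, any week with more than 178 Opus turns misses the primary goal regardless of how the turns are spread across days.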

## Dispatch Optimization

To hit our targets, we will make the following dispatch changes, moving work from Claude to Hermes wherever possible:

1. **Repetitive Log Analysis**: Move all automated system health checks from Sonnet to Hermes.
2. **Initial Research Drafting**: Use Hermes to generate the first pass of documentation, then route it to Sonnet for refinement.
3. **Cron-based Triggering**: Expand the `cron` platform further so it handles more agentic polling.
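
The dispatch rules above could be expressed as a small routing table. This is a minimal sketch rather than our actual dispatcher: the `Task` categories, the model labels, and the `dispatch` function are all hypothetical names chosen for illustration, and routing architecture work to Opus is an assumption based on the "Opus-tier" tasks described earlier.

```python
from enum import Enum


class Task(Enum):
    LOG_ANALYSIS = "log_analysis"
    RESEARCH_DRAFT = "research_draft"
    CRON_POLLING = "cron_polling"
    CODE_IMPLEMENTATION = "code_implementation"
    ARCHITECTURE = "architecture"


# Hypothetical routing table: each task maps to an ordered pipeline of models.
ROUTES = {
    Task.LOG_ANALYSIS: ["hermes"],             # health checks move off Sonnet
    Task.RESEARCH_DRAFT: ["hermes", "sonnet"], # Hermes drafts, Sonnet refines
    Task.CRON_POLLING: ["hermes"],             # cron-triggered agentic polling
    Task.CODE_IMPLEMENTATION: ["sonnet"],      # implementation stays on Sonnet
    Task.ARCHITECTURE: ["opus"],               # assumed: planning remains Opus-tier
}


def dispatch(task: Task) -> list[str]:
    """Return the ordered list of models that should handle the task."""
    return list(ROUTES[task])


print(dispatch(Task.RESEARCH_DRAFT))  # ['hermes', 'sonnet']
```

Keeping the routes in one table makes it easy to audit which task types can still reach Opus.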

## Cost Savings

By running our 78 weekly automated sessions on Hermes (priced at a DeepSeek-like tier) instead of Claude 3 Sonnet, we cut the cost of each session substantially.

- **Hermes Cost**: \$0.003 per query × 78 queries ≈ **\$0.23**
- **Est. Sonnet Cost**: Assuming an average of \$0.01 per agentic turn × 78 ≈ **\$0.78**
- **Weekly Savings**: About \$0.55. The absolute dollar amount is small at this scale, but the **~70% reduction in cost per session** validates the strategy as we scale to thousands of sessions.
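
The arithmetic behind these figures can be reproduced with a short script. It uses only the rates quoted above; the \$0.01 Sonnet rate is the report's own estimate, and `weekly_costs` is a hypothetical helper name.

```python
HERMES_RATE = 0.003  # dollars per Hermes query (from the report)
SONNET_RATE = 0.01   # dollars per Sonnet turn (the report's assumed average)
SESSIONS = 78        # weekly automated Hermes sessions


def weekly_costs(sessions: int, hermes_rate: float, sonnet_rate: float):
    """Return (hermes_cost, sonnet_cost, savings, fractional_reduction)."""
    hermes = sessions * hermes_rate
    sonnet = sessions * sonnet_rate
    savings = sonnet - hermes
    reduction = savings / sonnet
    return hermes, sonnet, savings, reduction


hermes, sonnet, savings, reduction = weekly_costs(SESSIONS, HERMES_RATE, SONNET_RATE)
print(f"Hermes:    ${hermes:.2f}")     # Hermes:    $0.23
print(f"Sonnet:    ${sonnet:.2f}")     # Sonnet:    $0.78
print(f"Savings:   ${savings:.2f}")    # Savings:   $0.55
print(f"Reduction: {reduction:.0%}")   # Reduction: 70%
```

Because the reduction depends only on the ratio of the two rates, it stays at roughly 70% at any session count, as long as both per-session rates hold.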

## Recommendations

Based on the current data trends, the team should implement the following:

1. **Route more implementation to Sonnet**: Since Opus% is currently 51% (> 50%), we must strictly enforce Sonnet for all code implementation tasks.
2. **Expand Hermes Scope**: Hermes is stable on `cron` tasks but currently limited to them. We should move more general research tasks to Hermes to further decrease Claude dependency.
3. **Audit High-Turn Days**: Analyze the May 18th session (148 Opus turns) to determine if those tasks could have been handled by a multi-agent Sonnet swarm.
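
As a starting point for the audit in item 3, a per-day check could flag the days whose Opus share exceeds the target. This is a sketch under stated assumptions: `flag_high_opus_days` is a hypothetical helper, the May 18 total of 214 turns is back-calculated from the reported 148 Opus turns at 69%, and the May 20 total is a placeholder, not measured data.

```python
def flag_high_opus_days(daily: dict[str, tuple[int, int]], threshold: float = 0.50) -> list[str]:
    """Return the days whose Opus share (opus_turns / total_turns) exceeds threshold.

    `daily` maps a date string to (opus_turns, total_turns).
    Days with no turns are skipped to avoid dividing by zero.
    """
    flagged = []
    for day, (opus_turns, total_turns) in sorted(daily.items()):
        if total_turns > 0 and opus_turns / total_turns > threshold:
            flagged.append(day)
    return flagged


# Illustrative input only.
sample = {
    "2026-05-18": (148, 214),  # 148 Opus turns; total derived from the reported 69%
    "2026-05-20": (0, 30),     # Opus at 0%; the total of 30 is a placeholder
}
print(flag_high_opus_days(sample))  # ['2026-05-18']
```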