OpenClaw Cost Dashboard: Track Agent Spend by Job, Tool, and Model


Token costs in OpenClaw accumulate silently. A cron job that runs four times a day might consume 50,000 tokens per run across two model providers. A sub-agent that handles customer queries might spike to 200,000 tokens during peak hours. The agent does not know or care what the tokens cost; it optimizes for task completion. The bill arrives at the end of the month, and the question becomes: which jobs, which tools, and which models drove the spend?

This article describes the cost visibility problem in agentic AI, outlines a practical approach to solving it with event metadata that OpenClaw and its plugin ecosystem already produce, and explains where Zedly Shield is on the path to native cost tracking.

The Invisible Cost Problem

Agentic AI has a cost structure that is different from traditional API usage. In a standard LLM integration, you control the prompt, you send one request, and you get one response. The cost is predictable and directly attributable to the code that made the call. In an agentic workflow, the agent decides:

  • How many model calls to make: a simple task might take one call; a complex task might take ten, including tool-use loops where the agent calls a tool, reads the result, and calls the model again.
  • Which model to use: if your agent configuration supports multiple providers (OpenAI for reasoning, Anthropic for code, a local model for simple tasks), the cost per call varies by an order of magnitude.
  • How much context to include: the agent's context window grows as the conversation progresses. Later calls in a session include more tokens (all the prior messages, tool results, and memory) than earlier calls.
  • Which tools to invoke: tools like browser automation or code execution may trigger additional model calls internally (e.g., the model processes the browser's page content or the code execution output).

The result is that cost per session is variable, cost per job is unpredictable, and cost per tool is invisible. Without instrumentation, you are flying blind.

Where Cost Data Lives in OpenClaw

OpenClaw produces several data points relevant to cost tracking, but they are scattered across different subsystems:

  • Diagnostics telemetry: llm_input and llm_output events with token counts (see diagnostics docs). Limitation: not aggregated; requires external processing to sum by session or job.
  • Provider API responses: token usage in the response body (usage.prompt_tokens, usage.completion_tokens). Limitation: available per call; not persisted by OpenClaw in a queryable format.
  • Session metadata: session ID, creation time, model configuration. Limitation: does not include cost or token fields.
  • Plugin hooks (llm_input/llm_output): access to the full request and response, including token usage. Limitation: observational only; requires a plugin to capture and store the data.

The pattern is familiar from cron run history and tool call history: the raw data exists, but assembling it into a useful view requires a capture-and-aggregate layer.
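The capture side of that layer can be very small: one structured record per model call, appended to a log that later aggregation steps consume. A minimal sketch in Python (the field names and the hook that would populate them are illustrative, not OpenClaw's actual schema):

```python
import json
import time
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class LLMCallEvent:
    """One record per model call, as a capture plugin might emit it
    from an llm_output hook. Field names are illustrative."""
    session_id: str
    model: str
    input_tokens: int
    output_tokens: int
    tool: Optional[str] = None   # tool whose output fed this call, if any
    ts: float = 0.0

def append_event(event: LLMCallEvent, path: str = "llm_events.jsonl") -> None:
    """Append-only JSONL log: one line per call, trivial to aggregate later."""
    with open(path, "a") as f:
        f.write(json.dumps(asdict(event)) + "\n")

append_event(LLMCallEvent("sess-42", "gpt-4o", 1200, 350, tool="browser", ts=time.time()))
```

Append-only JSONL keeps the capture path cheap and crash-safe; every view described below is a fold over this file.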

What a Cost Dashboard Should Show

A cost dashboard for agentic AI workflows needs to support three levels of granularity:

Per-job breakdown

Which cron jobs and interactive sessions are the most expensive? This view ranks sessions by total token cost, making it easy to spot runaway jobs. A weekly report that costs $0.50 per run is fine; one that costs $15 because the agent is re-reading the entire document tree every time is a problem worth fixing.

Per-model breakdown

If your agent uses multiple models, which model is driving the most spend? This view shows cost by provider and model (GPT-4o vs. Claude Sonnet vs. a local model), helping you evaluate whether the model routing is cost-effective. Maybe 80% of tasks could run on a cheaper model without quality loss. Marketing workflows are a prime example: trend monitoring, sentiment tracking, and content extraction run well on budget models at a fraction of the cost.

Per-tool breakdown

Which tools trigger the most follow-up model calls? A read that loads a 10,000-line file dumps that content into the context window, inflating the token count for every subsequent call. A tool call timeline correlated with token costs shows which tools are the most expensive in practice.

Cost is not just tokens. Some tools incur non-token costs: API calls to external services (search, geocoding, payment APIs), compute time for code execution sandboxes, and storage costs for files written to cloud storage. A complete cost view accounts for both token costs (from model calls) and operational costs (from tool execution). Token costs are easier to track because the data is in the API response; operational costs require per-tool instrumentation.
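One simple way to build the per-tool view is to attribute each model call's tokens to the tool call that immediately preceded it in the session timeline. This is a heuristic sketch over a hypothetical event format, not an OpenClaw API; it charges a tool for the model call that reads its output, not for context growth in later calls:

```python
from collections import defaultdict

def tokens_by_tool(events):
    """Attribute each LLM call's tokens to the most recent tool call in the
    same session. `events` is a time-ordered list of dicts with a "type" of
    either "tool_call" (with "tool") or "llm_call" (with token counts)."""
    last_tool = {}             # session_id -> most recent tool name
    totals = defaultdict(int)  # tool name -> attributed tokens
    for e in events:
        sid = e["session_id"]
        if e["type"] == "tool_call":
            last_tool[sid] = e["tool"]
        elif e["type"] == "llm_call":
            tool = last_tool.get(sid, "(no tool)")
            totals[tool] += e["input_tokens"] + e["output_tokens"]
    return dict(totals)

timeline = [
    {"session_id": "s1", "type": "tool_call", "tool": "read_file"},
    {"session_id": "s1", "type": "llm_call", "input_tokens": 9000, "output_tokens": 400},
    {"session_id": "s1", "type": "tool_call", "tool": "web_search"},
    {"session_id": "s1", "type": "llm_call", "input_tokens": 2000, "output_tokens": 300},
]
print(tokens_by_tool(timeline))  # {'read_file': 9400, 'web_search': 2300}
```

Even this rough attribution is usually enough to spot the pattern described above: a file read that dumps thousands of lines into the context shows up immediately as the most expensive tool.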

Building Cost Visibility from Event Metadata

If you already have a tool call audit log capturing events with session IDs and tool names, adding cost data is an incremental step:

  1. Capture token counts from llm_input/llm_output hooks or from the usage fields in provider response bodies. Add inputTokens, outputTokens, and model fields to the event schema.
  2. Map tokens to cost using a pricing table (per-model, per-token-type rates). This table needs to be maintained as providers update pricing.
  3. Aggregate by session to produce per-run cost summaries. This extends the existing run summary with a cost column.
  4. Aggregate by model and tool to produce the per-model and per-tool breakdowns.
  5. Set up alerting for sessions that exceed a cost threshold. A cron job that normally costs $0.50 but suddenly costs $5.00 should trigger a notification.
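Steps 2, 3, and 5 above can be sketched together in a few lines. The pricing figures here are placeholders, not current provider rates, and the event dicts stand in for whatever your capture layer emits:

```python
# Placeholder per-million-token rates; maintain against provider price pages.
PRICING = {
    "gpt-4o":      {"input": 2.50, "output": 10.00},
    "local-small": {"input": 0.00, "output": 0.00},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Map one call's token counts to dollars using the pricing table."""
    rates = PRICING[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

def session_costs(events):
    """Aggregate per-call costs into per-session totals (step 3)."""
    totals = {}
    for e in events:
        cost = call_cost(e["model"], e["input_tokens"], e["output_tokens"])
        totals[e["session_id"]] = totals.get(e["session_id"], 0.0) + cost
    return totals

def over_budget(totals, threshold: float):
    """Sessions whose total cost crossed the alert threshold (step 5)."""
    return {sid: c for sid, c in totals.items() if c > threshold}

events = [
    {"session_id": "cron-weekly", "model": "gpt-4o", "input_tokens": 400_000, "output_tokens": 20_000},
    {"session_id": "cron-daily", "model": "local-small", "input_tokens": 50_000, "output_tokens": 5_000},
]
totals = session_costs(events)
print(over_budget(totals, threshold=1.00))
```

The pricing table is the only part that needs ongoing maintenance; everything else is a pure function of the event stream.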

The event stream is the backbone. Cost data is just another dimension layered on top of the same events that power your audit log, run history, and tool call dashboard. This is why investing in a structured event pipeline pays compounding returns: each new view is an aggregation of existing data, not a new data collection effort.

Where Zedly Shield Stands Today

Shield does not track token costs today. What it does provide is the foundation that makes cost tracking a natural next step rather than a greenfield build.

Here is what Shield already delivers:

  • Session-scoped events: every tool call, policy decision, and lifecycle event is tagged with a session ID. This is the grouping key that per-job cost aggregation will use.
  • Runs tab with per-session aggregations: the Runs tab already computes duration, tool count, event count, and block count for each session. Adding a cost column is an incremental enrichment to an existing view, not a new pipeline.
  • Multi-instance fleet visibility: for teams running OpenClaw across multiple deployments, the dashboard aggregates events from all instances. When cost data arrives, it follows the same aggregation path for fleet-wide spend visibility.

What is coming next: Token-level cost tracking is the next major ops expansion on Shield's roadmap. The plan is to hook into OpenClaw's llm_input / llm_output diagnostics, capture model identity and token counts per call, and surface per-run and per-model cost breakdowns in the dashboard. The event schema, forwarding pipeline, and dashboard rendering are all designed to accept new metric fields without architectural changes.

In the meantime, Shield's event stream can be exported to external analytics tools (pandas, a data warehouse, or a BI dashboard) where you can join event data with provider billing data to produce cost reports today. The immutable event log ensures the data you analyze is trustworthy. For practical strategies on reducing agent costs through model routing (assigning budget models to high-volume tasks), see our guide to OpenClaw for enterprise.
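As a concrete illustration of that export path, here is a hypothetical pandas report joining exported events against a pricing table. The column names and rates are assumptions for the sketch; Shield does not emit token fields today, so these would come from your own capture plugin:

```python
import pandas as pd

# Exported event stream (columns are illustrative of what a capture
# plugin could emit alongside Shield's session-scoped events).
events = pd.DataFrame([
    {"session_id": "cron-weekly", "model": "gpt-4o", "input_tokens": 120_000, "output_tokens": 8_000},
    {"session_id": "support-bot", "model": "claude-sonnet", "input_tokens": 60_000, "output_tokens": 12_000},
])

# Per-million-token rates (placeholders; keep in sync with provider pricing).
pricing = pd.DataFrame([
    {"model": "gpt-4o", "in_rate": 2.50, "out_rate": 10.00},
    {"model": "claude-sonnet", "in_rate": 3.00, "out_rate": 15.00},
])

report = events.merge(pricing, on="model")
report["cost_usd"] = (report["input_tokens"] * report["in_rate"]
                      + report["output_tokens"] * report["out_rate"]) / 1_000_000
print(report.groupby("session_id")["cost_usd"].sum().sort_values(ascending=False))
```

The same merge-and-group pattern extends to per-model and per-tool breakdowns by changing the groupby key.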

Start With Visibility, Add Cost Tracking When It Ships

Zedly Shield already gives you session-level visibility into tool calls, policy blocks, and run history across your OpenClaw fleet. Cost tracking is the next feature on the roadmap. Install the free plugin today so the event pipeline is in place when per-model spend breakdowns arrive.

Explore Zedly Shield

Frequently Asked Questions

Does OpenClaw track token costs automatically?

OpenClaw tracks some token usage through its diagnostics telemetry, including llm_input and llm_output events that include token counts. However, this data is not aggregated into a cost view. Token counts are available per-call, but mapping them to dollar amounts requires knowing the provider's pricing for the specific model and token type (input vs. output). A cost dashboard performs this mapping and aggregation.

How do I calculate cost per cron job?

You need two data sources: the token usage per LLM call (from OpenClaw's diagnostics or from the usage fields in the provider's API response body), and the session grouping that maps LLM calls to cron jobs (from the session ID). If you are already capturing events with a tool call audit log, adding token counts to those events gives you the per-session aggregation. Multiply token counts by the model's per-token rate to get cost: at a hypothetical rate of $2.50 per million input tokens, a run that consumed 400,000 input tokens cost $1.00 on input alone.

Which OpenClaw costs are hardest to track?

Tool execution costs that are not token-based: API calls to external services with their own billing (like search APIs, geocoding, or payment processors), compute time for long-running code execution, and storage costs for files written to disk or cloud storage. These costs are incurred by tool calls, not by model calls, so they require a different tracking mechanism than token counting. A tool call history dashboard helps identify these indirect costs.

Can I set cost alerts or budgets?

With event-level cost data, alerting is straightforward: sum the cost per session (or per time window) and trigger an alert when it exceeds a threshold. Budgets are harder because they require forecasting: if the current run is 60% through and has consumed 80% of its budget, should you stop it? The event stream gives you the data for both alerting (reactive) and budgeting (proactive), but the logic for budget enforcement is application-specific.
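The forecasting half of that question can start with a naive linear burn-rate projection. This is a sketch under a strong assumption (spend scales linearly with progress, which agent runs rarely honor), so treat it as an early-warning signal rather than an enforcement rule:

```python
def projected_cost(spent_so_far: float, progress: float) -> float:
    """Extrapolate final cost assuming spend is linear in progress
    (0 < progress <= 1). A rough signal, not a guarantee."""
    return spent_so_far / progress

def budget_decision(spent: float, progress: float, budget: float) -> str:
    """Reactive stop on overspend; proactive warning on projected overspend."""
    if spent > budget:
        return "stop: budget exceeded"
    projected = projected_cost(spent, progress)
    if projected > budget:
        return f"warn: projected ${projected:.2f} exceeds ${budget:.2f} budget"
    return "ok"

# The scenario from above: 60% through the run, 80% of budget consumed.
print(budget_decision(spent=0.80, progress=0.60, budget=1.00))
```

Whether a "warn" should pause the run, downgrade the model, or just notify a human is the application-specific part the answer above refers to.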

Does Zedly Shield track costs today?

Not yet. Shield currently captures tool call events, session metadata, and run summaries, but it does not capture token counts or model identity. Token-level cost tracking is the next major ops feature on the roadmap. The good news is that all of the infrastructure — session-scoped events, the Runs tab with per-session aggregations, and the dashboard — is already in place. When token data is added to the event schema, cost breakdowns will flow through the same pipeline with no architectural changes.

Ready to get started?

Runtime safety for agentic AI. PII redaction, policy-based blocking, and tamper-evident audit logs for OpenClaw.