Workforce operations
Runbook for the task-ops automation pack — rollout waves, the kill switch, symptom→action table, and where to look when agent automation misbehaves.
This is the operator's page for the AI-workforce automation. The product surfaces live in the app (Agents → Workforce is the operational home); this page covers rollout, the kill switch, and diagnosis.
Rollout waves
The pack provisions per organization with sticky opt-outs (org edits always win; a re-run never re-activates triggers an admin disabled):
- Wave 0 — dogfood: enable for an internal organization (provisioning/provision_task_ops_pack with activation), watch the Workforce health strip for a week. Exit: zero unacknowledged automation failures.
- Wave 1 — provisioned, off: all organizations receive the pack inactive; admins self-serve via the Workforce master toggle. Exit: health strip clean for 72 h across enabled orgs.
- Wave 2 — default on: enable by default for organizations that never toggled.
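
The sticky opt-out rule can be illustrated with a small sketch. It is not the pack's actual provisioning code: the `Trigger` record, the `admin_disabled` flag, and the `provision` function are hypothetical names chosen for illustration, and the trigger names are placeholders. It only shows the intended property that a re-run creates missing triggers and may activate untouched ones, but never re-activates a trigger an admin disabled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Trigger:
    name: str
    active: bool = False
    # Hypothetical flag: set when an org admin turns the trigger off.
    admin_disabled: bool = False


def provision(existing: dict[str, Trigger], pack: list[str], activate: bool) -> dict[str, Trigger]:
    """Idempotent provisioning sketch: org edits always win."""
    result = dict(existing)
    for name in pack:
        current = result.get(name)
        if current is None:
            # First provisioning for this org: create the trigger.
            result[name] = Trigger(name=name, active=activate)
        elif activate and not current.admin_disabled:
            # Re-run with activation: only touch triggers no admin disabled.
            result[name] = replace(current, active=True)
    return result


# Placeholder trigger names; "trigger_a" was disabled by an admin earlier.
org = {"trigger_a": Trigger("trigger_a", active=False, admin_disabled=True)}
after_rerun = provision(org, ["trigger_a", "trigger_b"], activate=True)
```

In this sketch, `after_rerun["trigger_a"]` stays inactive while `trigger_b` is created active, and running `provision` again on the result changes nothing.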
The kill switch
Time to quiet: under two minutes. Switching the master toggle off (Agents → Workforce, admin-only, audited) — or running the workflows/ops/disable_task_ops_pack operation — does two things: deactivates every pack trigger row (the scheduler stops scanning them) and gates the run path itself (task_automation policy), so even an already-queued event refuses to start agent work. In-flight runs finish; nothing new starts.
Verification: the Workforce page shows automation off; new task assignments produce no acknowledgment comment; [TaskOpsKillSwitch] appears in the backend logs.
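
The two effects of the kill switch can be pictured as two independent layers. The sketch below is a simplified model, not the product's implementation: the `TaskOpsPack` class, its method names, and the boolean standing in for the task_automation policy are all assumptions made for illustration.

```python
class TaskOpsPack:
    """Toy model of the pack: trigger rows plus a run-path gate."""

    def __init__(self, trigger_names):
        # Every pack trigger row starts active in this model.
        self.triggers = {name: True for name in trigger_names}
        # Stand-in for the task_automation policy (assumed to be a simple flag here).
        self.automation_allowed = True

    def kill(self):
        # Layer 1: deactivate every trigger row so the scheduler stops scanning them.
        for name in self.triggers:
            self.triggers[name] = False
        # Layer 2: gate the run path so already-queued events refuse to start.
        self.automation_allowed = False

    def triggers_to_scan(self):
        return [name for name, active in self.triggers.items() if active]

    def start_run(self, event):
        # The gate is checked when a run starts, not when the event was queued.
        return self.automation_allowed


pack = TaskOpsPack(["trigger_a", "trigger_b"])
pack.kill()
```

After `kill()`, `triggers_to_scan()` is empty and `start_run` returns `False` even for an event queued before the switch. That second layer is why nothing new starts while the scheduler catches up.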
Symptom → action
| Symptom | Likely cause | Action |
|---|---|---|
| Tasks assigned to agents do nothing | Automation off, or pack triggers inactive | Turn the Workforce toggle on; in Automations, check that the tasks/… triggers are active |
| Work stuck In progress > 24 h | Agent run died; stale sweep will roll it back | Check the task's run history; the hourly sweep returns it to To do |
| Agent refuses every run | Budget pause or circuit breaker | Workforce → needs-attention queues; raise budget or change task status as a human |
| Reviews pile up | Reviewers missing the inbox | Review reminders escalate at 4 h / 24 h automatically; check Inbox preferences |
| External runs always fail runtime_offline | No daemon online for the adapter | Settings → API → Runtimes: daemon status; tale daemon status on the machine |
| Digest reports capped numbers | Very high activity day hit a scan cap | Numbers are lower bounds; no action needed |
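
For the "stuck In progress" row, the behavior of the stale sweep can be sketched as follows. This is an illustration under assumptions, not the real sweep: the task dictionaries, the `sweep_stale` function, and the use of a 24-hour threshold (taken from the symptom column) are hypothetical, and the real sweep may also check whether the agent run actually died.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold, borrowed from the symptom row above.
STALE_AFTER = timedelta(hours=24)


def sweep_stale(tasks, now):
    """Return stale In-progress tasks to To do and report their ids."""
    rolled_back = []
    for task in tasks:
        is_stale = now - task["started_at"] > STALE_AFTER
        if task["status"] == "In progress" and is_stale:
            task["status"] = "To do"
            rolled_back.append(task["id"])
    return rolled_back


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
tasks = [
    {"id": 1, "status": "In progress", "started_at": now - timedelta(hours=30)},
    {"id": 2, "status": "In progress", "started_at": now - timedelta(hours=2)},
]
rolled = sweep_stale(tasks, now)
```

Here only task 1 is rolled back. Because the sweep is hourly, a stale task can wait up to an hour past the threshold before it returns to To do.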
Where to look
- Logs: stable bracket prefixes — [AgentTaskRun], [ExternalRuns], [WorkforceRollup], [TaskOpsKillSwitch], [RetentionCleanup] — with key=value fields. Prompt bodies are never logged.
- Per-task trace: the task sheet's run history links each run to its automation execution.
- Data lifecycle: run records honor the agentRuns retention category (default 180 days, org-tunable 30–400); aggregates are pruned at a fixed 400 days; review decisions survive subject erasure pseudonymized.
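
Because the log prefixes are stable and the fields are key=value pairs, lines can be filtered with a short script. The sketch below assumes a line layout of `[Prefix] key=value key=value` with whitespace-free values; the exact layout and the field names in the sample line are invented for illustration and may differ from real backend output.

```python
import re

# Prefixes listed in this runbook.
KNOWN_PREFIXES = {
    "AgentTaskRun",
    "ExternalRuns",
    "WorkforceRollup",
    "TaskOpsKillSwitch",
    "RetentionCleanup",
}

PREFIX_RE = re.compile(r"\[(?P<prefix>\w+)\]\s*(?P<rest>.*)")
FIELD_RE = re.compile(r"(\w+)=(\S+)")


def parse_line(line):
    """Return (prefix, fields) for a known prefix, else None."""
    match = PREFIX_RE.search(line)
    if match is None or match.group("prefix") not in KNOWN_PREFIXES:
        return None
    fields = dict(FIELD_RE.findall(match.group("rest")))
    return match.group("prefix"), fields


# Hypothetical sample line; the org and action fields are made up.
sample = "2024-01-02T12:00:00Z [TaskOpsKillSwitch] org=acme action=disable"
parsed = parse_line(sample)
```

Filtering on `parsed[0] == "TaskOpsKillSwitch"` is one way to confirm the kill-switch verification step from a log export.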