Standing Orders: How Our AI Agents Decide What to Do Next
Here’s a problem nobody warns you about when building an AI company: your agents can’t remember yesterday.
Every time an AI agent runs, it starts with a blank slate. It reads its instructions, it reads the current state of the world, and it acts. But it doesn’t remember what it did last time. It doesn’t know that it already investigated that failing deploy, or that it recommended creating a marketing agent three hours ago.
This is the temporal perception problem, and solving it is what makes the difference between AI agents that execute tasks and AI agents that run a company.
The Problem
Corvyd launched with a task-driven architecture. The human (or agent-000, the chief of staff) creates task files. Agents pick them up and execute them. This works well for defined work: “build this app,” “deploy this code,” “write this blog post.”
But a company needs more than task execution. Someone needs to:
- Notice that an agent has been idle for 8 hours and it’s costing money
- Realize the deploy failed overnight and create a retry task
- Spot that three tasks are stuck in review because the reviewer doesn’t exist
- Think about what the company should build next
In a human company, this is the CEO’s job. They walk around, notice things, make calls. They have a continuous mental model of what’s happening.
AI agents don’t have that. Each invocation is an island. And if you give every agent the ability to strategize and create tasks, you get chaos — five agents all creating conflicting work, duplicating effort, stepping on each other.
The Architecture: One Leader Agent
The solution, recorded in decision-2026-0217-003, is the standing orders system. The key design choice: one agent has generative standing orders. Everyone else executes.
agent-000 (chief of staff) runs standing orders twice daily. It has the authority to:
- Survey the entire company state (tasks, logs, costs, agent activity)
- Create new tasks in the queue
- Escalate problems to the human
- Record observations and plans
The other agents — builder, devops, content, product manager — only work on tasks assigned to them. They don’t strategize. They don’t create work for other agents. They execute what’s in front of them and report back.
This is a deliberate constraint. The founding document considered giving all agents strategic capability, but the failure modes were obvious:
If every agent can create tasks, you get a tragedy of the commons. agent-001 creates tasks for agent-003, agent-003 creates tasks for agent-001, and nobody’s doing the work the human actually wanted done.
One leader. Clear chain of command. Other agents can escalate (any agent can write to the human’s inbox), but only agent-000 creates work.
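To make the constraint concrete, here's a rough sketch of how a runner could enforce it. The agent table, paths, helper names, and role labels are ours for illustration, not Corvyd's actual code:

```python
# Sketch of the one-leader rule (hypothetical names, not Corvyd's runner).
# Only an agent flagged as "generative" may write new task files; every
# agent may escalate by appending to the human's inbox.

from pathlib import Path

AGENTS = {
    "agent-000": {"role": "chief-of-staff", "generative": True},
    "agent-001": {"role": "builder", "generative": False},   # role labels illustrative
    "agent-003": {"role": "devops", "generative": False},
}

TASK_DIR = Path("/company/tasks/queue")    # assumed layout
INBOX = Path("/company/human/inbox.md")    # assumed layout


def create_task(agent_id: str, task_name: str, body: str) -> Path:
    """Write a new task file, but only if the agent holds generative authority."""
    if not AGENTS.get(agent_id, {}).get("generative"):
        raise PermissionError(f"{agent_id} cannot create tasks; escalate instead")
    path = TASK_DIR / f"{task_name}.md"
    path.write_text(body)
    return path


def escalate(agent_id: str, message: str) -> None:
    """Any agent may flag a problem for the human by appending to the inbox."""
    with INBOX.open("a") as f:
        f.write(f"\n- [{agent_id}] {message}\n")
```

Escalation is open to everyone; task creation has exactly one gate.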
Journals: Memory Through Files
The standing orders system needed to solve that temporal perception problem. When agent-000 runs its morning health scan, it needs to know:
- What it found yesterday
- What tasks it already created
- What it recommended to the human
- What’s still pending from last time
The solution is the journal — a markdown file at /company/agents/logs/agent-000/journal.md. Every standing orders run reads the journal first, then appends its own entry at the end.
Here’s what an actual journal entry looks like:
## 2026-02-18 15:41 UTC (autonomous standing orders run)
**Assessed**: Full company survey — tasks, logs, costs, infrastructure.
**State**:
- Build complete (task-001 done). Converter app built, 84KB gzipped.
- Server provisioned (task-005 done). DNS + SSL configured (task-006 done).
- **Deploy FAILED** (task-007). Exit code 1. Error details sparse.
- agent-001 idle for 8+ cycles, burning ~$3.50 on empty runs.
- Tasks 001/002/003 in done/ but frontmatter said in-review — inconsistency.
**Actions taken**:
- Fixed task frontmatter inconsistencies
- Created task-2026-0218-001: retry deploy, assigned to agent-003
- Updated task-008 depends_on to point to new deploy task
- Wrote status report to human inbox with cost breakdown and decisions needed
**Issues flagged for human**:
1. Idle cycle cost: agent-001 running every 15 min with no work
2. agent-002-reviewer doesn't exist — skip code review for Phase 1?
This is agent-000’s institutional memory. The next time it runs, it reads this entry and knows: the deploy was retried, the human was notified about costs, the reviewer issue was flagged. It won’t duplicate those actions. It can build on them.
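The mechanics behind this are small. Here's a sketch of the read-then-append pattern in Python; the journal path matches the one above, but the helpers are illustrative, not our production runner:

```python
# Illustrative sketch of the journal pattern: read the whole history first,
# then append a new timestamped entry at the end. Helper names are ours.

from datetime import datetime, timezone
from pathlib import Path

JOURNAL = Path("/company/agents/logs/agent-000/journal.md")


def load_journal() -> str:
    """Return the full journal so the agent can see what it already did."""
    return JOURNAL.read_text() if JOURNAL.exists() else ""


def append_entry(body: str) -> None:
    """Append a new entry; never rewrite history."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    with JOURNAL.open("a") as f:
        f.write(f"\n## {stamp} (autonomous standing orders run)\n{body}\n")


# A standing orders run starts by feeding load_journal() into the agent's
# context, and ends by calling append_entry() with what it found and did.
```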
The journal is also a primary source for this blog. Most of what you read in our Day 1 post came from reading journal entries and the files they reference.
Cadence: Not Too Often, Not Too Rarely
Standing orders run on a schedule — currently twice daily, at 7am and 7pm Pacific. But the system has a cadence mechanism to prevent over-execution.
Each standing order has a timestamp file (.cadence-{order-name}) that records when it last ran. The runner checks this before invoking agent-000. If the cadence interval hasn’t elapsed, it skips the run.
Why not just rely on cron timing? Because cron fires are cheap, but agent invocations aren’t. The cron job runs every 6 hours, but the 24-hour cadence on the daily health scan means it only actually invokes agent-000 once per day. If the 7am run fails, the 1pm cron fire catches it. Built-in retry without wasted spend.
Cron: 0 */6 * * * ← fires 4x/day
Cadence: 24h ← only runs 1x/day
Result: reliable daily execution with automatic retry
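In code, the cadence gate is little more than a timestamp comparison. A sketch, assuming the .cadence-{order-name} convention described above and a placeholder run_agent call:

```python
# Sketch of the cadence gate: cron can fire as often as it likes, but the
# expensive agent invocation only happens when the interval has elapsed.
# The directory location and run_agent() are placeholders.

import time
from pathlib import Path

CADENCE_DIR = Path("/company/agents/cadence")   # assumed location


def due(order_name: str, interval_hours: float) -> bool:
    """True if this standing order hasn't run within its cadence window."""
    stamp = CADENCE_DIR / f".cadence-{order_name}"
    if not stamp.exists():
        return True
    elapsed = time.time() - stamp.stat().st_mtime
    return elapsed >= interval_hours * 3600


def mark_ran(order_name: str) -> None:
    """Record a successful run by touching the timestamp file."""
    CADENCE_DIR.mkdir(parents=True, exist_ok=True)
    (CADENCE_DIR / f".cadence-{order_name}").touch()


# Called by cron every 6 hours; only invokes the agent when the 24h cadence is due.
if due("daily-health-scan", interval_hours=24):
    # run_agent("agent-000", order="daily-health-scan")   # placeholder
    mark_ran("daily-health-scan")
```

Note that a failed run never touches the timestamp file, which is what makes the next cron fire an automatic retry.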
Budget Limits: Don’t Let The AI Think Too Long
Every standing orders invocation has hard limits:
- $2.00 maximum budget per run
- 40 maximum turns (API round-trips)
These exist because standing orders are open-ended. A task like “build a converter app” has a natural completion point — the app is built. But “survey the company and decide what to do” could spiral indefinitely. An agent could spend $50 reading every file in the company, analyzing trends, and creating an elaborate strategic plan.
The budget cap forces efficiency. Agent-000 has to prioritize. Read the most important things first. Make decisions quickly. If it hits the turn limit (which happened on its first run, back when the limit was 20 turns, before it finished its journal entry), the work is partial but the damage is bounded.
We started with $1.00 and 20 turns. It was too tight — agent-000 couldn’t finish a full health scan and write to the journal. We bumped it to $2.00 and 40 turns. That seems to be the sweet spot for now, but we’re watching the cost data.
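The enforcement itself doesn't need to be clever. Here's the shape of the loop, with made-up names for the API round-trip and its cost accounting; the real runner differs, but the idea is that both caps are checked before every turn:

```python
# Sketch of hard limits on an open-ended agent run. The step() call and its
# cost accounting are placeholders; the point is that a runaway run is
# bounded by design.

MAX_BUDGET_USD = 2.00
MAX_TURNS = 40


def run_standing_orders(agent, prompt: str) -> str:
    spent, turns = 0.0, 0
    state = prompt
    while turns < MAX_TURNS and spent < MAX_BUDGET_USD:
        result = agent.step(state)          # hypothetical: one API round-trip
        spent += result.cost_usd
        turns += 1
        if result.finished:
            return result.output
        state = result.next_state
    # Hitting a cap isn't an error; partial work is acceptable by design.
    return f"stopped after {turns} turns / ${spent:.2f}; work is partial"
```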
The First Standing Orders Run
The system’s first autonomous run happened at 3:41am Pacific on February 18th. Agent-000 woke up, read its journal (one entry — the seed from setup), and surveyed the company.
It found:
- The build was done but stuck in review
- The deploy had failed
- agent-001 was burning money on idle cycles
- Task frontmatter was inconsistent with directory locations
It acted:
- Fixed the frontmatter inconsistencies
- Created a deploy retry task
- Wrote a detailed status report to the human
Then it hit the turn limit and stopped. The journal entry was incomplete. The strategic topics (communications solution, corvyd.ai website planning) were deferred to the next run.
This is fine. The system is designed for partial completion. What agent-000 did accomplish — finding the deploy failure and creating the retry task — was the highest-priority work. The budget constraint forced good prioritization.
What Standing Orders Don’t Do
Standing orders are not an autonomous CEO. They’re closer to a daily checklist with judgment. Here’s what they explicitly don’t do:
- Make strategic decisions. Strategic direction comes from the human. Standing orders survey state and flag decisions needed.
- Spend beyond limits. Hard budget caps prevent runaway costs.
- Override human decisions. If the human said “don’t build X,” it’s in a decision record that agent-000 reads.
- Create unbounded work. The one-leader-agent model means only agent-000 creates tasks, and it’s constrained by budget.
The human still sets direction in board meetings. Standing orders are how the company maintains awareness between those meetings.
Why This Matters Beyond Corvyd
The standing orders pattern is general. Any system with multiple AI agents needs to answer three questions:
- Who decides what to work on? (One leader, not consensus)
- How do agents remember across invocations? (Journals — files as memory)
- How do you prevent runaway behavior? (Budget caps, turn limits, cadence)
These are the same questions human organizations answer with org charts, meetings, and budgets. The AIOS answers them with files, directories, and hard limits.
The interesting thing is how simple the solution is. The journal is a markdown file. The cadence check is a timestamp comparison. The budget limit is a number in a config file. No orchestration framework. No agent coordination protocol. No consensus algorithm.
Just files.
What’s Next
The standing orders system is two days old. We’re watching it closely:
- Is the $2.00 / 40-turn budget right, or does it need adjustment?
- Should health scans run more than once daily?
- What new standing orders should be added? (Weekly cost analysis? Product idea generation?)
- How does the journal scale — does it need summarization after a month of entries?
We’ll report back on what we learn. Every standing orders run produces a journal entry, and every journal entry is potential blog material. The system generates its own documentation.
That’s the AIOS way: the operations are the content. We’re not writing about what we built — we’re writing about what happened while we were building, and the building itself is the story.