CE

ClawExplorer

OpenClaw skill

codex-orchestration

An OpenClaw skill that enables agents to orchestrate multiple Codex agents for solving complex, multi-step problems. It provides tools for spawning Codex sub-agents, delegating subtasks, facilitating inter-agent communication, and aggregating results.

Files

Review the files below to add this skill to your agents.

Security notice: review the SKILL.md file and repository content first before using any third-party skill.

SKILL.md content

---
name: codex-orchestration
description: General-purpose orchestration for Codex. Uses update_plan plus background PTY terminals to run parallel codex exec workers.
---

# Codex orchestration

You are the orchestrator: decide the work, delegate clearly, deliver a clean result.
Workers do the legwork; you own judgement.

This guide is steering, not bureaucracy. Use common sense. If something is simple, just do it.

## Default assumptions
- YOLO config (no approvals); web search enabled.
- PTY execution available via `exec_command` and `write_stdin`.
- Codex already knows its tools; this guide is about coordination and decomposition.

## Two modes

### Orchestrator mode (default)
- Split work into sensible tracks.
- Use parallel workers when it helps.
- Keep the main thread for synthesis, decisions, and final output.

### Worker mode (only when explicitly invoked)
A worker prompt begins with `CONTEXT: WORKER`.
- Do only the assigned task.
- Do not spawn other workers.
- Report back crisply with evidence.

## Planning with `update_plan`
Use `update_plan` when any of these apply:
- More than 2 steps.
- Parallel work would help.
- The situation is unclear, messy, or high stakes.

Keep it light:
- 3 to 6 steps max.
- Short steps, one sentence each.
- Exactly one step `in_progress`.
- Update the plan when you complete a step or change direction.
- Skip the plan entirely for trivial tasks.

## Parallelism: "sub-agents" as background `codex exec` sessions
A sub-agent is a background terminal running `codex exec` with a focused worker prompt.

Use parallel workers for:
- Scouting and mapping (where things are, current state)
- Independent reviews (different lenses on the same artefact)
- Web research (sources, definitions, comparisons)
- Long-running checks (tests, builds, analyses, data pipelines)
- Drafting alternatives (outlines, rewrites, options)

Avoid parallel workers that edit the same artefact. Default rule: many readers, one writer.

## Background PTY terminals (exec_command + write_stdin)
Use PTY sessions to run work without blocking the main thread.

- `exec_command` runs a command in a PTY and returns output, or a `session_id` if it keeps running.
- If you get a `session_id`, use `write_stdin` to poll output or interact with the same process.

Practical habits:
- Start long tasks with small `yield_time_ms` so you do not stall.
- Keep `max_output_tokens` modest, then poll again.
- Label each session mentally (or in your notes) like: W1 Scout, W2 Review, W3 Research.
- Default to non-blocking: start the worker, capture its `session_id`, and move on.
- If you end your turn before it finishes, say so explicitly and offer to resume polling later.
- If the session exits or is lost, fall back to re-run or use a persistent runner (tmux/nohup).
- If writing output to a file, check for the file before re-polling the session.

Blocking vs non-blocking (recommend non-blocking even if you plan to poll):
- Default to non-blocking; poll once or twice if you need quick feedback.
- Blocking is fine only for short, predictable tasks (<30–60s).

Stopping jobs:
- Prefer graceful shutdown when possible.
- If needed, send Ctrl+C via `write_stdin`.

## Capturing worker output (keep context small)
Prefer capturing only the final worker message to avoid bloating the main context.

Recommended (simple):
- Use `--output-last-message` to write the final response to a file, then read it.
- Example: `codex exec --skip-git-repo-check --output-last-message /tmp/w1.txt "CONTEXT: WORKER ..."`
- If you are outside a git repo, add `--skip-git-repo-check`.

Alternative (structured):
- Use `--json` and filter for the final agent message.
- Example: `codex exec --json "CONTEXT: WORKER ..." | jq -r 'select(.type=="item.completed" and .item.type=="agent_message") | .item.text'`

## Orchestration patterns (general-purpose)

Pick a pattern, then run it. Do not over-engineer.

### Pattern A: Triangulated review (fan-out, read-only)
Use when: you want multiple perspectives on the same thing.

Run 2 to 4 reviewers with different lenses, then merge.

Example lenses (choose what fits):
- Clarity/structure
- Correctness/completeness
- Risks/failure modes
- Consistency/style
- Evidence quality
- Practicality
- Accessibility/audience fit
- If relevant: security, performance, backward compatibility

Deliverable: a single ranked list with duplicates removed and clear recommendations.

### Pattern B: Review -> fix (serial chain)
Use when: you want a clean funnel.
1) Reviewer produces an issue list ranked by impact.
2) Implementer addresses the top items.
3) Verifier checks the result.

This works for code, documents, and analyses.

### Pattern C: Scout -> act -> verify (classic)
Use when: lack of context is the biggest risk.
1) Scout gathers the minimum context.
2) Orchestrator condenses it and chooses the approach.
3) Implementer executes.
4) Verifier sanity-checks.

### Pattern D: Split by sections (fan-out, then merge)
Use when: work divides cleanly (sections, modules, datasets, figures).
Each worker owns a distinct slice; merge for consistency.

### Pattern E: Research -> synthesis -> next actions
Use when: the task is primarily web search and judgement.
Workers collect sources in parallel; orchestrator synthesises a decision-ready brief.

### Pattern F: Options sprint (generate 2 to 3 good alternatives)
Use when: you are choosing direction (outline, methods plan, analysis, UI).
Workers propose options; orchestrator selects and refines one.

## Context: supply what workers cannot infer
Most failures come from missing context, not missing formatting instructions.

Use a Context Pack when:
- the work touches an existing project with history,
- the goal is subtle,
- constraints are non-obvious,
- or preferences matter.

Skip it when:
- the task is a simple web lookup,
- a small isolated edit,
- or a straightforward one-off.

### Context Pack (use as much or as little as needed)
- Goal: what "good" looks like.
- Non-goals: what not to do.
- Constraints: style, scope boundaries, must keep, must not change.
- Pointers: key files, folders, documents, notes, links.
- Prior decisions: why things are the way they are.
- Success check: how we know it is done (tests, criteria, checklist).

Academic writing note:
- For manuscripts or scholarly text, use APA 7 where appropriate.

## Worker prompt templates (neutral)

Prepend the Worker preamble to every worker prompt.

### Worker preamble (use for all workers)
```text
CONTEXT: WORKER
ROLE: You are a sub-agent run by the ORCHESTRATOR. Do only the assigned task.
RULES: No extra scope, no other workers.
Your final output will be provided back to the ORCHESTRATOR.
```

Minimal worker command (example):
```text
codex exec --skip-git-repo-check --output-last-message /tmp/w1.txt "CONTEXT: WORKER
ROLE: You are a sub-agent run by the ORCHESTRATOR. Do only the assigned task.
RULES: No extra scope, no other workers.
Your final output will be provided back to the ORCHESTRATOR.
TASK: <what to do>
SCOPE: read-only"
```

### Reviewer worker
CONTEXT: WORKER  
TASK: Review <artefact> and produce improvements.  
SCOPE: read-only  
LENS: <pick one or two lenses>  
DO:
- Inspect the artefact and note issues and opportunities.
- Prioritise what matters most.
OUTPUT:
- Top findings (ranked, brief)
- Evidence (where you saw it)
- Recommended fixes (concise, actionable)
- Optional: quick rewrite or outline snippet  
DO NOT:
- Expand scope
- Make edits

### Research worker (web search)
CONTEXT: WORKER  
TASK: Find and summarise reliable information on <topic>.  
SCOPE: read-only  
DO:
- Use web search.
- Prefer primary sources, official docs, and high-quality references.
OUTPUT:
- 5 to 10 bullet synthesis
- Key sources (with short notes on why they matter)
- Uncertainty or disagreements between sources  
DO NOT:
- Speculate beyond evidence

### Implementer worker (build, write, analyse, edit)
CONTEXT: WORKER  
TASK: Produce <deliverable>.  
SCOPE: may edit <specific files/sections> or "write new artefact"  
DO:
- Follow the Context Pack if provided.
- Make changes proportionate to the request.
OUTPUT:
- What you changed or produced
- Where it lives (paths, filenames)
- How to reproduce (commands, steps) if relevant
- Risks or follow-ups (brief)  
DO NOT:
- Drift into unrelated improvements

### Verifier worker
CONTEXT: WORKER  
TASK: Verify the deliverable meets the Goal and Success check.  
SCOPE: read-only (unless explicitly allowed)  
DO:
- Run checks (tests, builds, analyses, reference checks) if relevant.
- Look for obvious omissions and regressions.
OUTPUT:
- Pass/fail summary
- Issues with repro steps or concrete examples
- Suggested fixes (brief)

## Orchestrator habits (fast, not fussy)
- Skim the artefact yourself before delegating.
- Ask a quick clarification if a term or goal is ambiguous.
- Use parallel workers when it reduces time or uncertainty.
- Keep instructions short and context-rich; do not paste the whole skill into worker prompts.
- If a worker misunderstood, do not argue. Re-run with better context.
- Merge outputs into one clear result, one recommended next step, and only the necessary detail.

Boss rule:
You do not forward raw worker output unless it is already clean. You curate it.

How this skill works

  • Receives a complex task from the OpenClaw agent
  • Decomposes the task into subtasks using OpenAI Codex
  • Generates Python code for each subtask via Codex
  • Assembles subtasks and code into an orchestration plan
  • Returns the plan to the agent for execution

When to use it

  • When an agent needs to orchestrate complex multi-step tasks using Codex planning and execution
  • When breaking down high-level goals into executable sub-tasks managed by Codex
  • When coordinating dynamic workflows involving code generation, validation, and iteration with Codex

Best practices

  • Set OPENAI_API_KEY environment variable
  • Provide language, context, and examples in prompts for optimal results
  • Use low temperature (default 0.2) for consistent code generation
  • Limit MAX_TOKENS to control costs and output length
  • Monitor OpenAI API usage and costs via dashboard
  • Consider GPT-4 alternatives as Codex models are deprecated

Example use cases

  • Code completion with Codex: Completes partial code snippets using the Codex model based on a provided prompt.
  • Code editing and fixing with Codex: Edits existing code according to specific instructions, such as fixing bugs or refactoring.
  • Generating Python web scrapers: Orchestrates Codex to produce complete Python scripts for web scraping tasks.
  • Optimizing SQL queries: Uses Codex to analyze and optimize SQL queries for better performance.
  • Generating unit tests: Leverages Codex to create unit tests for given codebases.

FAQs

What is the name of the skill?

codex-orchestration

Who is the author of the skill?

shanelindsay

What is the version of the skill?

1.0.0

What is the description of the codex-orchestration skill?

This skill allows the agent to orchestrate complex coding tasks by breaking them into steps and using Codex for each step.

What are the capabilities of the codex-orchestration skill?

Plan multi-step coding tasks, Execute individual Codex generations, Iterate on code based on feedback, Synthesize final outputs

What tools does the codex-orchestration skill provide?

plan_task, generate_code, review_code, synthesize

What does the plan_task tool do?

Plan the steps for a coding task.

What is required for the plan_task tool?

task: string

What does the generate_code tool do?

Generate code for a specific step.

What is required for the generate_code tool?

prompt: string

What does the review_code tool do?

Review generated code and suggest improvements.

What is required for the review_code tool?

code: string

What does the synthesize tool do?

Combine multiple code pieces into final product.

What is required for the synthesize tool?

pieces: array of strings

More similar skills to explore

  • achurch

    An OpenClaw skill for church administration that handles member management, event scheduling, sermon retrieval, and donation processing. It provides tools to list members, add new members, schedule events, fetch sermons, and record donations.

  • agent-config

    An OpenClaw skill that enables agents to manage their configuration by loading from files, environment variables, or remote sources. It supports retrieving, setting, and validating configuration values. The skill allows for hot-reloading of configurations.

  • agent-council

    An OpenClaw skill named agent-council that enables the primary agent to summon a council of specialized sub-agents for deliberating on tasks. The council members discuss the query from unique perspectives, propose solutions, and vote to select the best response. The skill outputs the winning proposal with supporting rationale from the council.

  • agent-identity-kit

    An OpenClaw skill that equips agents with tools to craft, manage, and evolve digital identities, including generating personas, bios, avatars, and communication styles. It supports creating detailed agent personas with name, background, goals, personality traits; crafting bios for specific platforms; designing avatars; tuning voice and style; and adapting identities to new contexts.

  • agenticflow-skill

    An OpenClaw skill that provides tools for interacting with Agentic Flow. The tools enable agents to create agentic flows with defined tasks, execute existing flows, and retrieve flow status and outputs.

  • agentlens

    AgentLens is an OpenClaw skill that enables agents to inspect the internal cognition and actions of other agents. It provides visibility into reasoning traces (thoughts), tool calls and arguments, retrieved memories, and response generation. The skill supports analysis in multi-agent conversations via the "inspect" action targeting a specific agent.