OpenHarness study website
Inspired by ccunpacked.dev
Analysis date: 2026-04-05

Open agent harness, mapped as a reading experience

OpenHarness, unpacked for people building agent systems.

A long-form study page that reframes the oh runtime as a sequence of decisions: what the model decides, what the harness executes, and where safety, memory, and coordination sit in between.

43+

Tools

File I/O, shell, search, web, MCP, task control

54

Commands

Slash-command surface for planning, resume, auth, plugins

114

Passing Tests

Unit, integration, E2E, real skills, and UI coverage

3

API Formats

Anthropic, OpenAI-compatible, and GitHub Copilot

01

Study section

The Harness Thesis

OpenHarness presents the harness not as UI sugar around a model, but as the operational layer that gives an LLM tools, observation, memory, and enforceable boundaries.

Core framing

OpenHarness is positioned as lightweight agent infrastructure for researchers, builders, and the wider open-agent community.

Understand

Inspect how a production-flavored agent runtime is stitched together from simple parts.

Experiment

Swap providers, extend tools, and prototype coordination patterns without hiding the runtime.

Extend

Add custom plugins, custom skills, and custom tools without abandoning familiar conventions.

Operate

Run it interactively in the TUI or script it through text, JSON, and stream-json outputs.

Harness equation

The repository explains the harness as everything wrapped around the model so an agent can act safely in the world.

Intelligence lives in the model. Execution discipline lives in the runtime.

tools + knowledge + observation + action + permissions

Open source posture

MIT licensed
Anthropic-style skills compatibility
Claude-style plugins compatibility
React/Ink terminal UI

This is what makes the project a good study subject: it is opinionated enough to feel real, yet transparent enough to read as architecture rather than product marketing.

02

Study section

The Agent Loop

The heart of the repo is not a one-shot completion call. It is a repeated cycle of streaming, deciding, gating, executing, returning evidence, and asking again.

Step 01

Prompt enters the harness

loop study

The user arrives through the CLI or the React/Ink terminal UI, where runtime settings and session context are already in play.

Step 02

Context gets assembled

loop study

Prompts, CLAUDE-style docs, loaded skills, config layers, and memory are stitched into one model-facing bundle.
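The assembly step described above can be sketched as a simple merge of context layers into one model-facing message list. The layer names come from the prose; the merge order and function shape are assumptions, not the repo's actual code:

```python
# Illustrative context assembly: system prompt, CLAUDE-style docs, loaded
# skills, and memory are folded into a single system message, followed by
# the user's prompt. Empty layers are skipped.
def assemble_context(system_prompt, claude_docs, skills, memory, user_prompt):
    system = "\n\n".join(
        part for part in [system_prompt, claude_docs, *skills, memory] if part
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]
```

The point of the sketch is the shape, not the details: every layer ends up in one bundle before the first token streams back.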

Step 03

Streaming response begins

loop study

The provider client streams tokens and tool intents instead of blocking on one final response.

Step 04

Permissions and hooks fire

loop study

Every tool call passes through permission mode, path rules, command deny lists, and lifecycle hooks before execution.
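The gate in this step can be pictured as one function that combines mode, path rules, and a command deny list. This is a minimal sketch with illustrative names and patterns, not OpenHarness's actual permission API:

```python
# Hypothetical permission gate: every tool call is checked against a deny
# list (shell), path rules (file tools), and the current permission mode.
from dataclasses import dataclass, field
from fnmatch import fnmatch

@dataclass
class PermissionGate:
    mode: str = "ask"  # e.g. "ask", "auto", "plan" (illustrative modes)
    allowed_paths: list = field(default_factory=lambda: ["./**"])
    denied_commands: list = field(default_factory=lambda: ["rm -rf *"])

    def check(self, tool_name, args):
        # Command deny list for shell access.
        if tool_name == "Bash":
            cmd = args.get("command", "")
            if any(fnmatch(cmd, pat) for pat in self.denied_commands):
                return "deny"
        # Path rules for file tools.
        if tool_name in ("Read", "Write", "Edit"):
            path = args.get("path", "")
            if not any(fnmatch(path, pat) for pat in self.allowed_paths):
                return "deny"
        # Otherwise fall back to the mode: auto-approve or ask the user.
        return "allow" if self.mode == "auto" else "ask"
```

The three possible verdicts (deny, ask, allow) are what make the gate a decision point rather than a binary filter: the TUI's permission dialogs are the "ask" branch made visible.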

Step 05

The world gets touched

loop study

Files, shell, web, MCP servers, tasks, and subagents all sit behind the same harness boundary.

Step 06

Results return to the model

loop study

Tool outputs are appended, the model re-reasons over new evidence, and the loop continues until it stops asking for tools.

Reference pseudocode

while True:
    response = await api.stream(messages, tools)

    if response.stop_reason != "tool_use":
        break

    tool_results = []
    for tool_call in response.tool_uses:
        result = await harness.execute_tool(tool_call)
        tool_results.append(result)

    messages.append(tool_results)
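The reference pseudocode can be fleshed out into a runnable toy version with a stubbed provider, so the control flow is visible end to end. Everything here is illustrative: the stub, the `echo` tool, and the message shapes are assumptions, not the repo's actual API:

```python
# Toy agent loop: a stubbed "model" asks for one tool call, then stops.
def stub_stream(messages, tools):
    # First turn: request a tool; once a tool result exists, end the turn.
    if not any(m.get("role") == "tool" for m in messages):
        return {"stop_reason": "tool_use",
                "tool_uses": [{"name": "echo", "input": {"text": "hi"}}]}
    return {"stop_reason": "end_turn", "tool_uses": []}

def execute_tool(call):
    # Stand-in for the harness boundary: validate, run, return evidence.
    if call["name"] == "echo":
        return {"role": "tool", "content": call["input"]["text"]}
    raise ValueError(f"unknown tool {call['name']!r}")

def run_loop(prompt):
    messages = [{"role": "user", "content": prompt}]
    while True:
        response = stub_stream(messages, tools=["echo"])
        if response["stop_reason"] != "tool_use":
            return messages
        for tool_call in response["tool_uses"]:
            messages.append(execute_tool(tool_call))
```

Even in toy form, the division of labor holds: `stub_stream` decides *what* to do, and `execute_tool` plus the loop decide *how* it is run and folded back into `messages`.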

Reading takeaway

The model chooses what to do next.

The harness decides how that action is validated, observed, executed, and folded back into context.

That separation is the main architectural lesson running through the entire repo.

Execution map

A readable model of the runtime path.

This is the section that most directly mimics the explanation-first spirit of ccunpacked.dev: a visual map that turns architecture into a walkable story.

User prompt

A request starts in the CLI or the terminal UI with runtime config already attached.

Prompt assembly

System prompt, CLAUDE-style docs, skills, memory, and settings are merged.

Provider stream

The backend streams text and tool intents instead of waiting for one final blob.

Permission gate

Modes, path rules, and denied commands decide what is allowed to execute.

Tool runtime

Files, shell, web, MCP, tasks, and subagents are invoked behind one contract.

Observed result

Hooks fire, results return to the model, and the loop continues or stops.

03

Study section

Architecture Explorer

The README describes the project as a harness architecture spread across runtime subsystems. Read them less like folders and more like responsibility boundaries.

engine/

Loop runtime

Streams responses, detects tool_use, and keeps the execution cycle composable.

tools/

Action surface

43+ tools cover file work, shell access, search, web fetch, MCP, and task orchestration.

skills/

On-demand knowledge

Markdown skills load only when needed so the harness can stay lightweight until specialization is required.
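The load-on-demand idea can be sketched as a store that indexes skill files cheaply and reads a body only when it is first requested. The class and file layout here are illustrative assumptions:

```python
# Hypothetical lazy skill store: names are listed up front, but a skill's
# markdown body is only read (and then cached) on first use.
from pathlib import Path

class SkillStore:
    def __init__(self, root):
        self.root = Path(root)
        self._cache = {}

    def names(self):
        # Cheap index: just file stems, no file contents loaded yet.
        return sorted(p.stem for p in self.root.glob("*.md"))

    def load(self, name):
        if name not in self._cache:
            self._cache[name] = (self.root / f"{name}.md").read_text()
        return self._cache[name]
```

This is the "lightweight until specialization is required" posture in miniature: the model can see what skills exist without paying the context cost of all of them.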

plugins/

Extension layer

Commands, hooks, agents, and MCP servers keep Claude-style extensions portable.

permissions/

Safety fence

Modes, path rules, and denied commands form the guardrails between intent and execution.

hooks/

Lifecycle intercepts

PreToolUse and PostToolUse events let the runtime observe or shape behavior around execution.
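A minimal registry shows how such lifecycle intercepts can both observe and shape execution. The event names come from the prose; the registry API and veto convention are assumptions:

```python
# Illustrative hook registry: handlers subscribe to lifecycle events such
# as "PreToolUse"; a handler that returns False vetoes the action.
from collections import defaultdict

class Hooks:
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        self._handlers[event].append(handler)

    def fire(self, event, payload):
        # All handlers run; any explicit False blocks the tool call.
        return all(h(payload) is not False for h in self._handlers[event])

hooks = Hooks()
hooks.on("PreToolUse",
         lambda p: p["tool"] != "Bash"
         or "sudo" not in p["args"].get("command", ""))
```

The veto convention is what turns hooks from pure telemetry into a second enforcement layer alongside the permission gate.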

memory/

Cross-session recall

Persistent memory and session resume preserve working context across interactions.

tasks/

Background work

Task lifecycle primitives keep long-running or delegated work visible and queryable.
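"Visible and queryable" can be read as a small registry of task records with explicit states. The states and method names below are illustrative, not the repo's actual task API:

```python
# Hypothetical task registry mirroring TaskCreate / TaskList / TaskStop:
# every piece of background work is a record that can be listed or stopped.
import itertools

class TaskRegistry:
    def __init__(self):
        self._ids = itertools.count(1)
        self.tasks = {}

    def create(self, description):
        task_id = next(self._ids)
        self.tasks[task_id] = {"id": task_id,
                               "description": description,
                               "status": "running"}
        return task_id

    def stop(self, task_id):
        self.tasks[task_id]["status"] = "stopped"

    def list(self):
        return list(self.tasks.values())
```
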

coordinator/

Multi-agent control

Subagent spawning and team coordination make the harness more than a single loop.

ui/

Terminal experience

The React/Ink TUI gives the system a conversational shell with permission dialogs and command picking.

Tool categories

The action surface is intentionally broad.

OpenHarness is useful to study because it does not stop at text generation. It wires many execution families through one permissioned interface.

File I/O

Bash, Read, Write, Edit, Glob, Grep

Search

WebFetch, WebSearch, ToolSearch, LSP

Agents and Tasks

Agent, SendMessage, TeamCreate, TaskCreate, TaskList, TaskStop

MCP and Modes

MCPTool, ListMcpResources, ReadMcpResource, EnterPlanMode, Worktree

04

Study section

Study Tracks

A good study site should not just show features. It should suggest how to read the project depending on why you opened it.

For researchers

Study how a practical agent harness separates model intelligence from operational scaffolding.

Trace where the model decides what to do.
Trace where the harness decides how it is allowed to do it.
Compare loop control against other agent runtimes.

For builders

Use OpenHarness as a blueprint for tools, skills, plugins, and permissioned execution.

Lift the tool contract pattern and JSON schema design.
Borrow the CLAUDE-style skill and plugin portability story.
Prototype task and subagent workflows without building everything from zero.
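The tool-contract pattern mentioned above pairs a JSON-schema description the model sees with an execute function the harness calls. This sketch uses the `Read` tool as the example; the exact field names are assumptions modeled on common tool-use schemas, not OpenHarness's verbatim contract:

```python
# Illustrative tool contract: the schema is advertised to the model, and
# the harness validates and dispatches tool_use blocks against it.
READ_TOOL = {
    "name": "Read",
    "description": "Read a file from the workspace and return its text.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "File path to read"},
        },
        "required": ["path"],
    },
}

def execute_read(args):
    # The execute side of the contract: take validated args, return evidence.
    with open(args["path"], "r", encoding="utf-8") as f:
        return {"content": f.read()}
```

Keeping the declarative schema and the imperative executor side by side is the part worth lifting: one object is for the model, the other for the runtime, and the permission gate sits between them.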

For operators

Treat it as an inspectable local runtime that can be scripted, resumed, and tested.

Run headless with text, JSON, or stream-json outputs.
Swap providers through environment variables or CLI flags.
Lean on tests and permission modes before widening trust.
Lean on tests and permission modes before widening trust.

05

Study section

Quick Start and Provider Surface

The repo is intentionally easy to boot: one install command, one demo prompt, then provider switching through either flags or environment variables.

Install

curl -fsSL https://raw.githubusercontent.com/HKUDS/OpenHarness/main/scripts/install.sh | bash

One-command demo

ANTHROPIC_API_KEY=your_key uv run oh -p "Inspect this repository and list the top 3 refactors"

Setup sequence

01

Detect the OS and verify Python plus Node availability.

02

Install the Python package and optionally the React TUI dependencies.

03

Create the local OpenHarness config directory.

04

Launch with a provider profile or environment variables.

Provider compatibility

Anthropic format

Default Claude-oriented path. Also supports Anthropic-compatible gateways such as Moonshot, Vertex-style, and Bedrock-style endpoints.

OpenAI-compatible

Works with OpenAI, DashScope, DeepSeek, GitHub Models, SiliconFlow, Groq, Ollama, and more through /v1/chat/completions.

GitHub Copilot

Uses GitHub OAuth device flow, stores session auth, and avoids direct API key setup for the backend.
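The device flow here follows the standard OAuth shape (RFC 8628): obtain a device code, show the user a verification URL, then poll the token endpoint until they approve. The sketch below shows only the polling half, with the HTTP call injected so nothing touches the network; the function name and retry policy are assumptions:

```python
# Polling half of an OAuth device flow against GitHub's token endpoint.
# `post(url, data)` is injected by the caller (e.g. a thin requests wrapper).
import time

TOKEN_URL = "https://github.com/login/oauth/access_token"
GRANT_TYPE = "urn:ietf:params:oauth:grant-type:device_code"

def poll_for_token(post, client_id, device_code, interval=5, max_tries=60):
    payload = {"client_id": client_id,
               "device_code": device_code,
               "grant_type": GRANT_TYPE}
    for _ in range(max_tries):
        resp = post(TOKEN_URL, payload)
        if "access_token" in resp:
            return resp["access_token"]
        # "authorization_pending" means the user has not approved yet.
        if resp.get("error") != "authorization_pending":
            raise RuntimeError(resp.get("error", "device flow failed"))
        time.sleep(interval)
    raise TimeoutError("device flow timed out")
```

Once the token comes back, storing it as session auth is what lets the Copilot path skip API-key setup entirely.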

06

Study section

Evidence and Takeaways

The strongest signal in the repository is not just breadth. It is the combination of breadth, test scaffolding, extension hooks, and a readable mental model.

6

CLI flags E2E

Real-model coverage for command-line behavior and options.

9

Harness E2E

Retry, skills, permissions, and parallel execution are explicitly exercised.

7

TUI suites

Welcome flow, conversation flow, status, commands, shortcuts, and permission interactions.

12

Real extensions

Skills and plugin compatibility are tested against actual external packages.

Three closing reads

OpenHarness is most interesting when read as a decomposed agent runtime, not just as another CLI tool.

Its biggest lesson is that serious agent behavior emerges from many small operational decisions: schemas, permissions, prompts, hooks, lifecycle management, and user-facing affordances.

That makes it a strong study target for anyone comparing open harness designs or planning a runtime of their own.