Open agent harness, mapped as a reading experience
A long-form study page that reframes the oh runtime as a sequence of decisions: what the model decides, what the harness executes, and where safety, memory, and coordination sit in between.
43+
Tools
File I/O, shell, search, web, MCP, task control
54
Commands
Slash-command surface for planning, resume, auth, plugins
114
Passing Tests
Unit, integration, E2E, real skills, and UI coverage
3
API Formats
Anthropic, OpenAI-compatible, and GitHub Copilot
Study section
OpenHarness presents the harness not as UI sugar around a model, but as the operational layer that gives an LLM tools, observation, memory, and enforceable boundaries.
Core framing
OpenHarness is positioned as lightweight agent infrastructure for researchers, builders, and the wider open-agent community.
Understand
Inspect how a production-flavored agent runtime is stitched together from simple parts.
Experiment
Swap providers, extend tools, and prototype coordination patterns without hiding the runtime.
Extend
Add custom plugins, custom skills, and custom tools without abandoning familiar conventions.
Operate
Run it interactively in the TUI or script it through text, JSON, and stream-json outputs.
Harness equation
The repository explains the harness as everything wrapped around the model so an agent can act safely in the world.
Intelligence lives in the model. Execution discipline lives in the runtime.
Open source posture
This is what makes the project a good study subject: it is opinionated enough to feel real, yet transparent enough to read as architecture rather than product marketing.
Study section
The heart of the repo is not a one-shot completion call. It is a repeated cycle of streaming, deciding, gating, executing, returning evidence, and asking again.
Step 01
The user arrives through the CLI or the React/Ink terminal UI, where runtime settings and session context are already in play.
Step 02
Prompts, CLAUDE-style docs, loaded skills, config layers, and memory are stitched into one model-facing bundle.
Step 03
The provider client streams tokens and tool intents instead of blocking on one final response.
Step 04
Every tool call passes through permission mode, path rules, command deny lists, and lifecycle hooks before execution.
Step 05
Files, shell, web, MCP servers, tasks, and subagents all sit behind the same harness boundary.
Step 06
Tool outputs are appended, the model re-reasons over new evidence, and the loop continues until it stops asking for tools.
Reference pseudocode
while True:
    response = await api.stream(messages, tools)
    if response.stop_reason != "tool_use":
        break
    for tool_call in response.tool_uses:
        result = await harness.execute_tool(tool_call)
        messages.append(result)
Reading takeaway
The model chooses what to do next.
The harness decides how that action is validated, observed, executed, and folded back into context.
That separation is the main architectural lesson running through the entire repo.
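That separation can be made concrete in a few dozen lines. The sketch below is illustrative only: `StubModel`, `Harness`, `ToolCall`, and the `read` tool are invented names, not the OpenHarness API. The stub model decides what to do next; the harness decides whether and how it runs, and folds the evidence back into the transcript.

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    name: str
    args: dict

@dataclass
class StubModel:
    """Stands in for the provider stream: decides *what* to do next."""
    plan: list  # queued decisions: ToolCall objects, then a final string

    def step(self, messages):
        return self.plan.pop(0)

class Harness:
    """Decides *how* an action is validated, executed, and recorded."""
    def __init__(self, tools, denied=()):
        self.tools, self.denied = tools, set(denied)

    def execute(self, call: ToolCall):
        if call.name in self.denied:          # permission gate before execution
            return f"[denied: {call.name}]"
        return self.tools[call.name](**call.args)

def run(model, harness):
    messages = []
    while True:
        action = model.step(messages)
        if not isinstance(action, ToolCall):  # model stopped asking for tools
            return action, messages
        result = harness.execute(action)
        messages.append({"tool": action.name, "result": result})  # evidence folds back

model = StubModel(plan=[ToolCall("read", {"path": "README"}), "done"])
harness = Harness(tools={"read": lambda path: f"contents of {path}"})
final, transcript = run(model, harness)
```

Swapping the stub for a real streaming client changes nothing about the loop's shape, which is the point of the separation.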
Execution map
This is the section that most directly mimics the explanation-first spirit of ccunpacked.dev: a visual map that turns architecture into a walkable story.
User prompt
A request starts in the CLI or the terminal UI with runtime config already attached.
Prompt assembly
System prompt, CLAUDE-style docs, skills, memory, and settings are merged.
Provider stream
The backend streams text and tool intents instead of waiting for one final blob.
Permission gate
Modes, path rules, and denied commands decide what is allowed to execute.
Tool runtime
Files, shell, web, MCP, tasks, and subagents are invoked behind one contract.
Observed result
Hooks fire, results return to the model, and the loop continues or stops.
Study section
The README describes the project as a harness architecture spread across runtime subsystems. Read the directories below less like folders and more like responsibility boundaries.
engine/
Streams responses, detects tool_use, and keeps the execution cycle composable.
tools/
Forty-three plus tools cover file work, shell access, search, web fetch, MCP, and task orchestration.
skills/
Markdown skills load only when needed so the harness can stay lightweight until specialization is required.
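Lazy loading of markdown skills can be sketched as a small cache keyed by skill name. The `SkillLoader` class and its file layout here are assumptions for illustration; only the load-on-first-use behavior reflects the repo's description.

```python
import tempfile
from pathlib import Path

class SkillLoader:
    """Illustrative loader: skills are markdown files read only on first use."""
    def __init__(self, root: Path):
        self.root, self._cache = root, {}

    def available(self):
        # Listing names is cheap; no file contents are read here.
        return sorted(p.stem for p in self.root.glob("*.md"))

    def load(self, name: str) -> str:
        if name not in self._cache:           # hit disk only once per skill
            self._cache[name] = (self.root / f"{name}.md").read_text()
        return self._cache[name]

root = Path(tempfile.mkdtemp())
(root / "refactor.md").write_text("# Refactor skill\nSplit large functions.")
skills = SkillLoader(root)
names = skills.available()
body = skills.load("refactor")
```

The harness stays lightweight because `available()` only enumerates names; content enters the prompt bundle only when a skill is actually invoked.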
plugins/
Commands, hooks, agents, and MCP servers keep Claude-style extensions portable.
permissions/
Modes, path rules, and denied commands form the guardrails between intent and execution.
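A minimal sketch of that guardrail layer, assuming glob-style rules. The `PermissionGate` class, its rule schema, and the `"ask"`/`"auto"` mode names are invented for this page, not the real OpenHarness configuration format.

```python
import fnmatch

class PermissionGate:
    """Illustrative gate: mode + path rules + command deny list."""
    def __init__(self, mode="ask", allowed_paths=("./**",), denied_commands=("rm -rf *",)):
        self.mode = mode
        self.allowed_paths = allowed_paths
        self.denied_commands = denied_commands

    def check(self, tool: str, target: str) -> str:
        # Deny list wins over everything else.
        if tool == "Bash" and any(fnmatch.fnmatch(target, p) for p in self.denied_commands):
            return "deny"
        # File tools must stay inside permitted paths.
        if tool in ("Read", "Write", "Edit"):
            if not any(fnmatch.fnmatch(target, p) for p in self.allowed_paths):
                return "deny"
        return "allow" if self.mode == "auto" else "ask"

gate = PermissionGate(mode="auto")
verdicts = [
    gate.check("Bash", "rm -rf /tmp/x"),   # matches the deny list
    gate.check("Read", "./src/main.py"),   # inside allowed paths
    gate.check("Write", "/etc/passwd"),    # outside allowed paths
]
```

The useful property to notice: the gate sits between intent and execution, so the model never needs to know the rules to be constrained by them.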
hooks/
PreToolUse and PostToolUse events let the runtime observe or shape behavior around execution.
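The event names PreToolUse and PostToolUse come from the repo; the dispatcher below is a sketch of how such hooks could observe or veto execution, with an invented `HookBus` interface.

```python
from collections import defaultdict

class HookBus:
    """Illustrative hook dispatcher around tool execution."""
    def __init__(self):
        self._hooks = defaultdict(list)

    def on(self, event, fn):
        self._hooks[event].append(fn)

    def fire(self, event, payload):
        for fn in self._hooks[event]:
            verdict = fn(payload)
            if event == "PreToolUse" and verdict is False:
                return False                  # a pre-hook can block the tool call
        return True

bus = HookBus()
audit = []
bus.on("PreToolUse", lambda p: audit.append(("pre", p["tool"])) or True)
bus.on("PostToolUse", lambda p: audit.append(("post", p["tool"])))

if bus.fire("PreToolUse", {"tool": "Bash"}):
    # ... tool executes here ...
    bus.fire("PostToolUse", {"tool": "Bash"})
```

Observation (the audit log) and shaping (the veto) share one mechanism, which keeps hook semantics easy to reason about.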
memory/
Persistent memory and session resume preserve working context across interactions.
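Session resume reduces to persisting the transcript and reloading it on the next run. The file layout and function names below are assumptions; only the save/resume idea comes from the repo.

```python
import json
import tempfile
from pathlib import Path

def save_session(path: Path, messages: list) -> None:
    """Persist the transcript so a later run can resume with context intact."""
    path.write_text(json.dumps(messages))

def resume_session(path: Path) -> list:
    """Return the saved transcript, or an empty one for a fresh session."""
    return json.loads(path.read_text()) if path.exists() else []

session = Path(tempfile.mkdtemp()) / "session.json"
save_session(session, [{"role": "user", "content": "inspect the repo"}])
restored = resume_session(session)
```

Because the loop's only state is the message list, resuming is just re-entering the loop with a restored list.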
tasks/
Task lifecycle primitives keep long-running or delegated work visible and queryable.
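A lifecycle store behind TaskCreate/TaskList/TaskStop-style tools can be sketched as a registry of status records. The `TaskRegistry` class and its record shape are invented for illustration.

```python
import itertools

class TaskRegistry:
    """Illustrative store keeping delegated work visible and queryable."""
    def __init__(self):
        self._ids = itertools.count(1)
        self._tasks = {}

    def create(self, description: str) -> int:
        task_id = next(self._ids)
        self._tasks[task_id] = {"description": description, "status": "running"}
        return task_id

    def list(self) -> dict:
        return dict(self._tasks)              # snapshot, queryable at any time

    def stop(self, task_id: int) -> None:
        self._tasks[task_id]["status"] = "stopped"

tasks = TaskRegistry()
tid = tasks.create("index the repository")
tasks.stop(tid)
snapshot = tasks.list()
```

The visibility guarantee is the point: long-running work never disappears into a background thread the model cannot query.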
coordinator/
Subagent spawning and team coordination make the harness more than a single loop.
ui/
The React/Ink TUI gives the system a conversational shell with permission dialogs and command picking.
Tool categories
OpenHarness is useful to study because it does not stop at text generation. It wires many execution families through one permissioned interface.
File I/O
Bash, Read, Write, Edit, Glob, Grep
Search
WebFetch, WebSearch, ToolSearch, LSP
Agents and Tasks
Agent, SendMessage, TeamCreate, TaskCreate, TaskList, TaskStop
MCP and Modes
MCPTool, ListMcpResources, ReadMcpResource, EnterPlanMode, Worktree
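The "one permissioned interface" claim can be sketched as a registry where every tool family shares the same registration and dispatch path. The `ToolRegistry` shape is an assumption; the tool names match the categories above.

```python
class ToolRegistry:
    """Illustrative registry: one dispatch contract for every tool family."""
    def __init__(self):
        self._tools = {}

    def register(self, name, fn, category):
        self._tools[name] = {"fn": fn, "category": category}

    def categories(self) -> dict:
        cats = {}
        for name, tool in self._tools.items():
            cats.setdefault(tool["category"], []).append(name)
        return cats

    def invoke(self, name, **kwargs):
        # Same call path whether the tool touches files, the web, or MCP.
        return self._tools[name]["fn"](**kwargs)

reg = ToolRegistry()
reg.register("Read", lambda path: f"<{path}>", "file-io")
reg.register("WebFetch", lambda url: f"[{url}]", "search")
result = reg.invoke("Read", path="README.md")
```

Gates and hooks only need to wrap `invoke` once to cover all forty-three-plus tools.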
Study section
A good study site should not just show features. It should suggest how to read the project depending on why you opened it.
For researchers
Study how a practical agent harness separates model intelligence from operational scaffolding.
For builders
Use OpenHarness as a blueprint for tools, skills, plugins, and permissioned execution.
For operators
Treat it as an inspectable local runtime that can be scripted, resumed, and tested.
Study section
The repo is intentionally easy to boot: one install command, one demo prompt, then provider switching through either flags or environment variables.
Install
curl -fsSL https://raw.githubusercontent.com/HKUDS/OpenHarness/main/scripts/install.sh | bash
One-command demo
ANTHROPIC_API_KEY=your_key uv run oh -p "Inspect this repository and list the top 3 refactors"
Setup sequence
Detect the OS and verify Python plus Node availability.
Install the Python package and optionally the React TUI dependencies.
Create the local OpenHarness config directory.
Launch with a provider profile or environment variables.
Provider compatibility
Anthropic format
Default Claude-oriented path. Also supports Anthropic-compatible gateways such as Moonshot, Vertex-style, and Bedrock-style endpoints.
OpenAI-compatible
Works with OpenAI, DashScope, DeepSeek, GitHub Models, SiliconFlow, Groq, Ollama, and more through /v1/chat/completions.
GitHub Copilot
Uses GitHub OAuth device flow, stores session auth, and avoids direct API key setup for the backend.
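For the OpenAI-compatible path, a request is just a POST to `/v1/chat/completions` with a model name and message list. The helper below only builds the URL and payload (no network call); the base URL and model name are placeholder examples.

```python
import json

def chat_completions_request(base_url: str, model: str, messages: list) -> tuple:
    """Build (url, body) for an OpenAI-compatible /v1/chat/completions call."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    body = json.dumps({"model": model, "messages": messages, "stream": True})
    return url, body

url, body = chat_completions_request(
    "http://localhost:11434",                 # e.g. a local Ollama endpoint
    "llama3",                                 # model name is a placeholder
    [{"role": "user", "content": "hello"}],
)
```

Because every listed backend speaks this same payload shape, switching providers is mostly a matter of changing `base_url`, `model`, and the API key.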
Study section
The strongest signal in the repository is not just breadth. It is the combination of breadth, test scaffolding, extension hooks, and a readable mental model.
6
CLI flags E2E
Real-model coverage for command-line behavior and options.
9
Harness E2E
Retry, skills, permissions, and parallel execution are explicitly exercised.
7
TUI suites
Welcome flow, conversation flow, status, commands, shortcuts, and permission interactions.
12
Real extensions
Skills and plugin compatibility are tested against actual external packages.
Three closing reads
OpenHarness is most interesting when read as a decomposed agent runtime, not just as another CLI tool.
Its biggest lesson is that serious agent behavior emerges from many small operational decisions: schemas, permissions, prompts, hooks, lifecycle management, and user-facing affordances.
That makes it a strong study target for anyone comparing open harness designs or planning a runtime of their own.
Source notes
This page is a design-inspired study aid, not an official OpenHarness microsite.