~/your-repo · zsh
# 1. scaffold the four Ralph artifacts in your repo
$ npx ralpharium init
↳ created PROMPT.md, AGENTS.md, IMPLEMENTATION_PLAN.md, specs/, .ralph/
 
# 2. boot the daemon — opens browser at localhost:3000
$ npx ralpharium start
↳ daemon up · websocket open · http://localhost:3000
 
# 3. (optional) other entry points
$ npx ralpharium dashboard
$ npx ralpharium ram
$ npx ralpharium start --port=4000
Live on npm · npx ralpharium init · runs entirely on your machine

AI coding loops, under control.

Ralpharium is a small local app that runs AI coding loops — Claude, Codex, Aider, or any shell command — and gives you a dashboard to start, watch, and stop them safely. See what the AI changed, why it stopped, and whether it's safe to keep going. Nothing leaves your machine.

  • Local · runs on your machine
  • Any AI · Claude · Codex · Aider · custom
  • One-click · start / pause / stop
  • Test-gated · stops on test failure
  • Replay · inspect any iteration
About Ralpharium

The Ralph Loop, made observable.

Ralpharium is a local-first control plane for autonomous AI coding loops. You point it at a repo, pick a runner (Claude / Codex / Aider / any shell command), and click Start. It runs the AI in a tight cycle, gates each pass on tests and lint, and shows you exactly what changed — every iteration, every commit, every failure — in a dashboard you control. Nothing leaves your machine.

What it is, in one screen

A local-first dashboard for autonomous AI coding loops.

A small Python daemon

Runs in your terminal at localhost:3000. Spawns your AI runner as a subprocess, watches stdout, parses output, persists every iteration. No cloud, no accounts, no telemetry.
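The core move (spawn the runner as a subprocess, watch its stdout) can be sketched in a few lines of Python. This is an illustrative sketch, not Ralpharium's source: the "error" sniffing stands in for the real output parser, and the demo runs a harmless Python one-liner instead of an AI CLI.

```python
import subprocess
import sys

def run_iteration(cmd: list[str], cwd: str = ".") -> tuple[int, list[str]]:
    """Spawn a runner command and stream its stdout line by line."""
    proc = subprocess.Popen(
        cmd, cwd=cwd,
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True,
    )
    lines = []
    for line in proc.stdout:            # arrives as the runner prints it
        line = line.rstrip("\n")
        lines.append(line)
        if "error" in line.lower():     # naive stand-in for the real parser
            print("! possible failure:", line)
    proc.wait()
    return proc.returncode, lines

# demo with a harmless command instead of a real AI runner
code, out = run_iteration([sys.executable, "-c", "print('hello'); print('done')"])
```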

An operator dashboard

Start / pause / panic-stop the loop. Iteration timeline, plan health, validation backpressure, spec coverage, and a live event stream — all updating without a refresh.

8 specialized agents in shared memory

Spec writer, researcher, planner, builder, reviewer, debugger, magpie, tagger. Each owns a slot in a 64KB multiprocessing.shared_memory segment and is observable on the RAM page.

Replayable iteration history

Every loop pass is appended to .ralph/iterations.jsonl: mode, status, files changed, commit SHA, validation, last 4KB of runner output. Click any row to inspect.
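Because the log is plain JSONL, replay tooling is a few lines of Python. A sketch, assuming the default log path and the field names from the iteration schema (status, commit_sha, and so on):

```python
import json
from pathlib import Path

def load_iterations(path: str = ".ralph/iterations.jsonl") -> list[dict]:
    """Read the append-only log: one JSON object per line."""
    p = Path(path)
    if not p.exists():
        return []
    return [json.loads(ln) for ln in p.read_text().splitlines() if ln.strip()]

def failures(iterations: list[dict]) -> list[dict]:
    """Filter to failed passes, e.g. to hunt a recurring error."""
    return [it for it in iterations if it.get("status") == "failed"]
```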

Not a replacement for Claude / Codex / Aider — it runs them. Pick the runner, configure the command, click Start. Ralpharium provides the context, gates, observation, and stop conditions around it.
The original

Vanilla Ralph Loop — powerful, but rough.

The Ralph Wiggum technique by @ghuntley is a methodology: run Claude in a bash while loop, treat it as a capable-but-naive executor, steer behavior with engineered context (PROMPT.md, AGENTS.md, IMPLEMENTATION_PLAN.md, specs/*.md) re-read every iteration. It works — but it has real pain points:

  • Black box. All you see is bash stdout streaming past. No way to inspect intermediate state.
  • One monolithic prompt. Plan, code, validate, classify — all done in a single call. No specialization.
  • Manual stop only. If Ralph rewrites the same file 5 times with the same error, you don't notice until you Ctrl-C and read the log.
  • No replay. Past iterations live in git log at best. The reasoning, validation output, and prompts used are gone.
  • No structured guardrails. Stop conditions are whatever your bash script remembers to check.

Original playbook: github.com/ghuntley/how-to-ralph-wiggum

Ralpharium's upgrades

What we kept, fixed, and added.

  • From black box → glass box.

    Every part of an iteration is observable in real time. Live blackboard, hex view of the shared-memory segment, event ring buffer, per-agent drill-down with prompt + decision history, replayable JSONL on disk.

  • From 1 monolithic loop → 8 specialized agents.

    spec_writer + researcher handle phase-1 context. planner + builder drive the build cycle. reviewer + debugger own validation backpressure. magpie + tagger capture and classify each iteration's output. Each has its own state, latency, and history.

  • From .md files on disk → 64KB shared memory.

    Agents pass structured work through an OS-level multiprocessing.shared_memory segment instead of re-reading the same files every iteration. Fewer disk hits, instant cross-agent visibility, observable as live JSON.

  • From manual Ctrl-C → real stop conditions.

    One-toggle gates: stop_on_failure, stop_if_no_commit, stop_if_dirty_before_run, max_iterations. Plus a thrash detector that pauses the loop when the same files are modified 3+ times with the same failure — before more tokens get burned.

  • From "tail bash logs" → replayable iterations.

    Append-only .ralph/iterations.jsonl stores every iteration's mode, status, files changed, validation result, command output (last 4KB), commit SHA, and failure reason. Click any row in the timeline to inspect or re-run.

  • From single-shell session → REST + WebSocket API.

    Any tool can POST /api/iterations to start a pass and PATCH when it finishes. The dashboard subscribes to a live WebSocket — multiple tabs and external clients see the same state.
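Driving the API from a script looks roughly like this. The POST /api/iterations endpoint is the one named above; the JSON body shown here (a single mode field) is a guess for illustration, not the documented payload.

```python
import json
import urllib.request

BASE = "http://localhost:3000"

def start_iteration(mode: str = "build") -> urllib.request.Request:
    # Body fields are illustrative guesses; only the endpoint is documented.
    payload = json.dumps({"mode": mode}).encode()
    return urllib.request.Request(
        f"{BASE}/api/iterations", data=payload,
        headers={"Content-Type": "application/json"}, method="POST",
    )

try:
    with urllib.request.urlopen(start_iteration(), timeout=2) as resp:
        print(resp.status)
except OSError:
    print("daemon not running")  # expected outside a live session
```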

Inside an iteration

What actually happens when you click Start.

A single iteration is one full pass of the 8-agent cycle. The daemon orchestrates it; the dashboard streams every step over WebSocket. Here's the sequence, end to end:

  1. A · Pre-flight — Planner + Researcher

    The Planner reads IMPLEMENTATION_PLAN.md and picks the first unchecked task. The Researcher scans the repo (branch, dirty state, spec count) and writes that context to the shared blackboard so every other agent sees the same snapshot.

  2. B · Build phase — Builder runs the subprocess

    The Builder spawns your runner command (e.g. claude -p "$(cat PROMPT.md)") with cwd set to your repo. Stdout streams line-by-line into the RAM event log, parsed for errors and test mentions. The blackboard's last_error, test_output, and pid slots update in real time.

  3. C · Backpressure — Reviewer + Debugger

    When the subprocess exits, the Reviewer runs your validation gates (npm test / lint / typecheck / build) auto-detected from package.json. If anything fails, the Debugger classifies the failure (which check, what kind of error) and writes it to the blackboard. With stop_on_failure on, the loop halts here.

  4. D · Post-loop — Magpie + Tagger

    If the iteration produced a real commit, the Magpie collects the artifacts (commit SHA, files changed, scratchpad notes) and the Tagger classifies the iteration (feature / fix / refactor / docs) by looking at file paths and the iteration summary.

  5. E · Persist + broadcast

    The full iteration record (mode, status, files changed, commit SHA, validation results, last 4KB of runner output, failure reason) is appended to .ralph/iterations.jsonl. The daemon broadcasts a fresh snapshot over WebSocket — your dashboard timeline, RAM agent grid, plan health, and event stream all update without a refresh.

  6. F · Loop or stop

    The loop checks stop conditions (max iterations hit, panic, dirty tree, no commit, validation failed). If clean, it goes back to step A and picks the next plan task. The thrash detector watches for "same files modified with the same error 3+ times" and pauses the loop before tokens are wasted.

Key idea: the AI runner does the actual code work. Ralpharium provides the context, validation, observation, and stop conditions around it — turning a one-shot prompt into a steerable, replayable loop.
How it works

From terminal to dashboard in five steps.

The whole product fits on one chalkboard. No accounts, no cloud — Ralpharium runs on your machine, talks to your AI CLI, and shows you what happened.

  1. Scaffold the repo

    Run inside any folder. Creates PROMPT.md, AGENTS.md, IMPLEMENTATION_PLAN.md, specs/, and .ralph/.

    npx ralpharium init
  2. Boot the daemon

    Spawns the FastAPI server, opens the dashboard automatically at localhost:3000.

    npx ralpharium start
  3. Pick a runner + click Start

    In Loop configuration, choose Claude / Codex / Aider / custom. Empty command = safe fallback (so you can demo without auth).

  4. Watch it work

    Iteration timeline, plan health, validation, agent activity stream live. Open npx ralpharium ram for the under-the-hood debug view.

    npx ralpharium ram
  5. Stop or continue

    Pause between iterations with Space, panic-stop with Ctrl/⌘+P, or let it loop until your stop conditions hit.

Under the hood

The tech.

Backend

Python 3.11+ FastAPI daemon. WebSocket for live updates, REST for first paint. Single Uvicorn process — no database, no message queue.

Shared memory

An OS-level multiprocessing.shared_memory segment holds the live blackboard JSON. The 8 agents read and write structured slots; everything is observable in real time.
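A minimal version of that blackboard is easy to sketch with the standard library. Assumptions in this sketch: the segment holds NUL-terminated JSON and there is no cross-process locking; the real slot layout and synchronization are Ralpharium's own.

```python
import json
from multiprocessing import shared_memory

SIZE = 65536  # 64KB, as above

def write_blackboard(shm: shared_memory.SharedMemory, state: dict) -> None:
    raw = json.dumps(state).encode()
    assert len(raw) < SIZE, "blackboard overflow"
    shm.buf[:len(raw)] = raw
    shm.buf[len(raw)] = 0     # NUL terminator marks the end of the JSON
    # a real implementation also needs a lock between writer processes

def read_blackboard(shm: shared_memory.SharedMemory) -> dict:
    raw = bytes(shm.buf).split(b"\x00", 1)[0]
    return json.loads(raw) if raw else {}

shm = shared_memory.SharedMemory(create=True, size=SIZE)
try:
    write_blackboard(shm, {"planner": {"task": "next unchecked item"}})
    print(read_blackboard(shm))
finally:
    shm.close()
    shm.unlink()
```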

Frontend

Vanilla JS + CSS. No framework, no build step, no bundler. The dashboard at /dashboard and the live debug view at /ram are static files served by the daemon.

Persistence

Append-only JSONL at .ralph/iterations.jsonl. Every iteration's mode, status, files changed, validation result, and commit SHA — replayable, greppable, gitignorable.

Backpressure

Auto-detects npm test / lint / typecheck / build from package.json (and Python equivalents). One-click validation, with optional stop-on-failure.
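The auto-detection can be approximated by scanning package.json scripts for well-known names. A sketch: the script names checked come from the list above, and mapping each to npm run <name> is an assumption, not Ralpharium's exact logic.

```python
import json
from pathlib import Path

CANDIDATES = ("test", "lint", "typecheck", "build")

def detect_checks(repo: str = ".") -> list[str]:
    """Map package.json scripts to runnable validation commands."""
    pkg = Path(repo) / "package.json"
    if not pkg.exists():
        return []
    scripts = json.loads(pkg.read_text()).get("scripts", {})
    return [f"npm run {name}" for name in CANDIDATES if name in scripts]
```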

CLI

Tiny Node launcher (bin/ralph-studio.js) that spawns the Python daemon. The npm bin is ralpharium (and ralph-studio for back-compat).

Plain English

Five words you'll see everywhere.

Runner
The AI coding CLI you choose — Claude Code, Codex, Aider, OpenRouter via aider, or any custom shell command.
Loop
One run of the runner. A "continuous loop" is repeated runs with stop conditions you set.
Dashboard
The control room at /dashboard. Start, pause, stop. Watch iterations stream in.
RAM
The live debug page at /ram. Real-time internals — 8 agents, blackboard, event stream, hex viewer.
Backpressure
Tests, lint, typecheck, and git checks that gate the loop. If they fail, Ralpharium stops the run before bad code lands.
Install

Two commands to a working dashboard.

Live on npm as ralpharium. Drop it into any repo, init scaffolds the files, start opens the control room at localhost:3000. Nothing leaves your machine.

Requires
  • Node 18+ — comes with npx. Check: node -v
  • Python 3.11+ — backend daemon. Check: python --version (or python3 --version). Get it at python.org if missing.
  • Optional: Claude / Codex / Aider CLI — only needed for real iterations. Empty runner command = safe fallback so you can demo without auth.

FastAPI & uvicorn auto-install on first run — you don't have to do anything.

npm install -g ralpharium for a global install. Requires Node 18+ and Python 3.11+.
Per iteration

Every run leaves a trail.

Eight things are tracked every time the AI runs: the prompt it read, the plan tasks it touched, the specs in scope, what it ran, what passed or failed, what it committed, the output you can replay, and the guardrails suggested from it.

  • 01 Prompt · PROMPT.md re-read · input
  • 02 Plan · tasks parsed · state
  • 03 Spec match · specs/* mapped · scope
  • 04 Run · runner subprocess · execute
  • 05 Validate · tests · lint · build · backpressure
  • 06 Commit · atomic git commit · record
  • 07 Replay · drill into output · audit
  • 08 Guardrail · rules suggested · harden
Files the AI reads

A few markdown files, parsed for you.

The AI reads a small set of files in your repo each iteration — Ralpharium reads them too, so you always know what the AI is working from.

PROMPT.md

per-iteration instruction

Re-read at the start of every iteration. Studio shows file presence, last-modified, and the active prompt mode.

artifact · required

AGENTS.md

operational rules

Build/test commands, scope limits, commit hygiene. Studio surfaces it and proposes new rules from failure history.

artifact · recommended

IMPLEMENTATION_PLAN.md

persistent state

Parsed into tasks: completed, pending, blocked, stale. Drift and repeated-work warnings raised automatically.

artifact · parsed

specs/*.md

source of truth

Each spec is mapped against plan tasks and recent commits — covered, partial, drifting, or ignored.

artifact · coverage

Iteration log

append-only history

Every iteration's mode, status, files changed, commit SHA, validation result, and command output — replayable.

history · JSONL

Backpressure

validation gate

Auto-detects npm test, lint, typecheck, build, plus working-tree status. One click to run.

checks · auto-detected

Commits

atomic record

Recent commits surfaced inline. Iterations link to the SHA they produced — or flag if no commit was created.

git · per iteration

Guardrails

learned rules

Repeat-failure heuristics turn into AGENTS.md suggestions. "Tests failed 3×, require npm test."

suggestions · history-driven
  iteration anatomy · JSONL

  field            type           role
  id · number      str · int      unique handle, monotonic counter
  mode             plan / build   which prompt was used
  status           enum           running · passed · failed · stopped
  files_changed    list[str]      derived from commit
  commit_sha       str?           linked to git history
  validation       list[check]    tests/lint/typecheck results
  command_output   str (tail)     last 4KB of runner stdout
  failure_reason   str?           only on failure
Iteration anatomy

Every loop pass is replayable.

When Ralph finishes — or fails — Studio writes a structured record to .ralph/iterations.jsonl. Click any row in the timeline to see the prompt used, the plan diff, the validation output, and the resulting commit.

  • Append-only.

    Local JSONL. Easy to tail -f, easy to ship to wherever later.

  • External hooks.

    POST /api/iterations from any CLI to start an iteration; PATCH when it finishes.

  • Convergence signals.

    Failure clustering, dirty trees, missing commits — surfaced as guardrail suggestions.

Dashboard

What the dashboard does.

Six clearly-labelled cards: loop status, iteration timeline, plan health, validation, spec coverage, and a live runtime panel. Click anything for full detail.

Loop status header

Mode, iteration, runner, branch, dirty tree — and one-click start, pause, stop, panic.

controls · WebSocket

Iteration timeline

Recent passes with mode, status, duration, files changed, commit SHA. Click to replay.

history · local

Plan health

Tasks parsed from IMPLEMENTATION_PLAN.md. Next, blocked, stale. Drift warnings.

parser · regex

Backpressure

Auto-detected validation. Tests, lint, typecheck, build, git clean. Run any of them on demand.

subprocess · 1-click

Spec coverage map

Each spec is covered, partial, drifting, or ignored — derived from plan + commits.

heuristic · per file

Guardrails

Surfaces PROMPT.md and AGENTS.md, plus rules learned from repeat failures.

suggestions · history-driven
Roadmap

What's shipping next.

Built in the open. Every milestone is real work — no vapor, no "trust me bro." If a phase says shipped, it's already on your machine when you install. If it says next, the spec is written and the wiring is half-done.

Phase 0 · Foundations

Shipped
  • Local FastAPI daemon + WebSocket dashboard
  • Iteration JSONL store at .ralph/iterations.jsonl
  • Plan parser, spec coverage, validation backpressure
  • RAM page: blackboard, event stream, scratchpad, checkpoints
Phase 1 · 8 agents, made observable

Shipped
  • AgentRoster with 8 fixed roles wired into the iteration lifecycle
  • 8-card grid with click-through prompt + decision history
  • Thrash detector — "Ralph keeps Ralphing the wrong thing"
  • Live broadcast on every agent transition + localStorage cache
Phase 2 · Real autonomy

Next · in progress
  • Cost telemetry per iteration. Parse runner stdout for tokens, surface $/iteration, $/session, $/spec_completed.
  • Git-worktree-per-iteration sandbox. No edits land on main until validation passes. Auto-rollback on failure.
  • Bidirectional question channel. Agent writes a question, loop pauses, you reply, loop resumes — without losing context.
  • Time + token budgets. Kill runaway iterations after N seconds or M tokens.
Phase 3 · Multiplayer + resilience

Soon
  • Daemon crash recovery. Replay iterations.jsonl on boot — pick up exactly where the loop died.
  • Multi-client presence. Two devs watching the same loop see each other's cursors and selections.
  • Replay on a different model. Re-run a past iteration on Sonnet vs Opus vs Haiku and diff the outputs.
  • Real per-agent LLM calls. Phase 1 synthesizes agent activity from lifecycle events. Phase 3 = each of the 8 agents is a real LLM call with its own system prompt and history.

North star

The dream
  • Self-spec'ing loops. The runner writes specs/*.md from a chat thread, then builds against them. Phase-1 of Wiggum, automated.
  • Multi-repo orchestration. One Ralpharium daemon coordinating loops across a monorepo's services or a fleet of microrepos.
  • Visual diff approval. Each commit lands on a feature branch; dashboard shows the diff with one-click approve/revert.
  • Public agent marketplace. Drop-in custom agents (security_reviewer, perf_auditor, accessibility_critic) you compose into your roster.
Want something on this list sooner? File an issue on GitHub or open a PR. Local-first stays local — but the roadmap is open.
Ready when you are

Run a loop. Watch what changes.

Open the dashboard, pick a runner, click Start. If something fails, the loop stops. If it commits, you see the SHA. Everything stays on your machine.