AITP-Research-Protocol

AITP Research Protocol transforms AI coding agents into disciplined theoretical-physics collaborators.

<div align="center"> <img src="./.github/assets/aitp-mark.svg" width="116" alt="AITP AI monogram" /> <h1>AITP Research Charter and Kernel</h1> <p><strong>A research protocol that turns your AI coding agent into a disciplined theoretical-physics collaborator — one that keeps evidence separate from conjecture, remembers your topics across sessions, and only promotes results to trusted memory after you approve them.</strong></p> <p> <a href="#how-it-works">How it works</a> · <a href="#protocol-and-current-implementation">Protocol & Implementation</a> · <a href="#installation">Installation</a> · <a href="#current-use-paths">Using AITP</a> · <a href="#the-basic-workflow">Workflow</a> · <a href="#research-model">Research Model</a> · <a href="#whats-inside">What's Inside</a> · <a href="#philosophy">Philosophy</a> · <a href="#read-next">Docs</a> </p> </div> <p align="center"> <img src="./.github/assets/aitp-hero.svg" width="100%" alt="AITP protocol-first research kernel hero with AI monogram" /> </p> <p align="center"> <a href="https://github.com/bhjia-phys/AITP-Research-Protocol/stargazers"><img src="https://img.shields.io/github/stars/bhjia-phys/AITP-Research-Protocol?style=flat-square&labelColor=0B1220&color=FFC857" alt="GitHub stars" /></a> <a href="https://github.com/bhjia-phys/AITP-Research-Protocol/network/members"><img src="https://img.shields.io/github/forks/bhjia-phys/AITP-Research-Protocol?style=flat-square&labelColor=0B1220&color=7EE8F2" alt="GitHub forks" /></a> <a href="https://github.com/bhjia-phys/AITP-Research-Protocol/issues"><img src="https://img.shields.io/github/issues/bhjia-phys/AITP-Research-Protocol?style=flat-square&labelColor=0B1220&color=FF9B71" alt="GitHub issues" /></a> <a href="./LICENSE"><img src="https://img.shields.io/badge/license-MIT-EAF4FF?style=flat-square&labelColor=0B1220" alt="MIT License" /></a> <img src="https://img.shields.io/badge/kernel-L0--L4-9CF6EC?style=flat-square&labelColor=0B1220" alt="L0 to L4 kernel" /> <img src="https://img.shields.io/badge/research-protocol--first-8BEFF7?style=flat-square&labelColor=0B1220" alt="Protocol-first" /> </p>

How it works

It starts when you describe what you want to study — in plain language, the way you would explain it to a colleague. AITP takes that description and turns it into a bounded research topic with a clear question, scope, and validation plan. You don't need to learn any special commands; just talk.

Once the topic is set up, your AI agent does the actual research work — gathering sources, reading papers, sketching derivations, or running benchmarks — inside a layered protocol that keeps every piece of evidence traceable. Exploratory notes and tentative claims stay clearly labeled as such. Nothing gets promoted to "trusted" status just because the agent sounds confident.

When the work reaches a natural checkpoint, AITP presents what it found, what gaps remain, and whether the results are ready for your review. You decide what gets promoted to reusable memory. This is the human gate — the agent executes, but you own the trust decisions.

Everything is durable. You can close your laptop, come back days later, say "continue this topic," and the agent picks up exactly where it left off — with full context of what was done, what was decided, and what is still open. For the full experience, see docs/USER_TOPIC_JOURNEY.md.

Protocol and Current Implementation

This repository contains both:

  1. the AITP research protocol
  2. the current public reference implementation of that protocol

The protocol is the durable contract: layers, trust boundaries, promotion rules, runtime artifacts, and the rule that evidence stays separate from conjecture.

The implementation is what you actually run today:

  • the installable kernel under research/knowledge-hub/
  • the aitp and aitp-mcp entrypoints
  • the runtime scripts and topic-shell materialization
  • the current Codex / OpenCode / Claude Code / OpenClaw front doors

The important boundary is:

  • the protocol is the durable model
  • the current Python kernel and front-door integrations are one implementation of that model

Another implementation would still count as AITP only if it preserves the same layer semantics, durable artifact model, evidence boundaries, and governed promotion into trusted memory.

Installation

Kernel Install

python -m pip install aitp-kernel
aitp --version
aitp doctor

Then install the platform adapter you use:

Codex (recommended)

aitp install-agent --agent codex --scope user

See .codex/INSTALL.md for details.

OpenCode

Add to your opencode.json:

{ "plugin": ["aitp@git+https://github.com/bhjia-phys/AITP-Research-Protocol.git"] }

See .opencode/INSTALL.md for details.

Claude Code

aitp install-agent --agent claude-code --scope user

This path now installs both the SessionStart bootstrap and the native Claude Code AITP MCP registration.

See docs/INSTALL_CLAUDE_CODE.md for details.

OpenClaw

aitp install-agent --agent openclaw --scope user

See docs/INSTALL_OPENCLAW.md for details.

For contributor/local-dev editable installs, Windows-specific instructions, migration from older installs, and troubleshooting, see docs/INSTALL.md.

What aitp doctor Means

aitp doctor is the current install/bootstrap truth surface.

Use it to confirm:

  • the kernel is installed correctly
  • the fixed protocol roots are present
  • your selected front door is wired correctly
  • the current machine has enough structure to run the bounded first-use path

Important boundary:

  • aitp doctor proves front-door readiness
  • it does not by itself prove deep-execution parity across every runtime

For machine-readable inspection, run:

aitp doctor --json

Current Use Paths

The current implementation supports three honest entry styles.

1. Agent-front-door usage

After installing the adapter for Codex, OpenCode, Claude Code, or OpenClaw, you can start from a natural-language request inside that agent surface.

That is the most protocol-native experience:

  • describe the idea
  • let AITP route it into a bounded topic
  • continue through the topic shell and runtime protocol artifacts

2. Explicit topic bootstrap

If you want the most direct CLI path from idea to topic:

aitp bootstrap --topic "<topic>" --statement "<initial idea or question>"
aitp loop --topic-slug <topic_slug> --human-request "Continue with the next bounded step."
aitp status --topic-slug <topic_slug>

This is the current verified first-run path for the public kernel.

Useful follow-up reads:

  • aitp replay-topic --topic-slug <topic_slug>
  • aitp capability-audit --topic-slug <topic_slug>
  • aitp paired-backend-audit --topic-slug <topic_slug>
  • aitp h-plane-audit --topic-slug <topic_slug>
  • aitp compile-source-catalog
  • aitp trace-source-citations --canonical-source-id <canonical_source_id>
  • aitp compile-source-family --source-type paper
  • aitp export-source-bibtex --canonical-source-id <canonical_source_id>
  • aitp import-bibtex-sources --topic-slug <topic_slug> --bibtex-path <path-to-bib-file>

3. Lightweight idea-first exploration

If the idea is still too loose for full topic bootstrap:

aitp explore "Sketch the idea before opening a full topic loop."

That writes a lightweight exploration carrier instead of the full topic shell. When the idea becomes specific enough, promote it into normal topic work with promote-exploration or start a full topic explicitly.

The Basic Workflow

  1. Topic bootstrap — You describe what you want to study. AITP sets up a research topic with a bounded question, scope, and validation contract. No special commands needed.

  2. Source acquisition (L0) — AITP gathers papers, notes, and upstream references. Everything is traceable back to its origin.

  3. Analysis and exploration (L1, L3) — The agent reads, annotates, sketches derivations, or runs benchmarks. Exploratory outputs are clearly labeled as candidates, not conclusions.

  4. Validation and trust audit (L4) — When results look promising, AITP runs explicit checks — consistency, convergence, reproduction — before asking whether the work is ready.

  5. Promotion to reusable memory (L2) — Only after your explicit approval does material move into long-term trusted memory. The agent cannot promote on its own.

Research Model

AITP keeps research state in layers instead of flattening everything into one chat transcript.

LayerPurposeWhat goes here
L0Source acquisitionpapers, notes, upstream code references
L1Provisional understandinganalysis notes, derivation sketches
L3Exploratory outputscandidate claims, tentative material
L4Validation and trust auditchecks, benchmarks, human decisions
L2Long-term trusted memorypromoted knowledge, reusable workflows

The default route is L0 → L1 → L3 → L4 → L2. Layer 2 is intentionally last — exploratory work does not become reusable memory just because the agent sounds confident.

flowchart LR
    A[Research question] --> B[aitp bootstrap]
    B --> C[L0 source traces]
    C --> D[L1 provisional analysis]
    D --> E[L3 candidate outputs]
    E --> F[L4 validation and trust audit]
    F --> G{Human approval gate}
    G -->|approved| H[L2 reusable memory]
    G -->|not yet| E

What's Inside

Three Research Lanes

The same protocol kernel drives different categories of theoretical-physics work.

LaneTypical inputsValidationTrusted output
Formal theory and derivationpapers, definitions, prior claimsproof-gap analysis, consistency checkssemi-formal theory objects, Lean-ready packets
Toy-model numericsmodel specs, observables, scriptsconvergence checks, benchmarksvalidated workflows, reusable operations
Code-backed algorithm developmentupstream codebases, existing methodsreproduction, trust audittrusted methods, backend writeback

Capabilities

  • Multi-topic runtime — Work on several research topics in one workspace. Switch between them with natural language.
  • Cross-session memory — Every topic survives session resets. Resume days later with full context.
  • Lean-ready export — Bridge validated theory results into Lean 4 declaration packets with proof-obligation sidecars.
  • Bounded autonomous execution — Run multi-step research loops with explicit human gates at decision points (OpenClaw).
  • L1 three-layer vault — Materialize raw/wiki/output intake vaults with explicit flowback receipts on the existing topic-shell path.
  • Statement compilation before proof repair — Compile bounded theory statements into declaration skeletons and explicit proof-repair plans before Lean-facing export.
  • L2 compiler helpers — Seed and inspect reusable knowledge views with commands like aitp seed-l2-direction --direction tfim-benchmark-first, aitp consult-l2 --query-text "TFIM exact diagonalization benchmark workflow" --retrieval-profile l3_candidate_formation, aitp compile-l2-graph-report, and aitp compile-l2-knowledge-report.

Runtime Support

RuntimeInstall pathRole
Codex.codex/INSTALL.mdBaseline — cleanest end-to-end experience
OpenCode.opencode/INSTALL.mdPlugin-based natural-language routing
Claude Codedocs/INSTALL_CLAUDE_CODE.mdSessionStart bootstrap
OpenClawdocs/INSTALL_OPENCLAW.mdBounded autonomous research loops

Run aitp doctor --json to check what is converged on your machine.

Current baseline: Codex. Parity target: Claude Code and OpenCode. Specialized lane: OpenClaw. aitp doctor reports front-door readiness only. Deep-execution parity is a separate surface. Across Codex, Claude Code, and OpenCode, the front door now publishes the same plain-language human-control posture and autonomous-continuation posture in session_start.generated.md and runtime_protocol.generated.md. When no real checkpoint is active, AITP should continue bounded work without ritual reconfirmation; in verify + iterative_verify, the bounded L3-L4 loop is allowed to keep cycling until success, a real blocker, or a real human checkpoint appears. The current bounded parity probes are available for Claude Code and OpenCode via python research/knowledge-hub/runtime/scripts/run_runtime_parity_acceptance.py --runtime <runtime> --json. The cross-runtime closure report is available via python research/knowledge-hub/runtime/scripts/run_runtime_parity_audit.py --json. The bounded L1 raw/wiki/output vault acceptance is available via python research/knowledge-hub/runtime/scripts/run_l1_vault_acceptance.py --json. The bounded L1 assumption/reading-depth acceptance is available via python research/knowledge-hub/runtime/scripts/run_l1_assumption_depth_acceptance.py --json. The bounded L1 method-specificity acceptance is available via python research/knowledge-hub/runtime/scripts/run_l1_method_specificity_acceptance.py --json. The bounded L1 concept-graph intake acceptance is available via python research/knowledge-hub/runtime/scripts/run_l1_concept_graph_acceptance.py --json. The bounded Layer 0 source-catalog acceptance is available via python research/knowledge-hub/runtime/scripts/run_source_catalog_acceptance.py --json. The bounded runtime transition/demotion acceptance is available via python research/knowledge-hub/runtime/scripts/run_transition_history_acceptance.py --json. The bounded promotion-gate human-modification acceptance is available via python research/knowledge-hub/runtime/scripts/run_human_modification_record_acceptance.py --json. The bounded competing-hypotheses acceptance is available via python research/knowledge-hub/runtime/scripts/run_competing_hypotheses_acceptance.py --json. The bounded statement-compilation acceptance is available via python research/knowledge-hub/runtime/scripts/run_statement_compilation_acceptance.py --json.

The machine-readable install view exposes:

  • runtime_convergence
  • deep_execution_parity
  • full_convergence_repair
  • runtime_support_matrix.runtimes.<runtime>.remediation
  • runtime_support_matrix.deep_execution_parity.runtimes.<runtime>.status

Windows local-checkout note:

  • scripts\aitp-local.cmd doctor
  • scripts\aitp-local.cmd bootstrap --topic "<topic>" --statement "<statement>"

Useful runtime audit entrypoints once a topic exists:

  • aitp capability-audit --topic-slug <topic_slug>
  • aitp paired-backend-audit --topic-slug <topic_slug>
  • aitp h-plane-audit --topic-slug <topic_slug>

Philosophy

  • Evidence before confidence — sources stay separate from speculation at every layer
  • Bounded steps, not freestyle — every unit of work has a clear question and scope
  • Humans own trust — nothing becomes reusable memory without explicit approval
  • Durable by default — research state survives session resets and machine changes
  • Light until it matters — ordinary work stays minimal; the runtime only expands when something important happens

Current Status

  • Ships one public reference implementation of the AITP protocol under research/knowledge-hub
  • Supports Codex, OpenCode, Claude Code, and OpenClaw front doors over the same kernel
  • Has a verified first-run path: bootstrap -> loop -> status
  • Has a verified lightweight idea-first path: explore -> promote-exploration
  • Has an explicit human approval gate before L2 promotion
  • Bridges into the Theoretical-Physics-Knowledge-Network formal-theory backend

What is mostly done:

  • protocol surface coverage for the current L0-L4 kernel
  • install/bootstrap/front-door adoption surface
  • runtime control plane, H-plane, layer graph, and route-transition visibility

What is not yet fully proven:

  • real-topic end-to-end research utility across L0 -> L1 -> L3 -> L4 -> L2
  • deep-execution parity on every non-Codex runtime
  • full maturity of the L2 knowledge surface and statement-compilation pipeline

Contributing

AITP stabilizes the research protocol, not one frozen implementation. Contributions that preserve the layer model, durable artifacts, evidence boundaries, and governed promotion gates are welcome.

See docs/CHARTER.md for what counts as disciplined AI-assisted theoretical-physics work. See docs/AITP_GSD_WORKFLOW_CONTRACT.md for the boundary between research-topic work in AITP and implementation work in GSD.

License

MIT License — see LICENSE file for details.

Read Next