# Coco-Wiki / CodeWiki Tokio Handoff

Date: 2026-06-22

This is the current handoff for the Tokio documentation comparison project. It records what was tried, what is currently hosted, where the relevant code lives, how CodeWiki generates its pages, how Coco-Wiki currently generates its pages, and how to rerun the important pieces.

## Hosted Result

- Coco-Wiki Tokio: <https://tokio-wiki-compare.pages.dev/coco/>
- CodeWiki Tokio snapshot: <https://tokio-wiki-compare.pages.dev/codewiki/>
- Side-by-side compare page: <https://tokio-wiki-compare.pages.dev/compare/>
- Handoff page: <https://tokio-wiki-compare.pages.dev/handoff/>
- Current verified preview deployment: <https://5434405a.tokio-wiki-compare.pages.dev/>

Current notable result:

- The Coco multi-thread scheduler page has been replaced with a CodeWiki-learned full-source prompt prototype.
- Page URL: <https://tokio-wiki-compare.pages.dev/coco/#tokio_runtime_scheduler_multi_thread>
- Markdown URL: <https://tokio-wiki-compare.pages.dev/coco/pages/tokio_runtime_scheduler_multi_thread.md>
- It is 21.6KB, contains 4 Mermaid diagrams, and was validated with Mermaid 11.9.0 after syntax fixes.

Important scope note:

- Only `tokio_runtime_scheduler_multi_thread` has the CodeWiki-learned full-source style applied.
- The other 23 Coco pages are still the previous strong-model rewrite over frozen `ComponentDossier` / `ModuleNotebook` artifacts.

## Local Paths

Main Coco-Wiki repo:

```text
/Users/zhhanz/Documents/rust/src/coco-wiki
```

Cloned CodeWiki source:

```text
/tmp/CodeWiki-src
```

Static comparison site root:

```text
/tmp/tokio-wiki-compare-site
```

Clean deploy directory, recreated before deploy:

```text
/tmp/tokio-wiki-compare-deploy
```

Coco generated deliverable:

```text
deliverables/tokio-research-wiki
```

Current important deliverable subpaths:

```text
deliverables/tokio-research-wiki/index.html
deliverables/tokio-research-wiki/manifest.json
deliverables/tokio-research-wiki/pages/
deliverables/tokio-research-wiki/pages-strong/
deliverables/tokio-research-wiki/pages-deterministic-v2/
deliverables/tokio-research-wiki/pages-codewiki-learned/
deliverables/tokio-research-wiki/artifacts/dossiers/
deliverables/tokio-research-wiki/artifacts/research/
deliverables/tokio-research-wiki/artifacts/notebooks/
deliverables/tokio-research-wiki/artifacts/sections/
deliverables/tokio-research-wiki/codewiki-learned-writer-prompt.md
```

Hosted site copy:

```text
/tmp/tokio-wiki-compare-site/coco/index.html
/tmp/tokio-wiki-compare-site/coco/pages/
/tmp/tokio-wiki-compare-site/coco/pages-codewiki-learned/
/tmp/tokio-wiki-compare-site/codewiki/index.html
/tmp/tokio-wiki-compare-site/assets/codewiki-pages.json
/tmp/tokio-wiki-compare-site/assets/codewiki-module-tree.json
/tmp/tokio-wiki-compare-site/handoff/index.html
```

## What Was Tried

1. Design review and v0.3 architecture direction

   The original `coco-wiki-DESIGN.md` was reviewed against RepoDoc, RepoSummary, Skyframe, Salsa, SCIP, RANGER, and Aider repo map ideas. The recommended design became:

   ```text
   Boundary-first Structural Tree
   + Capability Lenses
   + CodeIntel Graph
   + Capability Graph
   + Artifact Graph
   + Red-Green evaluation
   ```

2. Implementation correctness rounds

   Multiple implementation review rounds fixed root correctness issues around artifact persistence, verification gating, exact dirty checking, candidate impact vs exact impact, revision digests, dependency equality, Ask routing, CAS concurrency, relationship indexing, incremental snapshot restamping, build gating, and structured claims/evidence/verification storage.

   The later reported state was:

   ```text
   workspace tests: 205 passing
   acceptance harness: 23/23 green, later 31/31 for added acceptance checks
   ```

3. Source-rich GLM 5.2 run

   Coco-Wiki was wired to include source snippets as deterministic evidence, and a GLM 5.2 source-rich generation pass was attempted. It published 24/24 pages with verification passing, and a clean rerun avoided model calls.

   Result: better than symbol-only pages, but still too shallow compared with CodeWiki. The problem was not only prompt wording or source snippets: the writer still lacked a CodeWiki-like module research context and a human-oriented module guide structure.

4. ResearchQuestionArtifact / ModuleNotebook executable v1

   `cw-research` was added as an explicit semantic research layer:

   ```text
   ModuleSpec
   -> ComponentDossier
   -> Completeness Gate
   -> ResearchQuestionArtifact
   -> ModuleNotebook
   -> PageSectionArtifact
   -> Markdown
   -> HTML viewer
   ```

   Current generated Tokio artifact counts:

   ```text
   pages: 24
   component dossiers: 24
   research artifacts: 120
   notebooks: 24
   section artifact sets: 24
   frozen citations: 390
   recorded query footprints: 3165
   known gaps: 0
   ```

5. Strong-model writer pass

   All 24 pages were rewritten with local `claude -p --model sonnet` from frozen deterministic evidence. This improved the prose but still did not match the quality of CodeWiki's scheduler page.

6. CodeWiki prompt/source study

   CodeWiki was cloned to `/tmp/CodeWiki-src`, and the relevant files were read:

   ```text
   /tmp/CodeWiki-src/codewiki/src/be/prompt_template.py
   /tmp/CodeWiki-src/codewiki/src/be/documentation_generator.py
   /tmp/CodeWiki-src/codewiki/src/be/backend.py
   /tmp/CodeWiki-src/codewiki/src/be/caw_backend.py
   /tmp/CodeWiki-src/codewiki/src/be/agent_tools/read_code_components.py
   /tmp/CodeWiki-src/codewiki/src/be/agent_tools/generate_sub_module_documentations.py
   ```

   The key finding: CodeWiki's quality does not come from a magic prompt. It comes from the combination of a module tree, full source grouped by file, agentic sub-module docs, and parent overview synthesis.

7. CodeWiki-learned scheduler prototype

   A new prompt contract was written:

   ```text
   deliverables/tokio-research-wiki/codewiki-learned-writer-prompt.md
   ```

   A reusable generator script was added:

   ```text
   scripts/generate_codewiki_learned_scheduler.py
   ```

   The generated page was applied to:

   ```text
   deliverables/tokio-research-wiki/pages-codewiki-learned/tokio_runtime_scheduler_multi_thread.md
   deliverables/tokio-research-wiki/pages/tokio_runtime_scheduler_multi_thread.md
   /tmp/tokio-wiki-compare-site/coco/pages/tokio_runtime_scheduler_multi_thread.md
   ```

8. Mermaid 11.9.0 fix

   The first CodeWiki-learned page had Mermaid syntax errors. The fixed problems were:

   ```text
   subgraph id with hyphen: Per-Worker
   subgraph label with brackets: Remote[i]
   state labels with :: such as Launch::launch()
   sequence message text with semicolons
   ```

   All 4 current Mermaid blocks parse under local Mermaid 11.9.0.

## How CodeWiki Works

CodeWiki's main pipeline is in:

```text
/tmp/CodeWiki-src/codewiki/src/be/documentation_generator.py
```

The flow is:

```text
DependencyGraphBuilder
-> components + leaf nodes
-> LLM clustering into module_tree
-> leaf modules first
-> parent modules after children
-> repository overview after module docs
-> metadata.json / module_tree.json / overview.md / module pages
```
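The leaf-first ordering in the flow above amounts to a post-order walk of the module tree. A minimal sketch, assuming a hypothetical nested-dict tree in which each node carries a `children` mapping; the real `module_tree.json` schema and CodeWiki's own traversal code may differ:

```python
from typing import Iterator


def generation_order(tree: dict, prefix: tuple = ()) -> Iterator[tuple]:
    """Yield module paths so every child is produced before its parent."""
    for name, node in tree.items():
        path = prefix + (name,)
        # Recurse first: leaf modules are documented before their parents.
        yield from generation_order(node.get("children", {}), path)
        yield path


# Hypothetical tree shape; the module names are illustrative only.
example_tree = {
    "runtime": {
        "children": {
            "scheduler": {"children": {"multi_thread": {"children": {}}}},
            "io_driver": {"children": {}},
        }
    }
}

order = ["/".join(p) for p in generation_order(example_tree)]
print(order)
```

The repository overview would then be generated once this walk has finished, since it depends on all module docs.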

Important CodeWiki prompt file:

```text
/tmp/CodeWiki-src/codewiki/src/be/prompt_template.py
```

Key prompt pieces:

- `SYSTEM_PROMPT`: comprehensive system documentation, architecture, component relationships, sub-module docs, Mermaid diagrams, and tool workflow.
- `LEAF_SYSTEM_PROMPT`: similar but without recursive sub-module generation.
- `USER_PROMPT`: gives `<MODULE_TREE>` and `<CORE_COMPONENT_CODES>`.
- `CLUSTER_REPO_PROMPT` / `CLUSTER_MODULE_PROMPT`: ask the model to group component ids into semantic modules.
- `format_user_prompt`: groups core components by file and injects full file content.

The most important quality detail is this CodeWiki behavior:

```text
for each core component id:
  group by component.relative_path

for each file:
  write:
    # File: path
    ## Core Components in this file
    - component ids
    ## File Content
    full file text
```

This means CodeWiki writers are often reading the actual implementation, not only extracted symbols or snippets.
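The grouping behavior described above can be sketched as follows. This is an illustration rather than CodeWiki's actual `format_user_prompt`: the function name `format_core_components`, the input shapes, and the sample component ids are all assumptions made for the example.

```python
from collections import defaultdict


def format_core_components(components: dict, file_texts: dict) -> str:
    """Group component ids by file and emit one prompt section per file.

    components: component id -> relative path (hypothetical shape)
    file_texts: relative path -> full file text (hypothetical shape)
    """
    by_file = defaultdict(list)
    for component_id, relative_path in components.items():
        by_file[relative_path].append(component_id)

    sections = []
    for path, ids in by_file.items():
        lines = [f"# File: {path}", "## Core Components in this file"]
        lines += [f"- {cid}" for cid in ids]
        # The full file text is injected, not just the component bodies.
        lines += ["## File Content", file_texts[path]]
        sections.append("\n".join(lines))
    return "\n\n".join(sections)


# Illustrative data only; these ids and file names are made up.
components = {
    "worker.Core": "worker.rs",
    "worker.Context": "worker.rs",
    "park.Parker": "park.rs",
}
file_texts = {
    "worker.rs": "struct Core;\nstruct Context;",
    "park.rs": "struct Parker;",
}
prompt = format_core_components(components, file_texts)
print(prompt)
```

Two components from the same file produce a single section, so the writer sees each file once with all of its relevant components listed above the full text.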

Important CodeWiki tools:

```text
/tmp/CodeWiki-src/codewiki/src/be/agent_tools/read_code_components.py
/tmp/CodeWiki-src/codewiki/src/be/agent_tools/generate_sub_module_documentations.py
/tmp/CodeWiki-src/codewiki/src/be/agent_tools/str_replace_editor.py
```

What they do:

- `read_code_components`: lets the agent fetch source for additional component ids.
- `generate_sub_module_documentation`: recursively runs sub-agents for complex modules.
- `str_replace_editor`: writes docs through CodeWiki's controlled editor path.

CodeWiki backend abstraction:

```text
/tmp/CodeWiki-src/codewiki/src/be/backend.py
```

It supports:

- API mode through pydantic-ai/litellm.
- Subscription CLI mode through `caw`, using local `claude` or `codex`.

CodeWiki CLI installation and usage examples, taken from its README:

```bash
cd /tmp/CodeWiki-src
python3.12 -m venv .venv
source .venv/bin/activate
pip install -e .

# Claude subscription mode, after `claude login`
codewiki config set \
  --provider claude-code \
  --main-model sonnet \
  --cluster-model sonnet

codewiki generate --output /tmp/codewiki-tokio-docs --verbose
```

The hosted CodeWiki Tokio snapshot is partial:

```text
/tmp/tokio-wiki-compare-site/assets/codewiki-pages.json
/tmp/tokio-wiki-compare-site/assets/codewiki-module-tree.json
```

Snapshot stats:

```text
generated pages: 17
known complete: false
stopped at: runtime_scheduler_multi_thread / worker_core
```

## How Coco-Wiki Works In This Project

Core design code:

```text
crates/cw-research/src/lib.rs
crates/cw-research/src/tokio_wiki.rs
crates/cw-fastcontext/src/lib.rs
crates/cw-cli/src/commands/research.rs
```

Deterministic Tokio wiki generator:

```text
crates/cw-research/src/tokio_wiki.rs
```

It defines 24 `ModuleSpec`s for Tokio and materializes:

```text
ComponentDossier
ResearchQuestionArtifact
ModuleNotebook
PageSectionArtifact
Markdown
single-file HTML viewer
manifest
```

This generator does not call a model. It is reproducible and uses write-if-changed semantics so clean reruns skip unchanged outputs.
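Write-if-changed can be sketched in a few lines. The real generator is Rust code in `crates/cw-research/src/tokio_wiki.rs` and may implement this differently; the Python below only illustrates the idea, and `write_if_changed` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write content only when it differs from what is already on disk.

    Returns True if the file was written, False if it was left untouched.
    """
    data = content.encode("utf-8")
    if path.exists() and path.read_bytes() == data:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "pages" / "example.md"
    first = write_if_changed(target, "# Example\n")      # new file
    second = write_if_changed(target, "# Example\n")     # identical rerun
    third = write_if_changed(target, "# Example v2\n")   # content changed
print(first, second, third)
```

Skipping identical writes keeps file modification times stable, which is what lets a clean rerun report unchanged outputs.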

FastContext boundary:

```text
crates/cw-fastcontext/src/lib.rs
```

It implements an OpenAI-compatible client for `microsoft/FastContext-1.0-4B-SFT` and read-only repository tools:

```text
READ
GLOB
GREP
```

It records footprints:

```text
Read(path, revision_fp, start_line, end_line)
Glob(pattern, path_set_fp, path_count)
Grep(pattern, scope, match_set_fp, match_count)
SymbolLookup(query, result_set_fp, result_count)
GraphLookup(seed, relation_filter, adjacency_fp, edge_count)
SemanticFrontier(query, candidate_set_fp, selected_k, frontier_m, cutoff)
```

FastContext is deliberately a research scout, not a live writer tool. Its output should be frozen as `ResearchQuestionArtifact` before any page writer sees it.

## How To Generate The Deterministic Coco Tokio Wiki

From the repo root:

```bash
cd /Users/zhhanz/Documents/rust/src/coco-wiki

cargo run -p cw-cli --quiet -- research tokio-wiki \
  --repo ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.3 \
  --output deliverables/tokio-research-wiki
```

Expected output is JSON with fields like:

```text
page_count: 24
component_dossiers: 24
research_artifacts: 120
notebooks: 24
section_artifacts: 24
html_index: deliverables/tokio-research-wiki/index.html
manifest_path: deliverables/tokio-research-wiki/manifest.json
```

A clean rerun should report mostly unchanged output. If it rewrites everything, check whether volatile fields are being serialized into the manifest or generated pages.

Open locally:

```bash
open deliverables/tokio-research-wiki/index.html
```

## How To Validate A Research Artifact

Example:

```bash
cargo run -p cw-cli --quiet -- research validate-footprints \
  --repo ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.3 \
  --json \
  deliverables/tokio-research-wiki/artifacts/research/tokio_runtime_scheduler_multi_thread_control_flow.json
```

Green means the recorded read/glob/grep/index-style footprints still match the current source tree.
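As a rough illustration of what a green check involves for a `Read` footprint: record a fingerprint when the file is read, then recompute it at validation time. The sketch assumes `revision_fp` is a SHA-256 of the file bytes, which is a stand-in; the actual fingerprint scheme used by `cw-fastcontext` is not described in this handoff, and `record_read` and `is_green` are hypothetical helper names.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ReadFootprint:
    path: str
    revision_fp: str
    start_line: int
    end_line: int


def fingerprint(path: Path) -> str:
    # Assumption: SHA-256 of the file bytes stands in for the real scheme.
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_read(repo: Path, rel: str, start: int, end: int) -> ReadFootprint:
    return ReadFootprint(rel, fingerprint(repo / rel), start, end)


def is_green(repo: Path, fp: ReadFootprint) -> bool:
    """Green when the file still hashes to the recorded fingerprint."""
    target = repo / fp.path
    return target.exists() and fingerprint(target) == fp.revision_fp


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "worker.rs").write_text("fn run() {}\n")
    footprint = record_read(repo, "worker.rs", 1, 1)
    before = is_green(repo, footprint)
    # Any edit to the source invalidates the recorded footprint.
    (repo / "worker.rs").write_text("fn run() { park(); }\n")
    after = is_green(repo, footprint)
print(before, after)
```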

## How To Run FastContext From Rust

Serve FastContext with SGLang or vLLM. The notes assume SGLang:

```bash
python3 -m sglang.launch_server \
  --model-path ~/.cache/huggingface/fastcontext/FastContext-1.0-4B-SFT \
  --tool-call-parser qwen \
  --context-length 262144 \
  --trust-remote-code \
  --dtype bfloat16 \
  --host 127.0.0.1 \
  --port 30000 \
  --tp-size 1 \
  --mem-fraction-static 0.8
```

Run a live scout query:

```bash
cargo run -p cw-cli -- fastcontext explore \
  --repo ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.3 \
  "Find where the multi-thread scheduler parks and unparks workers."
```

Freeze a research artifact:

```bash
cargo run -p cw-cli -- research explore \
  --repo ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.52.3 \
  --module runtime::scheduler::multi_thread \
  --question-id worker_lifecycle \
  --output /tmp/worker_lifecycle.json \
  "How does the multi-thread worker lifecycle run, park, unpark, and shut down?"
```

Aggregate artifacts into a notebook:

```bash
cargo run -p cw-cli -- research notebook \
  --module runtime::scheduler::multi_thread \
  --output /tmp/runtime_scheduler_multi_thread_notebook.json \
  /tmp/worker_lifecycle.json
```

## How To Generate The CodeWiki-Learned Scheduler Page

The reusable script is:

```text
scripts/generate_codewiki_learned_scheduler.py
```

Prompt-only dry run:

```bash
scripts/generate_codewiki_learned_scheduler.py \
  --no-call \
  --prompt-out /tmp/scheduler_prompt.md
```

Generate with local Claude CLI:

```bash
scripts/generate_codewiki_learned_scheduler.py \
  --model sonnet \
  --max-budget-usd 2.50
```

The script sends the prompt through stdin to avoid OS argument length limits. It writes:

```text
deliverables/tokio-research-wiki/pages-codewiki-learned/tokio_runtime_scheduler_multi_thread.md
```
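The stdin hand-off mentioned above can be sketched with Python's `subprocess.run` and its `input=` parameter. The child command here is a stand-in Python one-liner, not the script's actual `claude` invocation, and `run_with_stdin_prompt` is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_stdin_prompt(command: list[str], prompt: str) -> str:
    """Pipe the prompt to the child's stdin instead of passing it in argv."""
    result = subprocess.run(
        command,
        input=prompt,
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits non-zero
    )
    return result.stdout


# Stand-in child process that reports how many characters it received.
echo_length = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
big_prompt = "x" * 5_000_000  # larger than common argv size limits
reported = run_with_stdin_prompt(echo_length, big_prompt).strip()
print(reported)
```

A full-source prompt can easily run to several megabytes, so stdin sidesteps the per-process argument size limit that a command-line argument would hit.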

Apply that generated page to the local viewer and hosted-site copy:

```bash
scripts/apply_tokio_page_to_viewer.py \
  --markdown deliverables/tokio-research-wiki/pages-codewiki-learned/tokio_runtime_scheduler_multi_thread.md
```

This updates:

```text
deliverables/tokio-research-wiki/pages/tokio_runtime_scheduler_multi_thread.md
deliverables/tokio-research-wiki/index.html
/tmp/tokio-wiki-compare-site/coco/pages/tokio_runtime_scheduler_multi_thread.md
/tmp/tokio-wiki-compare-site/coco/pages-codewiki-learned/tokio_runtime_scheduler_multi_thread.md
/tmp/tokio-wiki-compare-site/coco/index.html
```

## How To Validate Mermaid 11.9.0

The current scheduler page had to be made Mermaid-safe. Local validation used Mermaid 11.9.0 in `/tmp/tokio-wiki-compare-site/node_modules`.

Install dependencies if needed:

```bash
cd /tmp/tokio-wiki-compare-site
npm install -D mermaid@11.9.0 jsdom dompurify
```

The temporary validation script used during this run is:

```text
/tmp/parse_mermaid_dom.mjs
```

Run:

```bash
node /tmp/parse_mermaid_dom.mjs \
  /Users/zhhanz/Documents/rust/src/coco-wiki/deliverables/tokio-research-wiki/pages/tokio_runtime_scheduler_multi_thread.md
```

Expected:

```text
block 1: ok
block 2: ok
block 3: ok
block 4: ok
```

Mermaid 11.9.0 guardrails for future generated pages:

```text
avoid hyphens in subgraph ids
avoid [brackets] inside subgraph labels
avoid :: in state transition labels
avoid semicolons in sequence message text
avoid raw HTML-ish generic text such as <Core> in diagram labels
```
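These guardrails can be approximated with a cheap pre-check before running the real parser. The sketch below is heuristic only: the regular expressions are assumptions about how each problem typically appears, they may over- or under-flag, and they do not replace parsing with Mermaid 11.9.0.

```python
import re

FENCE = "`" * 3  # built indirectly so the example can live inside a fenced block
BLOCK_RE = re.compile(FENCE + r"mermaid\n(.*?)" + FENCE, re.DOTALL)

SUBGRAPH_ID = re.compile(r'\s*subgraph\s+([^\s\["]+)')
STATE_LABEL_COLONS = re.compile(r"-->[^:]*:.*::")
SEQ_SEMICOLON = re.compile(r"->.*:.*;")
RAW_ANGLE = re.compile(r"<(?!br\b)[A-Za-z]\w*>")  # allows <br> and <br/>


def lint_block(block: str) -> list[str]:
    """Return heuristic guardrail warnings for one Mermaid block."""
    lines = [ln for ln in block.splitlines() if ln.strip()]
    kind = lines[0].strip() if lines else ""  # e.g. flowchart, stateDiagram-v2
    warnings = []
    for ln in lines:
        text = ln.strip()
        sub = SUBGRAPH_ID.match(ln)
        if sub and "-" in sub.group(1):
            warnings.append(f"hyphen in subgraph id: {text}")
        if sub and ln.count("[") > 1:
            warnings.append(f"nested brackets in subgraph label: {text}")
        if kind.startswith("stateDiagram") and STATE_LABEL_COLONS.search(ln):
            warnings.append(f":: in state transition label: {text}")
        if kind.startswith("sequenceDiagram") and SEQ_SEMICOLON.search(ln):
            warnings.append(f"semicolon in sequence message: {text}")
        if RAW_ANGLE.search(ln):
            warnings.append(f"raw <Tag>-like text: {text}")
    return warnings


sample = "\n".join([
    "# Page",
    FENCE + "mermaid",
    "flowchart TD",
    "  subgraph Per-Worker",
    "    A --> B",
    "  end",
    FENCE,
    FENCE + "mermaid",
    "stateDiagram-v2",
    "  Idle --> Running: Launch::launch()",
    FENCE,
    FENCE + "mermaid",
    "sequenceDiagram",
    "  Worker->>Queue: pop task",
    FENCE,
])

results = [lint_block(block) for block in BLOCK_RE.findall(sample)]
counts = [len(r) for r in results]
print(counts)
```

In the sample, the first two blocks each trip one guardrail and the third is clean.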

## How To Deploy The Comparison Site

Wrangler is installed locally in:

```text
/tmp/tokio-wiki-compare-site/node_modules/.bin/wrangler
```

The project cache identifies:

```text
project_name: tokio-wiki-compare
```

Recreate a clean deploy directory. This avoids uploading `node_modules`, `package.json`, `.wrangler`, or lockfiles:

```bash
rm -rf /tmp/tokio-wiki-compare-deploy
mkdir -p /tmp/tokio-wiki-compare-deploy
rsync -a --delete \
  --exclude node_modules \
  --exclude package.json \
  --exclude package-lock.json \
  --exclude .wrangler \
  /tmp/tokio-wiki-compare-site/ \
  /tmp/tokio-wiki-compare-deploy/
```

Deploy:

```bash
cd /tmp/tokio-wiki-compare-site
./node_modules/.bin/wrangler pages deploy \
  /tmp/tokio-wiki-compare-deploy \
  --project-name tokio-wiki-compare
```

Last successful deploy preview:

```text
https://5434405a.tokio-wiki-compare.pages.dev
```

Verify production content:

```bash
python3 - <<'PY'
import time, urllib.request
urls = [
  f"https://tokio-wiki-compare.pages.dev/coco/pages/tokio_runtime_scheduler_multi_thread.md?cb={int(time.time())}",
  f"https://tokio-wiki-compare.pages.dev/coco/?cb={int(time.time())}",
  f"https://tokio-wiki-compare.pages.dev/handoff/?cb={int(time.time())}",
]
for url in urls:
    text = urllib.request.urlopen(urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"}), timeout=30).read().decode()
    print(url, len(text))
    for phrase in [
        "Task Source Decision Table",
        "Worker Lifecycle and Core Hand-off",
        "CodeWiki-learned scheduler prototype",
        "Coco-Wiki / CodeWiki Tokio Handoff",
    ]:
        if phrase in text:
            print("  found", phrase)
PY
```

## How To Run The Normal Coco-Wiki CLI Flow

For the normal repo flow, as opposed to the special Tokio research generator:

```bash
cargo run -p cw-cli -- init --force
cargo run -p cw-cli -- index run
cargo run -p cw-cli -- plan run
cargo run -p cw-cli -- build run
cargo run -p cw-cli -- ask "how does Scheduler work?"
cargo run -p cw-cli -- verify run
cargo run -p cw-cli -- explain-dirty run tokio_runtime_scheduler_multi_thread
```

Runtime options:

```bash
cargo run -p cw-cli -- build run --runtime native
cargo run -p cw-cli -- build run --runtime cocoindex
```

The CocoIndex runtime adapter examples live under:

```text
examples/cocoindex_runtime_demo/
examples/cocoindex_cw_runtime_adapter/
```

Example scripts:

```bash
examples/cocoindex_runtime_demo/run.sh
examples/cocoindex_cw_runtime_adapter/run.sh
examples/cocoindex_cw_runtime_adapter/cocoindex_e2e_incremental.sh
```

## Current Quality Assessment

The latest scheduler page is materially closer to CodeWiki because it is now shaped as a human module guide:

```text
introduction
how it fits
architecture overview
key design principles
sub-module documentation
task scheduling flow
worker lifecycle
shutdown procedure
state/concurrency invariants
performance notes
evidence map
incrementality note
```

The prior Coco page was accurate but read like an artifact/evidence report. CodeWiki's better page quality came from giving the model full source context and a documentation task framed around maintainers.

Remaining gaps:

- Only one Coco page has been regenerated with the CodeWiki-learned contract.
- CodeWiki still has a stronger recursive module-doc workflow.
- Coco still needs this contract generalized across all 24 Tokio pages and then moved into the production buildgraph, instead of remaining a special scheduler script.

Recommended next step:

```text
generalize scripts/generate_codewiki_learned_scheduler.py
into a page-id driven generator:

  page_id
  -> manifest entry
  -> source-file pack from dossier/source spans
  -> CodeWiki-learned prompt
  -> writer
  -> Mermaid validation
  -> apply_tokio_page_to_viewer.py
```

After that, batch regenerate the 5 golden pages first:

```text
tokio_runtime
tokio_runtime_scheduler
tokio_runtime_scheduler_multi_thread
tokio_runtime_io_driver
tokio_sync_mpsc
```

Then regenerate all 24 pages only if the golden-page quality holds.
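One possible shape for the page-id driven generator is a small pipeline object whose stages are injected, so each stage can be swapped or stubbed. This is a sketch under assumptions: `PagePipeline` and every stage callable below are hypothetical placeholders, not existing Coco-Wiki code, and the stub stages make no model call.

```python
from dataclasses import dataclass
from typing import Callable

GOLDEN_PAGES = [
    "tokio_runtime",
    "tokio_runtime_scheduler",
    "tokio_runtime_scheduler_multi_thread",
    "tokio_runtime_io_driver",
    "tokio_sync_mpsc",
]


@dataclass
class PagePipeline:
    """Hypothetical stage wiring; every callable here is a placeholder."""
    load_manifest_entry: Callable[[str], dict]
    build_source_pack: Callable[[dict], str]
    build_prompt: Callable[[dict, str], str]
    write_page: Callable[[str], str]
    validate_mermaid: Callable[[str], list]
    apply_to_viewer: Callable[[str, str], None]

    def run(self, page_id: str) -> str:
        entry = self.load_manifest_entry(page_id)
        pack = self.build_source_pack(entry)
        prompt = self.build_prompt(entry, pack)
        markdown = self.write_page(prompt)
        problems = self.validate_mermaid(markdown)
        if problems:
            # Stop before publishing a page whose diagrams do not parse.
            raise ValueError(f"{page_id}: Mermaid validation failed: {problems}")
        self.apply_to_viewer(page_id, markdown)
        return markdown


# Stub stages so the control flow can be exercised without a model call.
applied: dict[str, str] = {}
pipeline = PagePipeline(
    load_manifest_entry=lambda page_id: {"page_id": page_id},
    build_source_pack=lambda entry: f"# File: {entry['page_id']}.rs",
    build_prompt=lambda entry, pack: f"Document {entry['page_id']}\n{pack}",
    write_page=lambda prompt: f"# Draft\n\n{prompt}",
    validate_mermaid=lambda markdown: [],
    apply_to_viewer=lambda page_id, markdown: applied.__setitem__(page_id, markdown),
)

for page_id in GOLDEN_PAGES:
    pipeline.run(page_id)
print(sorted(applied))
```

Placing Mermaid validation before the apply step means a page with a broken diagram never reaches the viewer or the hosted-site copy.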

## Files Added In This Handoff

```text
scripts/generate_codewiki_learned_scheduler.py
scripts/apply_tokio_page_to_viewer.py
notes/18-full-project-handoff.md
```

## Security / Secrets Note

Do not put provider keys into this repo, the handoff page, command history snippets, or generated docs. Use local CLI auth (`claude login`, `codex login`) or environment variables outside committed files. The hosted handoff intentionally describes provider wiring without including any secret values.
