Application Model
claude-code-log reads Claude Code transcript files (JSONL on disk) and
produces readable HTML, Markdown, and structured JSON views, with
optional caching, a TUI for navigation, and per-project aggregate
pages.
This document is the entry point for dev-docs/: a high-level view of
the parts, what each does, and where to read about them in detail. For
end-user documentation see the project README.md;
for contributor onboarding see CONTRIBUTING.md;
for user-facing operations docs see docs/.
1. Subsystems at a glance
| Subsystem | Owner module(s) | Deep-dive |
|---|---|---|
| CLI | cli.py | inlined below (§ 2.1) |
| TUI | tui.py | inlined below (§ 2.2) |
| Cache (SQLite) | cache.py + migrations/ | inlined below (§ 2.3); user-facing in docs/restoring-archived-sessions.md |
| Migrations | migrations/ + migrations/runner.py | inlined below (§ 2.4) |
| Parsing | parser.py, factories/ | rendering-architecture.md § 3 |
| Message taxonomy | models.py | messages.md |
| DAG (sessions, forks, agents) | dag.py | dag.md |
| Sync sub-agents (#79) | converter.py, factories/agent_metadata_factory.py | agents.md § 1 |
| Async task agents (#90) | converter.py, factories/task_notification_factory.py | agents.md § 2 |
| Teammates (#91) | renderer.py, factories/teammate_factory.py, html/teammate_formatter.py | teammates.md |
| Rendering pipeline | renderer.py, html/, markdown/, json/ | rendering-architecture.md |
| Fold-bar / message hierarchy | html/templates/components/, JS in transcript.html | message-hierarchy.md |
| CSS class taxonomy | html/templates/components/*.css | css-classes.md |
| JSON export (#36) | json/ | inlined below (§ 2.5) |
| Detail-level filter | renderer.py § Detail-level filtering, models.DetailLevel | inlined below (§ 2.6) |
| Image export | image_export.py | inlined below (§ 2.7) |
| Performance profiling | renderer_timings.py | inlined below (§ 2.8) |
| Diagnosing hangs (SIGUSR1) | cli.py _install_stack_dump_signal | inlined below (§ 2.9) |
| Adding a new tool renderer | factories/tool_factory.py, html/tool_formatters.py | implementing-a-tool-renderer.md (how-to) |
| Plugin system (third-party message transformers) | plugins.py, factories/priorities.py, Renderer._dispatch_format | plugins.md |

A note on cross-cutting concerns: some behaviour spans several rows
of the table above and isn't owned by any single subsystem. Label
and preview composition (session header titles, branch labels,
fork-point box captions) is the most common one — it touches the
DAG layer (which decides what's a branch), the renderer's session
machinery (which assembles the label text), and the parsing layer
(which feeds the preview source). See the SessionHeaderMessage
entry in § 4 for the function-level surface.
2. Subsystems without their own deep-dive
The subsystems above with "inlined below" pointers don't have a dedicated dev-doc — the paragraph here is the canonical reference.
2.1 CLI
cli.py is the command-line entry point
(claude-code-log) built on Click. The default invocation processes
the entire ~/.claude/projects/ hierarchy; explicit paths target a
single transcript or directory. Major flags:
- --tui: launch the interactive TUI (§ 2.2).
- --detail {full,high,low,minimal,user-only}: drop content from the rendered output (§ 2.6).
- --from-date "yesterday", --to-date "today": natural-language date filtering via dateparser.
- --open-browser: open the generated index.html after rendering.
- --no-cache / --update-cache: bypass or force-refresh the SQLite cache (§ 2.3).
- --format {html,md,markdown,json}: switch output format (HTML is the default; Markdown is mainly used for sharing transcripts inline; JSON exports the processed tree for downstream tooling; see § 2.5).
- --compact: Markdown-only; suppresses repeated headings.
- --page-size N: paginate the combined-transcript HTML/Markdown output, packing whole sessions into pages of up to N messages each (sessions are never split across pages, so individual pages may overflow). Per-session HTML files are not paginated.

CLI orchestration delegates to converter.py (which owns the
high-level "load + render + write" flow) and never touches renderer.py
directly. Output paths follow a stable convention so the cache and
re-renders can find existing files: combined_transcripts.html,
session-{id}.html, index.html, with --detail and --compact
adding suffixes per utils.variant_suffix.
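The --page-size packing rule can be illustrated with a minimal sketch. It is not the project's implementation: the function name paginate_sessions, the (session_id, message_count) tuple shape, and the exact rule for when a page closes are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical shape: (session_id, message_count).
Session = Tuple[str, int]


def paginate_sessions(sessions: List[Session], page_size: int) -> List[List[str]]:
    """Greedily pack whole sessions into pages of roughly page_size messages."""
    pages: List[List[str]] = []
    current: List[str] = []
    current_count = 0
    for session_id, message_count in sessions:
        # Assumed rule: close the page when this session would push a
        # non-empty page past the limit.
        if current and current_count + message_count > page_size:
            pages.append(current)
            current, current_count = [], 0
        # Sessions are never split, so one large session may overflow its page.
        current.append(session_id)
        current_count += message_count
    if current:
        pages.append(current)
    return pages


demo_pages = paginate_sessions(
    [("a", 40), ("b", 50), ("c", 250), ("d", 10)], page_size=100
)
print(demo_pages)  # [['a', 'b'], ['c'], ['d']]
```

Session c alone exceeds the limit, so it gets a page of its own that overflows instead of being split.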
2.2 TUI
tui.py is a Textual application that
browses the projects index, drills into individual sessions, and
exposes quick actions: render session to HTML, resume a session via
claude --resume, archive a session (move to cache-only), and so on.
Architecture is straightforward Textual: a few Screen subclasses,
a DataTable for the session list, key bindings dispatched through
Textual's BINDINGS mechanism. The TUI reads through cache.py
exclusively (never re-parses JSONL itself) — opening a 50-project
hierarchy takes milliseconds because cache hydration is incremental.
The "archive" action is interesting: it moves a session's source JSONL
out of ~/.claude/projects/ while keeping the cache row intact. The
session then renders from cache only. See
docs/restoring-archived-sessions.md
for the user-facing behaviour and recovery flow.
2.3 Cache (SQLite)
cache.py maintains a SQLite database
at ~/.claude/projects/claude-code-log-cache.db (or
$CLAUDE_CODE_LOG_CACHE_PATH). Stored data:
- Per-session: id, summary, first/last timestamps, message count, per-role token totals, team_name (added in migration 005).
- Per-message: a denormalised view used by archived-session restoration (the cache holds enough to re-render even after the source JSONL is deleted).
- Per-rendered-HTML: the HTML output itself, indexed by source file mtime + detail-level + compact flag (migrations 002–004), so re-runs with unchanged inputs serve the cached HTML directly.

Invalidation is mtime-based: when a JSONL's mtime is newer than its cache row, the session is reparsed. The schema-version row also invalidates the entire HTML cache when migrations bump the version, since rendered output may have changed even when source data hasn't.
For the operations / recovery side (archived sessions, manual
deletion, cleanupPeriodDays), see
docs/restoring-archived-sessions.md.
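The mtime-based invalidation rule can be sketched against an in-memory SQLite database. The table name cached_files, its columns, and the helper needs_reparse are hypothetical and do not reflect the real schema in cache.py.

```python
import sqlite3


def needs_reparse(conn: sqlite3.Connection, path: str, current_mtime: float) -> bool:
    """Return True when the file is unknown to the cache or newer than its cached row."""
    row = conn.execute(
        "SELECT source_mtime FROM cached_files WHERE path = ?", (path,)
    ).fetchone()
    return row is None or current_mtime > row[0]


# In-memory database with a hypothetical single-table schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cached_files (path TEXT PRIMARY KEY, source_mtime REAL)")
conn.execute("INSERT INTO cached_files VALUES (?, ?)", ("session-1.jsonl", 1000.0))

unchanged = needs_reparse(conn, "session-1.jsonl", 1000.0)  # same mtime as cached
modified = needs_reparse(conn, "session-1.jsonl", 1500.0)   # file touched since caching
unknown = needs_reparse(conn, "session-2.jsonl", 900.0)     # never cached
print(unchanged, modified, unknown)  # False True True
```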
2.4 Migrations
claude_code_log/migrations/ is a
small migration system. Each migration is a NNN_description.sql file
applied in numeric order by migrations/runner.py. The schema-version
table tracks which migrations have run; cache.py invokes the runner
on every connection open, so a fresh checkout running against an old
cache DB transparently upgrades.
Current migrations:
- 001_initial_schema.sql: sessions table + per-message metadata.
- 002_html_cache.sql: adds the rendered-HTML cache layer.
- 003_html_pagination.sql / 004_html_pagination_variant.sql: per-page HTML chunks for --page-size.
- 005_session_team_name.sql: adds team_name to sessions for the teammates feature (PR #125).

Migrations that recreate tables toggle PRAGMA foreign_keys = OFF/ON
around the rebuild so that cascade deletes cannot remove rows during
the swap.
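A numbered-file migration runner of this kind can be sketched as follows. It is a minimal illustration, not migrations/runner.py itself: the function apply_migrations, the schema_version column layout, and the 002_add_team_name.sql demo file are assumptions.

```python
import re
import sqlite3
import tempfile
from pathlib import Path
from typing import List


def apply_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> List[int]:
    """Apply pending NNN_description.sql files in numeric order; return versions applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_version")}

    # Collect files whose three-digit prefix has not been recorded yet.
    pending = []
    for path in migrations_dir.glob("*.sql"):
        match = re.match(r"(\d{3})_", path.name)
        if match and int(match.group(1)) not in applied:
            pending.append((int(match.group(1)), path))

    newly_applied = []
    for version, path in sorted(pending):
        conn.executescript(path.read_text())
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied


# Demo with two throwaway migration files in a temporary directory.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "001_initial_schema.sql").write_text(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY);"
)
(demo_dir / "002_add_team_name.sql").write_text(
    "ALTER TABLE sessions ADD COLUMN team_name TEXT;"
)

conn = sqlite3.connect(":memory:")
first_run = apply_migrations(conn, demo_dir)   # applies both files
second_run = apply_migrations(conn, demo_dir)  # nothing left to apply
print(first_run, second_run)  # [1, 2] []
```

Running the function on every connection open is cheap once the database is current, because the second call finds no pending files.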
2.5 JSON export
claude_code_log/json/ is a thin renderer
that mirrors HtmlRenderer / MarkdownRenderer: same
generate(...) / generate_session(...) / generate_projects_index(...)
surface, same --detail and --compact honoring. Output is a
structured JSON document — top-level version / title / detail /
compact / sessions / messages keys; each node carries
index / type / title / timestamp / session_id / content,
plus optional parent_uuid / agent_id / pair_first etc. when
present. Children are nested directly under their parent's
children array — it's the same tree the HTML/Markdown renderers
walk, serialized verbatim.
The renderer runs entries through generate_template_messages (the
same format-neutral pipeline § 3 describes), so JSON output inherits
all post-factory polishing for free: slash-command normalisation
(bare <command-name>X</command-name> → /X), command-args
hardening, teammate session-color enrichment, etc. There is no
JSON-specific cleanup pass. The rule of thumb: if it renders
correctly in HTML/Markdown, it renders correctly in JSON. This is the
operative example of the factory-layer normalisation seam: raw
TranscriptEntry data is polished once at factory time into the
typed MessageContent models that all three renderers share, so
display polish lives in one place rather than being re-implemented
per output format.
A few JSON-specific touches:
- _json_default unwraps Pydantic models embedded in MessageContent dataclasses (tool inputs/outputs are Pydantic; dataclasses.asdict doesn't recurse into them, so without this hook they'd stringify via __repr__ and lose structure). Also handles Enum and Path.
- is_outdated(file_path) reads the version field from existing JSON output and compares against the current library version: the same invalidation contract as the HTML cache, so re-runs skip unchanged outputs.
- combined_transcripts.json per project; session-{id}.json for individual sessions. The naming respects variant_suffix for detail/compact variants.

The projects-index JSON (all-projects-summary.json) is a parallel
top-level file — same shape as HTML's index.html but consumable by
external tools (dashboards, query scripts, jq pipelines).
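The default-hook technique used for serialisation can be sketched with the standard library alone. This is an illustration, not the project's _json_default: FakeToolInput is a stand-in that only mimics the model_dump() method of a Pydantic model, and Detail, FakeMessage, and json_default are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from pathlib import PurePath, PurePosixPath


class Detail(Enum):
    LOW = "low"


class FakeToolInput:
    """Stand-in for a Pydantic model: only model_dump() matters here."""

    def __init__(self, command: str) -> None:
        self.command = command

    def model_dump(self) -> dict:
        return {"command": self.command}


@dataclass
class FakeMessage:
    title: str
    tool_input: FakeToolInput
    detail: Detail
    source: PurePosixPath


def json_default(obj):
    """Fallback passed to json.dumps for objects it cannot encode natively."""
    if hasattr(obj, "model_dump"):  # duck-typed check for Pydantic-like models
        return obj.model_dump()
    if isinstance(obj, Enum):
        return obj.value
    if isinstance(obj, PurePath):
        return str(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


message = FakeMessage(
    title="Bash",
    tool_input=FakeToolInput("ls"),
    detail=Detail.LOW,
    source=PurePosixPath("logs/session.jsonl"),
)
# asdict() copies the non-dataclass members as-is; the hook then encodes them.
encoded = json.dumps(asdict(message), default=json_default, sort_keys=True)
print(encoded)
```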
2.6 Detail-level filter
The --detail flag (and models.DetailLevel) lets users dial down
how much of the transcript renders:
- full (default): everything.
- high: detailed but cleaned; drops system/hook noise while keeping the full conversation and tool I/O.
- low: drops most tool I/O, keeps the conversation plus a curated set of "interaction signal" tools (WebSearch, WebFetch, Task, Agent: the ones that show what the agent did, not what it read). See _LOW_KEEP_TOOLS in renderer.py.
- minimal: drops all tool I/O.
- user-only: drops everything except user messages and steering (designed for feeding to downstream agents, e.g. building a requirements doc).

Filtering happens in two passes: a pre-render pass on TranscriptEntry
that strips content items (e.g., tool_use blocks from assistant turns),
and a post-render pass on TemplateMessage that drops whole content
types created by factories (BashInputMessage, BashOutputMessage,
CommandOutputMessage at low/minimal). The two-pass shape exists
because some content is identifiable only after factory dispatch (e.g.,
distinguishing BashInputMessage from the tool_use that produced it).
Important interaction: _pair_skill_tool_uses runs before
_filter_template_by_detail, and each pass that drops messages calls
_reindex_filtered_context to remap surviving indices (the skill-fold
pass remaps after dropping the slash-command body and the redundant
"Launching skill" tool_result; the detail filter remaps after dropping
content types below FULL). The reindex pass also has to update
cached parent-message references on SessionHeaderMessage (see the
PR #131 fix). See rendering-architecture.md § 5
for the full pass order.
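The drop-then-reindex step can be sketched on a simplified message type. Msg and drop_and_reindex are hypothetical stand-ins for TemplateMessage and _reindex_filtered_context, and resetting a dropped parent to None is an assumption about how orphaned references are handled.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Msg:
    index: int
    kind: str
    parent_index: Optional[int] = None


def drop_and_reindex(messages: List[Msg], dropped_kinds: Set[str]) -> List[Msg]:
    """Drop whole message kinds, then renumber survivors and remap parent links."""
    survivors = [m for m in messages if m.kind not in dropped_kinds]
    # Map each surviving old index to its new, gap-free position.
    remap = {m.index: new_index for new_index, m in enumerate(survivors)}
    return [
        Msg(
            index=remap[m.index],
            kind=m.kind,
            # Assumption: a reference to a dropped (or absent) parent becomes None.
            parent_index=remap.get(m.parent_index),
        )
        for m in survivors
    ]


original = [
    Msg(0, "user"),
    Msg(1, "bash_input", parent_index=0),
    Msg(2, "bash_output", parent_index=1),
    Msg(3, "assistant", parent_index=0),
]
filtered = drop_and_reindex(original, {"bash_input", "bash_output"})
print(filtered)
```

After filtering, the assistant message moves from index 3 to index 1 and still points at the user message at index 0, which is why every pass that drops messages has to remap indices before the next pass runs.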
2.7 Image export
image_export.py is
format-agnostic: HTML and Markdown both call into it. Three modes
(matching the --image-export-mode CLI choices):
- placeholder: drop the image and render a placeholder marker in its place.
- embedded: base64-encode the image directly into the output as a data URL.
- referenced: write the image to disk next to the output and embed a src= reference.

Default is embedded for HTML (single self-contained file) and
referenced for Markdown (keeps the .md text small and lets
images live as separate PNGs alongside).
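The three modes can be sketched as one function. This is illustrative only: export_image, the placeholder text, the fixed image/png media type, and the image.png file name are assumptions, not the API of image_export.py.

```python
import base64
import tempfile
from pathlib import Path


def export_image(data: bytes, mode: str, output_dir: Path, name: str = "image.png") -> str:
    """Return the markup for an image under one of the three export modes."""
    if mode == "placeholder":
        return "[image omitted]"  # hypothetical marker text
    if mode == "embedded":
        encoded = base64.b64encode(data).decode("ascii")
        return f'<img src="data:image/png;base64,{encoded}">'
    if mode == "referenced":
        # Write the bytes next to the output and point at them by relative name.
        (output_dir / name).write_bytes(data)
        return f'<img src="{name}">'
    raise ValueError(f"unknown image export mode: {mode}")


demo_dir = Path(tempfile.mkdtemp())
fake_image = b"abc"  # stand-in bytes, not a real PNG

placeholder = export_image(fake_image, "placeholder", demo_dir)
embedded = export_image(fake_image, "embedded", demo_dir)
referenced = export_image(fake_image, "referenced", demo_dir)
print(placeholder, embedded, referenced, sep="\n")
```

The embedded form keeps the output self-contained at the cost of size, while the referenced form keeps the text small and leaves the image as a separate file.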
2.8 Performance profiling
renderer_timings.py
provides log_timing(label, t_start) context managers used throughout
renderer.py. Set CLAUDE_CODE_LOG_DEBUG_TIMING=1 to print per-phase
times to stderr — useful for spotting which phase regressed when a
large transcript suddenly takes seconds longer than before.
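An environment-gated timing context manager can be sketched as follows. It is a simplified stand-in, not renderer_timings.log_timing: the name timed_phase, its signature, the output format, and the exact "== 1" check on the variable are assumptions.

```python
import io
import os
import sys
import time
from contextlib import contextmanager


def timing_enabled() -> bool:
    # Assumption: only the literal value "1" switches timing on.
    return os.environ.get("CLAUDE_CODE_LOG_DEBUG_TIMING") == "1"


@contextmanager
def timed_phase(label: str, out=sys.stderr):
    """Print the wall-clock duration of the wrapped block when timing is enabled."""
    if not timing_enabled():
        yield
        return
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"[timing] {label}: {elapsed:.3f}s", file=out)


# Demo: enable timing for this process and capture the report in memory.
os.environ["CLAUDE_CODE_LOG_DEBUG_TIMING"] = "1"
captured = io.StringIO()
with timed_phase("parse", out=captured):
    sum(range(100_000))
print(captured.getvalue(), end="")
```

When the variable is unset the context manager does nothing beyond running the block, so it can stay in place in production code paths.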
2.9 Diagnosing hangs (SIGUSR1 stack dump)
When claude-code-log appears stuck (100% CPU, no output), a
single SIGUSR1 sent to the running process (on a POSIX shell, for
example, kill -USR1 <pid>) dumps the live Python stack of every
thread to stderr without killing it.
The handler is wired in cli.py::_install_stack_dump_signal() via
faulthandler.register(SIGUSR1, all_threads=True, chain=False) and
installed before any heavy work in the entry point. It is POSIX-only:
Windows lacks SIGUSR1, so the install is a silent no-op there. Unlike
py-spy, this needs no root and no extra install, since the runtime
is already wired to dump itself on demand. Added by PR #135 to make
the DAG cyclic-children class of bug diagnosable in the field; useful
for any future hang.
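The registration itself is small enough to sketch. The faulthandler.register call matches the one quoted above; the wrapper name install_stack_dump_signal, its file parameter, and the self-signalling demo are assumptions made so the example runs on its own (the real handler writes to stderr).

```python
import faulthandler
import os
import signal
import tempfile


def install_stack_dump_signal(file) -> bool:
    """Register a SIGUSR1 handler that dumps every thread's stack to file."""
    if not hasattr(signal, "SIGUSR1") or not hasattr(faulthandler, "register"):
        return False  # e.g. Windows: silently do nothing
    faulthandler.register(signal.SIGUSR1, file=file, all_threads=True, chain=False)
    return True


# Demo: dump into a temporary file so the result can be inspected.
# The file object must stay open for as long as the handler is registered.
dump_file = tempfile.TemporaryFile(mode="w+")
installed = install_stack_dump_signal(dump_file)
dump_text = ""
if installed:
    os.kill(os.getpid(), signal.SIGUSR1)  # same effect as kill -USR1 <pid>
    dump_file.seek(0)
    dump_text = dump_file.read()
print(dump_text)
```

The process keeps running after the dump, which is what makes this usable on a live, apparently hung run.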
3. Data lifecycle
                           ┌──────────────────┐
                           │ JSONL file(s)    │
                           │ (~/.claude/...)  │
                           └────────┬─────────┘
                                    │
                         parser.py + factories/
                                    │
                                    ▼
                        ┌───────────────────────┐
                        │ list[TranscriptEntry] │  (typed Pydantic models)
                        └───────────┬───────────┘
                                    │
                           factories/ dispatch
                                    │
                                    ▼
                       ┌─────────────────────────┐
                       │ list[TemplateMessage]   │  (each carrying a typed
                       │ with MessageContent     │   MessageContent variant)
                       └────────────┬────────────┘
                                    │
                renderer.py (generate_template_messages):
                build DAG → pair → reorder → relocate
                subagent blocks → build hierarchy →
                cleanup sidechain dups → populate caches
                                    │
                                    ▼
                         ┌─────────────────────┐
                         │ Tree of TemplateMsg │
                         │ + RenderingContext  │  (caches: teammate_colors,
                         │ + nav data          │   task_subjects, etc.)
                         └──────────┬──────────┘
                                    │
            ┌───────────────────────┼───────────────────────┐
            ▼                       ▼                       ▼
    html/renderer.py      markdown/renderer.py      json/renderer.py
            │                       │                       │
            ▼                       ▼                       ▼
      index.html +                *.md          combined_transcripts.json
     session-*.html           (single file)          session-*.json
                                                all-projects-summary.json
            │                       │                       │
            └───────────────────────┼───────────────────────┘
                                    │
                        ┌───────────┴───────────┐
                        ▼                       ▼
                    cache.py             image_export.py
                    (SQLite)             (HTML / Markdown only;
                                          JSON serialises paths)

Cache reads and writes are interleaved with the main pipeline at three points:
cache.py is consulted before parsing (cache hit → skip parse), after
rendering (write the rendered HTML), and during TUI navigation (the
TUI never re-parses).
4. Cross-cutting glossary
Terms that appear across multiple subsystems — defined once here.
- TranscriptEntry: typed Pydantic model for a single line in the source JSONL. Variants: User, Assistant, Summary, System, Passthrough, QueueOperation. See parser.py and models.py.
- MessageContent: render-time content variant produced by the factories from TranscriptEntry. Many flavours (UserTextMessage, ToolUseMessage, TeammateMessage, …). One TranscriptEntry may yield multiple MessageContents (a single assistant turn with N tool_uses produces N+1 messages). See messages.md for the full taxonomy.
- TemplateMessage: the render-time wrapper around a MessageContent. Carries message_index, parent/child links, pair_first/pair_middle/pair_last, ancestry, and the renderer-format CSS classes. Defined in renderer.py.
- RenderingContext: mutable cache attached to one render pass. Holds the message registry plus nested per-session caches (teammate_colors, task_subjects, task_id_for_tool_use, session_first_message, etc.). Caches are session-scoped because combined-transcripts mode merges multiple sessions and per-session identifiers (teammate_id, task_id) aren't globally unique.
- session_id: the JSONL's sessionId field. Often a UUID string. In some renderer paths a synthetic form is used:
  - {trunk}#agent-{agentId} for sub-agent transcripts (so they form a separate DAG-line attached to their spawning trunk).
  - {trunk}@{first_uuid_prefix} for branch sessions (rewinds / parallel-tool_use forks). See dag.md.
- render_session_id: the session id that should be used when walking ctx.messages to find content for rendering, accounting for synthetic rewrites.
- sidechain: a sub-agent's transcript entries are flagged isSidechain: true. The DAG layer integrates them into the parent session's tree under the spawning Task/Agent tool_use anchor. See agents.md, dag.md.
- agent_id: identifier copied from a Task/Agent tool_result (either toolUseResult.agentId or parsed from the Markdown metadata tail). Used to stitch sub-agent JSONL files into the trunk DAG. See agents.md.
- fork point / branch: when a session has multiple children with the same parent, the parent is the fork point and each child initiates a branch. Real forks come from /exit rewinds; spurious forks (parallel tool_uses, structural-only siblings) are collapsed by _walk_session_with_forks. See dag.md.
- SessionHeaderMessage: the synthetic content type produced for every session boundary in the rendered output, i.e. the header that appears above each session's first real message. Two flavours: trunk headers for top-level sessions, and branch headers for fork branches (the "branch heading" you'll see referenced in bug reports). Both headers are constructed by _build_trunk_header / _build_branch_header (in renderer.py); the branch header's title is composed by _branch_label in the shape Branch • <uuid8> • <preview>, with the preview computed once by scanning the branch's DAG-line uuids for the first user entry with text (via extract_text_content in parser.py + create_session_preview in utils.py, which calls simplify_command_tags to strip raw <command-name> XML soup down to /cmd). When troubleshooting branch-heading rendering, those are the functions to inspect.
- pair_first / pair_middle / pair_last: a pair of messages rendered as one logical unit (tool_use + tool_result, Slash + UserSlash, thinking + assistant). pair_middle exists for triples, currently the slash-command (UserSlash → Slash → CommandOutput) shape.
- detail level: see § 2.6.
- detail-aware tools: the curated set of tools whose I/O survives --detail low because they convey what the agent did, not what it read (WebSearch, WebFetch, Task, Agent).
- passthrough: a PassthroughTranscriptEntry is a non-conversation entry (hook callbacks, progress updates, last-prompt markers). The DAG layer keeps them in the structure but the renderer typically hides them.

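Two of the glossary entries, fork points and synthetic session ids, can be made concrete with a small sketch. The id formats follow the entries above; the function names, the uuid-to-parent mapping shape, and the 8-character prefix length are assumptions, and the real logic lives in dag.py.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def find_fork_points(parent_of: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Return every parent uuid that has more than one child, with its children."""
    children: Dict[str, List[str]] = defaultdict(list)
    for uuid, parent in parent_of.items():
        if parent is not None:
            children[parent].append(uuid)
    return {parent: kids for parent, kids in children.items() if len(kids) > 1}


def agent_session_id(trunk: str, agent_id: str) -> str:
    """Synthetic id for a sub-agent transcript: {trunk}#agent-{agentId}."""
    return f"{trunk}#agent-{agent_id}"


def branch_session_id(trunk: str, first_uuid: str, prefix_len: int = 8) -> str:
    """Synthetic id for a branch: {trunk}@{first_uuid_prefix}; prefix length assumed."""
    return f"{trunk}@{first_uuid[:prefix_len]}"


# u2 has two children (u3 and u4), so it is the only fork point.
demo_parents = {"u1": None, "u2": "u1", "u3": "u2", "u4": "u2", "u5": "u3"}
forks = find_fork_points(demo_parents)
agent_id_example = agent_session_id("trunk-1", "a42")
branch_id_example = branch_session_id("trunk-1", "abcdef1234567890")
print(forks, agent_id_example, branch_id_example)
```

This sketch treats every multi-child parent as a fork; per the glossary, the real walker additionally collapses spurious forks such as parallel tool_uses.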
5. Where to start reading
Common entry questions and their best first stop:
- "How does a JSONL line become an HTML row?" → rendering-architecture.md.
- "Why are forks rendered weirdly / what is a branch session?" → dag.md.
- "What message types exist and what do they look like?" → messages.md plus the samples in messages/.
- "I want to add support for a new Claude Code tool." → implementing-a-tool-renderer.md.
- "I want to write a third-party plugin (e.g. for an MCP tool we don't ship)." → plugins.md.
- "How does folding / collapsible content work?" → message-hierarchy.md.
- "What CSS classes does a message div get?" → css-classes.md.
- "How are sub-agent transcripts (sync, async, teammates) integrated?" → agents.md, then teammates.md for the teammates-specific machinery.
- "I want to extend the cache / change the schema." → § 2.3, § 2.4 here, then read the migration files in order.
- "How do I export to JSON for downstream tooling?" → § 2.5 here (and --format json from § 2.1).
- "claude-code-log is hung; how do I see what it's doing?" → § 2.9 (SIGUSR1 stack dump).
- "What's planned but not implemented?" → work/ (each .md is an in-flight or proposed plan).