# Use Cases

## Understand a new codebase
You join a team. The repo has 200 files across 15 modules. Instead of reading code file by file:
```bash
cd the-repo && llm-wiki .
llm-wiki query gods                # what are the core abstractions?
llm-wiki query community 0         # what's in the biggest cluster?
llm-wiki query path Auth Database  # how does auth reach the DB?
```
Open `graph.html` to explore the graph visually — click nodes, filter by community, trace paths.
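Under the hood, a path query is just graph search. A minimal sketch of the idea — the toy graph and node names here are illustrative, not llm-wiki's actual internals:

```python
from collections import deque

# Toy knowledge graph: adjacency list of entity -> related entities.
# In practice these nodes come from extraction, not hand-written dicts.
GRAPH = {
    "Auth": ["Session", "UserModel"],
    "Session": ["Database"],
    "UserModel": ["Database"],
    "Database": ["ConnectionPool"],
}

def find_path(graph, start, goal):
    """Breadth-first search: shortest chain of entities, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

print(" -> ".join(find_path(GRAPH, "Auth", "Database")))
# Auth -> Session -> Database
```

Because BFS explores level by level, the first path found is also the shortest — which is usually the most readable answer to "how does X reach Y?".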
## Build a research corpus
Drop papers, notes, tweets, blog posts into a folder. Build once, query across all of them:
```bash
llm-wiki add https://arxiv.org/abs/2401.12345 --author "Smith et al."
llm-wiki add https://karpathy.ai/blog/llm-wiki
llm-wiki .
llm-wiki query search "attention mechanism"
llm-wiki query path "transformer" "diffusion"
```
The graph shows connections between papers you didn’t know were related.
## Personal /raw folder (Karpathy workflow)
The original concept: one folder where you drop everything.
```
~/raw/
  papers/attention-is-all-you-need.pdf
  notes/meeting-2026-04.md
  screenshots/architecture-whiteboard.png
  code/prototype/main.py
```

```bash
cd ~/raw && llm-wiki .
```
Code gets AST extraction; docs get structural parsing. Use `/wiki .` in Claude Code for deep extraction of PDFs and images.
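For code files, "AST extraction" means walking the syntax tree instead of prompting a model. A rough sketch of the idea for Python sources — this mirrors the concept, not llm-wiki's exact implementation (the sample source and entity tuples are made up):

```python
import ast

source = '''
class GraphStore:
    def add_edge(self, a, b): ...

def build_wiki(path):
    store = GraphStore()
'''

def extract_entities(code):
    """Collect class and function names as candidate graph nodes."""
    tree = ast.parse(code)
    entities = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            entities.append(("class", node.name))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entities.append(("function", node.name))
    return entities

print(extract_entities(source))
```

This is why code files are cheap to index: the parser gives exact names and relationships with no LLM calls at all.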
## Vietnamese historical documents
Tested with *Thượng Chi Văn Tập* (Phạm Quỳnh) and *Nam Phong Tạp Chí*:
- DOCX (43K chars Vietnamese text): 3 hub nodes → 103 entities after agent extraction
- Scanned PDF (1007 pages): 0 text via pypdf → 86 entities after agent vision
- HEIC scans (13 images): 0 text → 29 entities with character names, places, themes
Agent mode preserves Vietnamese labels and extracts culturally specific entities.
## Documentation audit
Run on your docs folder to find:
- **God nodes** — the most referenced concepts (are they well-documented?)
- **Orphan nodes** — concepts with no connections (missing cross-references?)
- **Surprising connections** — unexpected links between docs
- **Community structure** — do your docs cluster the way you expect?
```bash
llm-wiki query gods   # what's most referenced?
llm-wiki query stats  # how much is INFERRED vs EXTRACTED?
```
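Both checks fall out of simple degree counting over the graph's edge list. A hedged sketch — the edge data below is invented for illustration; the real graph lives in llm-wiki's output:

```python
from collections import Counter

# Hypothetical edges extracted from a docs folder: (source, target) pairs.
edges = [
    ("Quickstart", "Config"), ("Deploy", "Config"),
    ("API", "Config"), ("Deploy", "API"),
]
nodes = {"Quickstart", "Config", "Deploy", "API", "Changelog"}

# Degree = how many edges touch each node.
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

god_nodes = [n for n, d in degree.most_common() if d >= 3]  # threshold is arbitrary
orphans = sorted(nodes - set(degree))                        # nodes with no edges

print("gods:", god_nodes)    # heavily referenced concepts
print("orphans:", orphans)   # no connections at all
```

A god node with thin documentation, or an orphan that should clearly link somewhere, is exactly the kind of finding the audit surfaces.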
## Living wiki for a team
The graph compounds over time. Each developer session adds knowledge:
```bash
# Day 1: initial build
llm-wiki .

# Day 5: new files added, rebuild (cache skips unchanged)
llm-wiki .

# Day 10: file an insight from a debugging session
mkdir -p wiki-out/ingested
echo "# SharedConnection: GraphStore and MemoryStore share SQLite connection pool" \
  > wiki-out/ingested/insight_shared_connection.md
llm-wiki .  # insight now in graph

# Weekly: health check
llm-wiki lint
```
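The write-back step can be scripted, too. A minimal sketch that files an insight where the next rebuild will pick it up — the `wiki-out/ingested` path comes from the shell example above, but the helper function itself is hypothetical, not part of llm-wiki:

```python
from pathlib import Path

def file_insight(out_dir, slug, text):
    """Write a one-off insight as a markdown note for the next rebuild."""
    ingested = Path(out_dir) / "ingested"
    ingested.mkdir(parents=True, exist_ok=True)
    note = ingested / f"insight_{slug}.md"
    note.write_text(text + "\n", encoding="utf-8")
    return note

path = file_insight(
    "wiki-out", "shared_connection",
    "# SharedConnection: GraphStore and MemoryStore share SQLite connection pool",
)
print(path)
```

Hooking a helper like this into a debugger session or a post-mortem template is how the graph keeps compounding without extra ceremony.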
## Claude Code integration
After building, Claude answers questions without re-reading source files:
```bash
# Install the skill
mkdir -p ~/.claude/skills/my-llm-wiki
cp "$(python -c 'import my_llm_wiki; print(my_llm_wiki.__path__[0])')/SKILL.md" ~/.claude/skills/my-llm-wiki/

# /wiki .    → structural + agent extraction (DOCX, PDF, images)
# /wiki lint → graph health check
# "what connects GraphStore to Settings?" → Claude uses llm-wiki query
```
The skill implements the full living wiki cycle — monitor, rebuild, lint, write-back.