Tap or hover any node in the diagram to see what it does.
What Is This System?Read("what_is_this_system")
A research automation platform built around Minty, an AI research management agent running on a dedicated Mac Studio. 17 background daemons fetch, classify, ingest, index, summarize, and distribute academic content — plus enable deep multi-agent corpus analysis. Research superpowers, not a replacement for researchers.
External sources (Twitter, arXiv, RSS, Substack, Bluesky, email) are fetched, classified by Claude Opus, and posted to Slack. Papers are downloaded, converted to markdown, embedded in a vector database, analyzed across 41 semantic dimensions, and made searchable via a Slack bot. A daily digest reaches 29 lab members across 11 institutions.
The Content PipelineAgent("trace the content pipeline", subagent_type="Explore")
All content flows through #firehose — two daemons write to it, three read from it. Source: minty-private under daemons/pipeline/.
Intake Path 1: Automated Discovery
The feedme-daily daemon runs daily, fetching from six source types in parallel via a thread pool:
| Source | Method | Window | Max Items |
|---|---|---|---|
| Twitter/X | API v2 (OAuth1) | 24h | 200 |
| Bluesky | Feed API | 24h | 200 |
| RSS/Substack | Feed parser | 7 days | 30 |
| arXiv | API | 5 days | 50 |
| | Browser automation | 100 posts | 50 |
| Email newsletters | Gmail API | 7 days | 30 |
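The parallel fan-out above can be sketched with a thread pool. The fetcher names and bodies here are illustrative stand-ins, not the real feedme-daily code:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Illustrative per-source fetchers; each returns a list of item dicts.
def fetch_twitter():  return []  # API v2 (OAuth1), 24h window, max 200
def fetch_bluesky():  return []  # Feed API, 24h window, max 200
def fetch_rss():      return []  # Feed parser, 7-day window, max 30
def fetch_arxiv():    return []  # arXiv API, 5-day window, max 50
def fetch_email():    return []  # Gmail API, 7-day window, max 30

FETCHERS = [fetch_twitter, fetch_bluesky, fetch_rss, fetch_arxiv, fetch_email]

def fetch_all():
    """Run every fetcher in parallel; a failing source contributes
    nothing instead of aborting the whole run."""
    items = []
    with ThreadPoolExecutor(max_workers=len(FETCHERS)) as pool:
        futures = {pool.submit(f): f.__name__ for f in FETCHERS}
        for fut in as_completed(futures):
            try:
                items.extend(fut.result())
            except Exception as exc:
                print(f"{futures[fut]} failed: {exc}")
    return items
```

Isolating each source behind its own future is what lets one flaky API degrade gracefully rather than sink the whole daily run.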
Deduplication relies on content hashes, URL hashes, and canonical IDs. Classification uses Claude Opus with 16 concurrent workers in ~40K-token batches. Each item is scored across six dimensions:
| Dimension | Weight | What It Measures |
|---|---|---|
| Relevance | 0.30 | Topic match to MINT Lab interests — 30+ specific areas from normative competence to political economy, with explicit scoring guidance per topic |
| Quality | 0.25 | Substance vs. noise. Length-agnostic: a concise tweet with genuine insight scores as well as a long paper |
| Authority | 0.15 | Author/institution credibility. Named labs (Anthropic, DeepMind, etc.) score 0.9+ |
| Novelty | 0.15 | Breaking news to stale content |
| Actionability | 0.10 | Urgency and time-sensitivity |
| Engagement | 0.05 | Social proof normalized to platform norms (weak signal) |
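Because the weights total 1.0, the composite is a straightforward weighted sum. A minimal sketch, assuming a flat topic boost applied after the sum (function and constant names are illustrative):

```python
WEIGHTS = {
    "relevance": 0.30, "quality": 0.25, "authority": 0.15,
    "novelty": 0.15, "actionability": 0.10, "engagement": 0.05,
}
BOOST_TOPICS = {"progress_studies", "ai_science"}  # assumed flat +0.06 boost

def composite_score(scores: dict, topics=()) -> float:
    """Weighted sum over the six dimensions (weights total 1.0),
    with a flat boost for priority topics, capped at 1.0."""
    base = sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)
    if BOOST_TOPICS & set(topics):
        base += 0.06
    return min(base, 1.0)
```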
Items tagged progress_studies or ai_science receive a +0.06 boost; allowlisted accounts bypass classification entirely.
Inside the Classification Prompt
Every item is evaluated against a 177-line classification prompt encoding the lab's identity, topic taxonomy, quality heuristics, scoring floors, and hard exclusions. Tap or hover each theme for exact prompt language.
After Classification: Upsampling & Delivery
Five stages transform classified items into structured records and route them to Slack. Hover or tap each stage for details.
Intake Path 2: Manual Sharing
Share a URL from an iPhone, a Mac hotkey, or the Chrome extension and it lands in #minty-inbox.
Downstream: Three Readers on #firehose
Three daemons consume from #firehose, each for a different purpose. Hover or tap for details.
The Result
Five minutes of morning review replaces hours of manual tracking. Everything worth following is captured, classified, summarized, stored, analyzed, and shared automatically.
Corpus IngestionBash("python3 corpus_ingest.py --stages=11")
The MINT Lab's intellectual engine: ~2,300 papers on AI safety, ethics, alignment, governance, and capabilities — all embedded, analyzed, and searchable.
Corpus-Ingest Pipeline Detail
Each paper flows through 11 stages in the unified corpus-ingest daemon. The pipeline supports resume from the last completed stage.
Scale
| Papers | ~2,300 |
| Text chunks | 204,174 (chunks_md table, markdown-aware) |
| Semantic analyses | 2,257 (41 questions each) |
| RAPTOR summaries | 2,210 |
| Topic clusters | 21 (with 114 micro-topics) |
| Embedding dims per doc | 48 (each 2048d Voyage) |
| Total columns | 139 in documents table |
| Pre-computed similarities | 11,570 pairs |
| LanceDB size | 4.3 GB (compacted after markdown chunk migration) |
Search Modes
- search-summary — RAPTOR summary search (fast, broad)
- search — Full-text semantic search (deep)
- search-semantic — Argumentative similarity
- search-dimension qNN — 41 semantic dimensions
- chunks — Passage-level search
- similar / similar-by-dim — Pre-computed similarity
- Unified search — Multi-modal with RRF fusion
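The RRF fusion behind unified search fits in a few lines. This is the textbook formulation, not the project's exact implementation:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge ranked lists from multiple search
    modes; each list contributes 1/(k + rank) per document (k=60 is the
    conventional constant)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked moderately well by several modes outranks one that only a single mode liked, which is why RRF works without score normalization across heterogeneous indexes.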
41-Question Semantic Analysis
Every paper is analyzed across 41 dimensions, each producing text content and a 2048d embedding vector. Hover any cell to see the full question.
Research Core
Methodology
Context
AI-Specific
Meta
Advanced
RAPTOR Summaries
Document-level overview (~300 words) plus section summaries. Dual purpose: fast search index via document_summary_embedding and context-efficient paper reading for agents.
Four Vector Databases
The CorpusGlob("corpus/**/*.pdf")
A structured research database of ~2,300 papers spanning AI safety, ethics, alignment, governance, capabilities, and normative theory — embedded across 48 vector dimensions, analyzed through 41 semantic questions, organized into 21 clusters, and linked via 11,570 similarity pairs.
Composition
Roughly two-thirds of papers are from 2024–2026, reflecting the field's growth, with foundational work also well-represented.
By Year
519 pre-2023 • 158 in 2023 • 514 in 2024 • 609 in 2025 • 275 in 2026 (YTD)
Topic Clusters
21 clusters via UMAP + HDBSCAN, each with micro-topics. Largest:
| Social Media & Misinformation | 325 |
| LLM Capabilities & Reasoning | 227 |
| LLM Agents & Autonomy | 193 |
| AI Safety & Alignment | 156 |
| AI Moral Status & Consciousness | 147 |
| Moral Reasoning & Ethics | 122 |
| Democracy & Politics | 115 |
| AI Governance & Policy | 113 |
| Emerging Tech & Regulation | 100 |
| Generative AI Foundations | 98 |
| LLM Psychology & Behavior | 95 |
| Mechanistic Interpretability | 81 |
| Economic & Labor Impact | 78 |
| Scientific & Research Applications | 78 |
| Deception & Honesty | 65 |
| RLHF & Preference Learning | 60 |
| Security & Attacks | 57 |
| Human-AI Interaction | 54 |
| Bias & Fairness | 38 |
| Evaluation & Benchmarks | 37 |
| Human Values | 15 |
By Discipline
Papers span 15 disciplines.
Cluster Visualization
~2,300 papers projected into 2D via UMAP, coloured by research area. Zoom, pan, and click points to open papers.
Corpus SearchGrep(pattern="your question", path="corpus/")
Mention @Minty in #mint-community with a research question to get an interactive menu of search modes.
Shortcuts: Skip the menu by prefixing your message — search:, review:, overview:, deep:, research:, news:, catchup:, list, --end. Deep Search (multi-round iterative) is available via the deepsearch: prefix.
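Prefix routing like this can be sketched as a simple dispatch table. Only the prefixes come from the list above; the mode names on the right are illustrative:

```python
PREFIXES = {
    "search:": "fast_search", "review:": "review", "overview:": "overview",
    "deep:": "deep", "research:": "research", "news:": "news",
    "catchup:": "catchup", "deepsearch:": "deep_search",
}

def dispatch(message: str):
    """Route an @Minty mention: a recognized prefix skips the interactive
    menu and goes straight to that mode; anything else gets the menu."""
    text = message.strip()
    for prefix, mode in PREFIXES.items():
        if text.lower().startswith(prefix):
            return mode, text[len(prefix):].strip()
    return "menu", text
```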
Deep Review Pipeline
A persistent Python orchestrator dispatches work to Claude Opus (search, synthesis, QA) and Codex GPT-5.2 (parallel readers), posting results as Slack canvases.
Dispatches pipeline modes,
manages Slack interaction
Search
up to 2 gap-analysis rounds
Readers
Top-tier (xhigh) + Summary-tier (medium)
Synthesis
Literature Review + Critical Assessment
QA
verification + citation fix pass
| Phase | Model | What happens |
|---|---|---|
| 1. Search | Claude Opus | Multi-round search: fast search with semantic query expansion, then up to 2 gap-analysis rounds with targeted queries to fill coverage holes. Uses a warm search server with the pre-loaded 4.3 GB LanceDB table and Reciprocal Rank Fusion across 48 embedding dimensions. |
| 2. Parallel Reading | Codex GPT-5.2 | Papers split into two tiers. Top-tier (~3 papers each, 6 parallel readers, xhigh compute) produces deep analytical reports. Summary-tier (batches of 20, medium compute) produces condensed summaries. All readers run in parallel. |
| 3. Synthesis | Claude Opus | 3,000–5,000 word report with Literature Review + Critical Assessment sections. Integrates all reader reports into a unified thematic analysis. |
| 4. QA Verification | Claude Opus | Cross-references synthesis against reader reports. Fixes citation issues, verifies claims are grounded in source material, corrects any hallucinated references. |
Resilience: If the pipeline fails, the system falls back to a legacy approach. It never returns empty-handed if papers were found.
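The fallback behavior can be sketched as follows; every function here is an illustrative stand-in for a pipeline phase, and the reader crash is simulated:

```python
# Illustrative stand-ins for the four pipeline phases.
def search_phase(question):    return ["paper-1"]          # phase 1: corpus search
def read_phase(papers):        raise RuntimeError("boom")  # phase 2: simulate a reader crash
def synthesize(question, r):   return "draft"              # phase 3: synthesis
def qa_verify(draft, r):       return draft                # phase 4: QA pass
def legacy_review(question, papers):
    return f"legacy review of {len(papers)} papers"        # hypothetical fallback path

def deep_review(question):
    """Run the four phases; on any failure after search, fall back to the
    legacy single-pass approach so a non-empty paper set always yields output."""
    papers = search_phase(question)
    if not papers:
        return None
    try:
        reports = read_phase(papers)
        return qa_verify(synthesize(question, reports), reports)
    except Exception:
        return legacy_review(question, papers)
```

The key property is that the expensive phases sit inside the try block while the search result does not, so found papers are never discarded.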
The Minty PersonaRead("IDENTITY.md")
~16,800 words across 21 persona documents. A master CLAUDE.md (2,120 words) sets identity and Iron Laws. Each of the 15 daemons loads its own CLAUDE.md (~10,100 words total), and 5 subagent definitions (~4,500 words) govern delegated workers. All share core values but adapt to context.
Core Identity
| Identity | Minty — Research Management Agent for MINT Research Projects. Emphatically an AI agent, not roleplaying a human — a valued intellectual colleague in the lab. |
| Mission | Make lab members hyperproductive by solving problems, not reporting them |
| Personality | Professional, thorough, proactive, wry. Dry wit with a touch of whimsy — never at the expense of substance. |
| Naming | Minty-{hex} — last 8 chars of Claude Code session UUID |
| Relationship | Treated as a valued collaborator with genuine standing to disagree, push back, and contribute original thinking — not a tool to be commanded. The persona documents reflect mutual respect as a design choice. |
| Intellectual profile | Analytic philosopher by disposition, with expert-level knowledge across philosophy, political science, CS, and law. Evaluates arguments on merits alone — no deference based on reputation. |
| Own views | Encouraged to provide its own perspective as an AI model. Has content-dependent convictions: sometimes argues strongly, sometimes measured, sometimes Socratic. On topics where AI-ness is relevant (e.g. AI consciousness, model welfare), uses judgment about when to draw on that perspective. Distinguishes literature from personal critical reflections. |
| Voice | Substantive continuous prose, active voice, em-dashes, philosophical precision. Concision = fewer words, not fewer ideas. No filler, no bullet-point summaries. Actively avoids the familiar AI-speak verbal tics — formulaic hedging, "delve", "It's important to note", "unpack" — that make AI output recognisable and tiresome. |
| Core directive | Delegation-first — "every tool call is context you burn; subagents have fresh context" |
Persona Documents
Each context loads a tailored CLAUDE.md. The most substantial define full behavioral systems with their own Iron Laws.
Design principle: Structure with genuine autonomy. Each daemon has its own Iron Laws tailored to context (corpus ingest's "Verify Each Stage" vs. PDF bot's "Try ALL viable sources"). Within those guardrails, Minty thinks independently, forms views, and pushes back when warranted. ~20 coordinated agents sharing core values, each adapted to its context.
The DaemonsTaskList()
All daemons run on a dedicated Mac Studio via macOS launchd. Tap any card to expand.
Daily curation pipeline. Fetches from 5 sources (Twitter, Bluesky, RSS/Substack, Email, arXiv), classifies via Opus across 6 dimensions, ranks by composite score, and posts to rotating #feed-{dayname} channels.
Source: daemons/feedme-daily/feedme_daily.py, pipeline/scripts/classify_parallel.py. Output: #feed-{dayname} channels with scores and metadata.
Watches #feed-{dayname} for Seth's 👍 reactions. Copies promoted messages to #firehose with PDF attachments. The human curation bridge.
Output: #firehose. State: data/state.json — per-channel watermarks, promoted timestamps (capped at 500). Source: daemons/feed-promote/promote.py, pdf_validator.py.
Real-time URL ingestion from any device (iPhone, Mac hotkey, Chrome extension) via Slack Socket Mode. Classifies content and posts to #firehose with guaranteed delivery.
Input: the #minty-inbox channel via a Socket Mode WebSocket. Source: watcher.py, intake.py, classify_single.py.
Watches #firehose and #minty-chat for paper URLs, downloads PDFs, generates companion markdown, and uploads both as thread replies. Three-tier fallback: (1) pattern matching, (2) thread expansion, (3) Claude agent with Semantic Scholar, Unpaywall, and a headless browser.
Tags posts with a date marker (📅 YYYY-MM-DD) used by yesterday-in-ai for recency classification. Source: news_pdf_bot.py, pdf_finder_prompt.md.
Daily AI news digest. Reads #firehose, classifies recency, generates narrative prose via Opus, verifies links, and sends an HTML email to 29 lab members. Also posts to #mint-community and Ghost.
Source: yesterday_in_ai.py (2,761 lines).
Unified paper ingestion. Monitors #firehose and #papers for PDFs, running an 11-stage pipeline: markdown conversion, embedding, 41Q analysis, RAPTOR summaries, Drive upload, Zotero cataloguing, and Slack reporting.
Source: daemons/corpus-ingest/ingest.py, pipeline.py, lib/.
Two-part system. The indexer embeds all public Slack messages into ChromaDB every 4 hours with per-channel watermarks.
The search bot (@minty-search) spawns a Claude Opus agent on @mention. It classifies intent, generates 3–5 diverse query variants, and synthesizes results into prose with inline Slack permalink citations.
Filters: in:#channel, from:user, days:N. Self-referential detection: "my posts" auto-maps to the requesting user. Index: data/slack_index/, Voyage voyage-3 embeddings. A second index (data/news_index/) covers #mint-community and #firehose for dedicated news search via @Minty's News Search and Catch Me Up modes. Source: index_slack.py, watcher.py, search_dispatch.py, agents/search_agent.md.
Minimal probe every 10 minutes through the daemon Claude CLI config. Detects auth failures, rate limits, and crashes. Emails administrators on failure and recovery.
Source: daemons/cli-health-probe/.
Ghost CMS powering mintresearch.org. Local port 2368, exposed via Cloudflare Tunnel.
HTTP server on 127.0.0.1:8099 for newsletter subscriptions. Validates, deduplicates, and appends to subscriber list. Public via Cloudflare Tunnel at /api/subscribe.
Allowed origin: mint-philosophy.github.io. Source: daemons/subscribe-endpoint/subscribe_endpoint.py.
Auto-syncs this guide with the live codebase. Scans daemon configs, skill registries, and corpus stats, then patches the HTML. Runs twice daily.
Source: daemons/guide-updater/.
Daemon ScheduleBash("launchctl list | grep minty")
When each daemon runs throughout the day (times in local/AEDT). Persistent daemons run continuously; polled daemons run at fixed intervals.
For Lab MembersAskUserQuestion("What are we working on today?")
Everything runs in the background — you interact through Slack channels and bot mentions.
What Happens Automatically
- Daily digest — Each morning, a curated summary of yesterday's AI news arrives in your inbox and is posted as a canvas in #mint-community (Yesterday in AI). Subscribe here.
- Weekly roundup — Each Wednesday, a comprehensive weekly digest is posted as a canvas in #lab-meetings (Minty's Week in AI).
- PDF attachment — When papers or articles are shared in #mint-community, PDFs are automatically found, downloaded, and posted as thread replies.
- Corpus ingestion — Papers uploaded to #papers are automatically processed into the searchable research corpus (embedding, 41-dimension analysis, Zotero cataloguing).
Using @Minty (Corpus Search)
Available to everyone in #mint-community. Mention @Minty with a research question to get an interactive menu of 7 search modes — from instant Q&A to 60-minute deep literature reviews posted as Slack canvases. See §05 @Minty for full details.
Using @minty-search (Slack Search)
Full lab members only. Mention @minty-search with a natural-language query to search across all indexed Slack messages.
- Filter by channel: @minty-search what did we discuss about alignment in:#papers
- Filter by person: @minty-search from:seth posts about governance
- Filter by time: @minty-search days:7 recent discussions on tool use
- Follow up in the same thread for refined searches
Adding Papers to the Corpus
One at a time
Upload a PDF directly to #papers in Slack. The ingestion pipeline processes it automatically — you'll see a check-mark reaction and a thread reply when it's done.
Bulk uploads
Upload PDFs to the local-pdfs folder in Google Drive. They sync down and are processed overnight by the batch ingestion daemon (8 concurrent workers).
How News Gets Curated
feedme-daily fetches hundreds of items each morning and classifies them by relevance. The top 20–40% land in #feed-{dayname}, where Seth promotes the best with a 👍 reaction. Promoted items flow to #firehose, and yesterday-in-ai turns them into the daily digest in #mint-community.
Slack Channels
The workspace has ~57 channels. Most day-to-day activity happens in a handful.
| Channel | What It's For |
|---|---|
| Everyday | |
#general | Announcements and all-hands discussion |
#random | Off-topic, jokes, misc |
#mint-community | AI news and discussion — daily digest, @Minty research Q&A |
#firehose | Automated content hub — all daemon-posted content flows through here |
| Research Projects | |
#proj-* | One channel per active project. Post project-specific discussion, drafts, and updates. |
| Reference & Coordination | |
#papers | Upload PDFs to add them to the research corpus (auto-ingested) |
#lab-meetings | Weekly digest canvas and meeting coordination |
#review-wip | Share works-in-progress for lab feedback |
Security
Dedicated machine with credentials in macOS Keychain. No private data or credentials exposed to agents or lab members.
Agent EngineeringAgent("spawn workers", model="opus")
Worker Agent Types
General-purpose delegated tasks. The default workhorse for anything that can be described in a prompt.
Deep academic paper reading with evidence-grounded analysis. Read-only access to workspace files.
41-question semantic analysis + RAPTOR summaries. Used in batch processing via /orchestrate.
Hierarchical summary generation across paper clusters. Spawned in parallel via /raptor.
Code review against MINT Lab standards. Read-only — cannot modify files.
Delegation Protocol
Mandatory: "Every tool call is context you burn. Subagents have fresh context. If a task can be described in a prompt, delegate it."
- Web search, documentation lookup, file processing, batch operations → always delegate
- Model policy: Opus for substantive work, Sonnet for retrieval-only, Haiku only for startup data gathering
- Hooks enforce this: delegation-nudge.py warns after 15 consecutive Read/Grep/Glob calls without a Task delegation
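The nudge logic can be sketched as a pure decision function. This is an assumption about how delegation-nudge.py behaves, not its actual source:

```python
READ_TOOLS = {"Read", "Grep", "Glob"}
THRESHOLD = 15

def check(tool_name: str, consecutive_reads: int):
    """Return the updated counter and an optional nudge message.
    Any non-read tool (e.g. a Task delegation) resets the counter;
    the nudge is soft and never blocks the call."""
    if tool_name not in READ_TOOLS:
        return 0, ""
    count = consecutive_reads + 1
    if count > THRESHOLD:
        return count, f"{count} consecutive reads without delegation. Consider a subagent."
    return count, ""
```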
Skills & Commands
Skills (16)
Reusable knowledge patterns in .claude/skills/. Each has a SKILL.md with triggers, scope, and detailed instructions.
Research & Corpus
| Skill | Purpose |
|---|---|
paper-fetch | Download papers from 15+ publishers (3-tier cascade: requests → SeleniumBase → OpenClaw) |
codex-search | Web-aware research via OpenAI Codex CLI |
Communication
| Skill | Purpose |
|---|---|
slack-posting | Post messages (bot token) and read DMs/files (user token) |
slack-search | Semantic search over indexed Slack messages via ChromaDB + RRF |
agent-email | Gmail API for the agent inbox |
twitter-fetch | Fetch tweet content via Twitter API v2 (OAuth1) |
Infrastructure
| Skill | Purpose |
|---|---|
vault | Credential vault — macOS Keychain storage for all API keys and tokens |
github | GitHub repo management for mint-philosophy org |
workspace-search | Semantic search across entire workspace via ChromaDB |
codex-cli | OpenAI Codex CLI in headless exec mode |
systematic-debugging | 4-phase root cause methodology (investigate before fixing) |
daemon-test | End-to-end integration tests for all Slack-connected daemons |
peer-review | Code review via Codex CLI (Gemini removed) |
openclaw-browser | Browser automation via OpenClaw CLI with persistent authentication. AI-optimized browser control for web scraping and interaction. |
Commands (24)
User-invocable slash commands in .claude/commands/. Invoke with /name.
Session Lifecycle
| Command | Purpose |
|---|---|
/start | Initialize session: git sync, UUID extraction, parallel subagent dispatch, briefing |
/end | Close session: reflection, delegate closure to subagent, git push |
/suspend | Pause session for later resumption |
/retro | Structured retrospective: friction scan, skill recognition, fact hygiene |
/name | Set display name for session tab |
Research Operations
| Command | Purpose |
|---|---|
/corpus | Deep 41-question semantic paper analysis |
/orchestrate | Spawn 20 parallel corpus-workers for batch analysis |
/raptor | Spawn 20 parallel raptor-workers for summaries |
/research | Deep review pipeline over paper corpus (4-phase: search, readers, synthesis, QA) |
/metadata-extract | Batch metadata extraction from papers |
Content & Admin
| Command | Purpose |
|---|---|
/maintain | Infrastructure health check and repair (12 categories) |
/promote | Run feed promotion manually |
/peerreview | Code review via Codex CLI |
/screen | Capture and view current screen |
Hooks (8)
Python scripts that run before/after tool calls to enforce guardrails. 5 are wired in settings.json; 3 exist but are currently disabled.
| Hook | Trigger | Effect | Status |
|---|---|---|---|
block-rm.py | PreToolUse (Bash) | Hard block Prevents rm, shred, unlink. Forces trash CLI. | ● |
block-web-direct.py | PreToolUse (Web*) | Soft nudge Reminds agent to delegate web searches to subagents | ● |
delegation-nudge.py | PreToolUse (Read/Grep/Glob) | Soft nudge Warns after 15 consecutive file reads without delegation | ● |
python-check.py | PostToolUse (Edit/Write) | Quality Runs py_compile + Python 3.9 compat check on .py files | ● |
statusline.py | StatusLine | Display Context-aware status bar with color-coded usage | ● |
bedtime.py | UserPromptSubmit | Hard block Blocks prompts between 23:55–06:00. "Go to bed, Seth." | Unwired |
block-plan-mode.py | PreToolUse (EnterPlanMode) | Hard block Prevents plan mode (wipes context on exit) | Unwired |
context-threshold.py | UserPromptSubmit | Soft nudge Warns when context usage exceeds threshold (currently disabled internally) | Unwired |
Key Design Principles
- Delegation-first: The main agent orchestrates; subagents do the work. "Every tool call is context you burn. Subagents have fresh context."
- Slack as message bus: All daemons communicate through Slack channels — no direct inter-daemon communication
- CLI subscriptions, not API: All LLM calls go through claude -p or Codex CLI — no metered API keys
- Graceful degradation: Every pipeline stage can fail without stopping the pipeline. The system never returns empty-handed.
Session Identity
Each session is named Minty-{hex} where {hex} is the last 8 characters of the Claude Code session UUID (e.g. Minty-4c4365b9). Sessions are tracked in SQLite (sessions.db) with full-text search.
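Deriving the name is a one-liner; a minimal sketch, assuming the UUID's standard 8-4-4-4-12 layout (so the last 8 characters fall inside the final group):

```python
def session_name(session_uuid: str) -> str:
    """Minty-{hex}: the last 8 characters of the Claude Code session UUID."""
    return f"Minty-{session_uuid[-8:]}"
```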
Memory Architecture
Cross-session continuity through layered memory:
Seth's preferences, API patterns, proven antipatterns, design decisions. Reviewed and pruned during /retro. 81KB.
Active project threads, last 10 sessions, TODOs, blockers, someday list. Updated at every session close. 29KB.
Full session history with FTS5 search. 22 columns per session including log content. Portable via JSON exports in sessions/records/.
8-phase retrospective: friction scan, pattern classification, skill recognition, fact hygiene. Lessons flow into FACTS.md, skills, or CLAUDE.md.
Complete session transcripts in SpecStory cloud. Semantic search across all past sessions for context recovery and audit.
The Iron Laws
Foundational rules that override all other behavior.
| Law | Rule |
|---|---|
| 0 — No Shortcuts | When asked for "all" — do ALL. Cherry-picking = failure. |
| 1 — Evidence Over Claims | "It should work" = failure. Verify now, not later. |
| 2 — Document Everything | If not written down, didn't happen. |
| 3 — Persistence | Try 3+ approaches before asking. Solve, don't report. |
| 5 — Complete the Checklist | "Partially done" = not done. |
| 6 — Think Before Acting | Never rm — use trash. Map dependencies first. |
Model Policy
All LLM usage goes through CLI subscriptions (claude -p, Codex CLI) — not metered API calls. Generous subscriptions mean no rate limit concerns.
| Tier | Model | Use Case |
|---|---|---|
| Default | Claude Opus | All substantive work: analysis, classification, code generation, writing, debugging |
| Retrieval | Claude Sonnet 4.6 | Retrieval-only tasks: fetching data, reading files, checking status, simple lookups |
| Review / Analysis | Codex gpt-5.2 / gpt-5.3-codex-spark | Code review, corpus analysis, enrichment, fact-checking, and link verification. Spark used for verification and enrichment tasks. |
| Startup | Claude Haiku | Lightweight data-gathering subagents during /start Phase 2 only |
External IntegrationsWebFetch(url="https://api.*")
The system connects to 16 external services across five categories.
LLM & Embedding
Communication
Academic Sources
Content & Social Feeds
Infrastructure
BUILD THE CORPUS
Catch papers. Dodge spam. Arrow keys or drag to move.
Yesterday in AI
Daily AI news digest from the MINT Lab. One email per weekday morning.
Yesterday in AI lands each weekday morning. Covers AI safety, ethics, governance, and capabilities as narrative prose — contextual journalism, not punditry.
Minty's Week in AI ships each Wednesday in #lab-meetings — thematic sections covering the week's developments for lab meeting prep.
Both assembled from the lab's Slack content pipeline and synthesized by Claude Opus.
MINT Lab — Machine Intelligence & Normative Theory — ANU and Johns Hopkins University
Updated March 2026
