Threads
Capture, browse, search, and distill conversations from any AI tool.
Threads are your conversation layer. They keep the original flow of what happened in an AI session: what you asked, what the tool answered, and how the work evolved.
Their real value is not just storage. Threads become useful when you can search past conversations, reopen exact context, and distill the durable parts into memories.
Use threads when you need the full history
If you need the original messages, save or import a thread. If you only need the durable takeaway, distill it into memories and work from there.
The First Useful Thread
If you are new, do one of these first:
- import one conversation you already care about
- let one supported tool capture a full session
- capture one web conversation through the browser extension
Then open that thread and distill one useful memory from it. That is the core workflow.
| I want to... | Jump to |
|---|---|
| Browse and search threads | Browsing Threads |
| Distill threads into memories | Thread Distillation |
| Auto-import from coding agents | Auto-Sync |
| Capture from web AI chats | Browser Extension |
| Import my ChatGPT or DeepSeek conversations | Bulk Import |
| Import a conversation file (.md with ## User/## Assistant headers) | Single Thread |
| Learn the .md conversation format | Conversation Markdown |
| Import via API | Import API |
| Import via CLI | CLI |
Browsing Threads
The Threads view shows all your imported conversations in one place.
- Search threads by content or title
- Filter by source (Claude Code, ChatGPT, Cursor, etc.)
- Pin important threads to keep them accessible
- View individual messages within any thread
Open Threads from the sidebar or press Cmd + 3 (macOS).
Thread Distillation
Distillation is the key workflow that connects threads to memories. Open any thread and trigger distillation. The system extracts individual memories from the conversation, each with its own title, labels, and importance score. The extracted memories enter your knowledge graph and become searchable alongside everything else.
This is how hours of AI conversation become connected knowledge you can find later.
Also from the browser
The browser extension supports Smart Distill directly from web conversations, so you can capture memories without importing the full thread first.
How Threads Reach Mem
Threads can enter Nowledge Mem through several different paths. They are related, but they are not the same:
- Native plugins and extensions: tool-specific integrations like Claude Code, Gemini CLI, Droid, Cursor, OpenClaw, and Alma
- Local auto-sync: in-app discovery that watches supported coding-agent conversations on your machine
- Shared skills or prompt packs: reusable setups like npx skills or the Codex prompt pack
- Browser capture: the Exchange extension for supported web AI chats
- Manual import: files, exports, API calls, and CLI imports
The main thing to understand is this:
- Full session capture means Mem receives the actual recorded conversation from that tool
- Handoff summary means Mem stores a concise continuation note instead of the full session
Shared skills matter here too. They are useful across many agents, but they cannot honestly promise full session capture unless that host runtime exposes readable session files or a stable transcript API.
Most users only need one rule:
- if your tool already has a real thread-save path, use it
- if it only supports handoff summaries today, keep that mental model clear and use import or auto-sync for full history
Auto-Sync
In-App Discovery
Scan your machine for conversations from local coding assistants. No file export needed.
Local-only discovery
This path scans conversation files on the machine running Nowledge Mem. It is excellent for local sync, but it is different from nmem t save --from ..., which reads local session files client-side and can still upload normalized threads to a remote Mem server.
| Client | Sync Mode | Where |
|---|---|---|
| Claude Code | Auto-discovery + incremental sync | Threads → Import → Find AI Conversations |
| Cursor | Auto-discovery + incremental sync | Threads → Import → Find AI Conversations |
| Codex | Auto-discovery + incremental sync | Threads → Import → Find AI Conversations |
| OpenCode | Auto-discovery + incremental sync | Threads → Import → Find AI Conversations |
Native Capture And Save Paths
Different integrations expose different thread-save behavior. Some support full session capture. Some auto-capture on lifecycle events. Droid and Cursor currently keep plugin handoff summaries separate from full conversation import.
| Integration | What it saves | How it works | Setup |
|---|---|---|---|
| Claude Code | Full session capture | Stop hook auto-saves the session when you exit. Also supports explicit /save. nmem reads local session files client-side before upload. | Claude Code guide |
| Gemini CLI | Full session capture plus separate handoff summary | save-thread imports the recorded Gemini session through nmem t save --from gemini-cli. save-handoff stays a separate resumable summary. | Gemini CLI guide |
| Droid | Handoff summary in the plugin | The Droid plugin intentionally exposes save-handoff, not save-thread. It provides Working Memory, routed recall, and resumable checkpoints now, while leaving transcript-backed thread save for a future real importer. | Droid guide |
| Cursor | Handoff summary in the plugin | The Cursor plugin intentionally exposes save-handoff, not save-thread. Use in-app discovery for local Cursor conversation import until a real live session importer exists. | Cursor guide |
| Alma | Full session capture | Optional auto-capture saves the session on app quit. | Alma guide |
| OpenClaw | Full session capture | Captures every agent session automatically at completion, with optional LLM distillation. | OpenClaw guide |
| Codex CLI | Full session capture | Explicit /save imports the recorded Codex session. | Codex CLI guide |
| Generic npx skills agents | Handoff summary only | Use save-handoff. Shared skills can guide saving, but they do not control the host runtime well enough to promise transcript-backed import everywhere. | Integrations overview |
Full session capture vs handoff summary
If you need exact past conversation history, use a full-session capture or import path. Handoff summaries are for resumable continuity, not full conversation storage.
File Import
Bulk Import
Import all conversations from an export file at once.
| Source | File Format | How to Export |
|---|---|---|
| ChatGPT | chat.html | ChatGPT Settings → Data controls → Export data |
| DeepSeek | deepseek_conversations.json | chat.deepseek.com → Settings → Data → Export data |
| ChatWise | .zip (contains JSON files) | Export all chats from ChatWise app |
Single Thread
Import one conversation from a file.
| Format | File Type | Notes |
|---|---|---|
| Conversation Markdown | .md | ## User / ## Assistant / ## System headers, optional YAML frontmatter |
| Cursor | .md | Cursor's native export format (auto-detected) |
| Generic Markdown | .md | Any markdown file, imported as a document |
Documents vs. conversations
If your .md file is a regular document (no ## User / ## Assistant headers), it belongs in the Library, not Threads. Drag it into the Timeline or import from the Library view.
Conversation Markdown Format
The portable format for conversation import. Any tool that writes ## User / ## Assistant headers produces a file Nowledge Mem can read.
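As a rough illustration of how a tool might write this format, the sketch below renders role/content pairs as Conversation Markdown. The function name `render_conversation` and the `(role, content)` tuple shape are hypothetical, chosen for this example only; they are not part of Nowledge Mem.

```python
from typing import Iterable, Optional

ALLOWED_ROLES = {"user", "assistant", "system"}


def render_conversation(
    messages: Iterable[tuple[str, str]],
    frontmatter: Optional[dict[str, str]] = None,
) -> str:
    """Render (role, content) pairs as Conversation Markdown text."""
    parts: list[str] = []
    if frontmatter:
        # Optional YAML frontmatter. Only plain "key: value" pairs are handled;
        # values that need YAML quoting are out of scope for this sketch.
        lines = [f"{key}: {value}" for key, value in frontmatter.items()]
        parts.append("---\n" + "\n".join(lines) + "\n---")
    for role, content in messages:
        if role.lower() not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        # One level-2 heading per message, followed by the message body.
        parts.append(f"## {role.capitalize()}\n\n{content.strip()}")
    return "\n\n".join(parts) + "\n"


text = render_conversation(
    [("user", "What is Python's GIL?"), ("assistant", "A mutex in CPython.")],
    frontmatter={"title": "GIL basics", "source": "chatgpt"},
)
print(text)
```

The sketch does not escape message bodies, so a body that itself contains a line such as `## User` would read as a new message when the file is imported.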
Minimal Example
The simplest valid file has one user message, one assistant reply, and no frontmatter:
## User
What is Python's GIL?
## Assistant
The Global Interpreter Lock (GIL) is a mutex in CPython that allows only one thread to execute Python bytecode at a time. This means CPU-bound multi-threaded programs won't see speedups from threading; use multiprocessing for CPU-bound work and async I/O for I/O-bound work.
Full Example
With optional YAML frontmatter and a system message:
---
title: Python Async Patterns
source: chatgpt
date: 2025-06-15
---
## System
You are a senior Python developer who explains concepts clearly.
## User
How does async/await work in Python?
## Assistant
Python's `async`/`await` lets you write concurrent code that doesn't block while waiting for I/O. An `async def` function returns a coroutine, and `await` pauses it until the result is ready — meanwhile other coroutines can run.
## User
When should I use asyncio vs threading?
## Assistant
Use **asyncio** for I/O-bound work (HTTP requests, database queries, file reads); it's lighter and scales better than threads. Use **threading** when you need to call blocking libraries that don't support async. Use **multiprocessing** for CPU-bound work.
Format Rules
- Headers: ## User, ## Assistant, or ## System, level-2 heading, one per message
- Content: Everything between headers is one message. Markdown formatting, code blocks, and lists are preserved as-is
- Frontmatter: Optional YAML block at the top. Supported fields: title, source, date, all optional
- Detection: Files with at least one ## User or ## Assistant header are recognized as conversations automatically
- Fallback: Files without recognized headers are imported as a single document message
- Case: Role names are matched case-insensitively (## user and ## User both work)
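To make the rules above concrete, here is a minimal parsing sketch. It is not Nowledge Mem's actual importer: `parse_conversation` is a hypothetical name, the `"document"` role used for the fallback is an assumption, and the frontmatter handling covers only flat `key: value` pairs rather than full YAML.

```python
import re

# A level-2 heading that names a role, matched case-insensitively.
HEADER_RE = re.compile(r"^##\s+(user|assistant|system)\s*$", re.IGNORECASE)


def parse_conversation(text: str) -> dict:
    """Split Conversation Markdown into frontmatter fields and messages."""
    lines = text.splitlines()
    meta: dict[str, str] = {}

    # Optional frontmatter: a block delimited by '---' lines at the very top.
    if lines and lines[0].strip() == "---":
        end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
        if end is not None:
            for line in lines[1:end]:
                key, sep, value = line.partition(":")
                if sep:
                    meta[key.strip()] = value.strip()
            lines = lines[end + 1:]

    messages: list[dict[str, str]] = []
    role = None
    body: list[str] = []
    for line in lines:
        match = HEADER_RE.match(line)
        if match:
            if role is not None:
                messages.append({"role": role, "content": "\n".join(body).strip()})
            role, body = match.group(1).lower(), []
        else:
            body.append(line)
    if role is not None:
        messages.append({"role": role, "content": "\n".join(body).strip()})

    # Detection: at least one user or assistant header makes it a conversation.
    if not any(m["role"] in ("user", "assistant") for m in messages):
        # Fallback: keep the whole text as one message ("document" is an assumed role name).
        messages = [{"role": "document", "content": "\n".join(lines).strip()}]
    return {"meta": meta, "messages": messages}


sample = """---
title: Python Async Patterns
source: chatgpt
---
## System
You are a senior Python developer.
## user
How does async/await work?
## Assistant
It lets coroutines pause while waiting for I/O.
"""
parsed = parse_conversation(sample)
print(parsed["meta"]["title"], [m["role"] for m in parsed["messages"]])
```

The sketch treats every matching line as a header, including one that appears inside a fenced code block, and it drops any text that precedes the first header. A production parser would likely need to handle both cases.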
Import API
POST /threads/import accepts JSON messages or Conversation Markdown, with single and batch modes.
Single Thread (JSON messages)
curl -X POST http://127.0.0.1:14242/threads/import \
  -H "Content-Type: application/json" \
  -d '{
    "title": "My Conversation",
    "source": "chatgpt",
    "messages": [
      {"role": "user", "content": "Hello!"},
      {"role": "assistant", "content": "Hi there! How can I help?"}
    ]
  }'
Single Thread (Markdown)
curl -X POST http://127.0.0.1:14242/threads/import \
  -H "Content-Type: application/json" \
  -d '{
    "markdown_content": "## User\n\nHello!\n\n## Assistant\n\nHi there! How can I help?"
  }'
Batch Import
curl -X POST http://127.0.0.1:14242/threads/import \
  -H "Content-Type: application/json" \
  -d '{
    "threads": [
      {
        "title": "Thread 1",
        "messages": [
          {"role": "user", "content": "Hello"},
          {"role": "assistant", "content": "Hi"}
        ]
      },
      {
        "title": "Thread 2",
        "markdown_content": "## User\n\nGoodbye\n\n## Assistant\n\nSee you!"
      }
    ]
  }'
Thread IDs are auto-generated when omitted. Titles are inferred from markdown frontmatter when available.
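The same requests can be assembled from a script. The sketch below uses only the Python standard library and the field names shown in the curl examples (`title`, `source`, `messages`, `markdown_content`, `threads`). The helper names are hypothetical, and because the response format is not described here, `send` simply returns the raw response body.

```python
import json
import urllib.request
from typing import Optional

IMPORT_URL = "http://127.0.0.1:14242/threads/import"  # same address as the curl examples


def thread_from_messages(title: str, messages: list, source: Optional[str] = None) -> dict:
    """Build one thread entry from JSON-style messages."""
    thread = {"title": title, "messages": messages}
    if source is not None:
        thread["source"] = source
    return thread


def thread_from_markdown(markdown_content: str, title: Optional[str] = None) -> dict:
    """Build one thread entry from Conversation Markdown text."""
    thread = {"markdown_content": markdown_content}
    if title is not None:
        thread["title"] = title
    return thread


def build_request(payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the JSON payload."""
    return urllib.request.Request(
        IMPORT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(payload: dict) -> str:
    """Send the payload and return the raw response body; needs a running server."""
    with urllib.request.urlopen(build_request(payload)) as response:
        return response.read().decode("utf-8")


# Batch mode: a "threads" list mixing both input styles, as in the curl example.
batch = {
    "threads": [
        thread_from_messages(
            "Thread 1",
            [{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi"}],
        ),
        thread_from_markdown("## User\n\nGoodbye\n\n## Assistant\n\nSee you!", title="Thread 2"),
    ]
}
request = build_request(batch)
print(request.get_method(), request.full_url)
```

Building the payload with `json.dumps` avoids the manual `\n` escaping that the inline curl strings require for markdown content. Call `send(batch)` only when a Nowledge Mem server is listening on that address.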
CLI
The nmem CLI supports thread import from files, JSON, or stdin.
# Import a conversation markdown file
nmem t import --file conversation.md
# Import with explicit title and source
nmem t import --file chat.md --title "Python Async" --source chatgpt
# Import from JSON messages
nmem t import --messages '[{"role":"user","content":"Hello"},{"role":"assistant","content":"Hi"}]'
# Pipe markdown from stdin
cat conversation.md | nmem t import --stdin --title "Piped Conversation"
Run nmem t import --help for all options. See the CLI Reference for the full command list.
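If a script already holds the conversation text in memory, it can pipe that text to the CLI instead of writing a temporary file. The sketch below mirrors the piped example above and uses only the flags shown there (`--stdin` and `--title`). The helper names are hypothetical, and nothing is assumed about what the command prints.

```python
import shutil
import subprocess


def import_args(title: str) -> list[str]:
    """Argument list mirroring: nmem t import --stdin --title <title>."""
    return ["nmem", "t", "import", "--stdin", "--title", title]


def import_markdown(markdown: str, title: str) -> subprocess.CompletedProcess:
    """Pipe Conversation Markdown to the CLI on stdin; raises if the command fails."""
    return subprocess.run(
        import_args(title), input=markdown, text=True, capture_output=True, check=True
    )


conversation = "## User\n\nHello!\n\n## Assistant\n\nHi there! How can I help?\n"
if shutil.which("nmem"):
    # Runs only when the CLI is installed; the output format is not assumed here.
    result = import_markdown(conversation, "Piped Conversation")
    print(result.stdout)
else:
    print("nmem not found on PATH; would run:", import_args("Piped Conversation"))
```

Passing an argument list rather than a shell string keeps titles with spaces or quotes intact without extra escaping.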
Browser Extension
The Nowledge Mem Exchange browser extension captures conversations from supported web AI chat platforms directly in your browser. No file export needed. It supports auto-capture, manual distill, and full thread backup with incremental sync. See the Browser Extension guide for setup.
MCP Tools
| Tool | What it does |
|---|---|
| thread_persist | Save a coding session as a conversation thread |
| thread_search | Search threads by keywords or list recent threads |
| thread_fetch_messages | Fetch full messages from a specific thread |
Next Steps
- Memories: What happens after distillation: create, search, and organize knowledge
- Library: Import documents alongside your memories
- Browser Extension: Capture conversations from web AI platforms
- Integrations: Connect your AI tools through native integrations, reusable packages, or MCP
- API Reference: Full REST API documentation