Threads
Capture, browse, search, and distill conversations from any AI tool.
Threads are your conversation layer. They keep the original flow of what happened in an AI session: what you asked, what the tool answered, and how the work evolved.
Their real value is not just storage. Threads become useful when you can search past conversations, reopen exact context, and distill the durable parts into memories.
The First Useful Thread
If you are new, do one of these first:
- import one conversation you already care about
- let one supported tool save a real thread
- capture one web conversation through the browser extension
Then open that thread and distill one useful memory from it. That is the core workflow.
| I want to... | Jump to |
|---|---|
| Browse and search threads | Browsing Threads |
| Distill threads into memories | Thread Distillation |
| Auto-import from coding agents | Auto-Sync |
| Capture from web AI chats | Browser Extension |
| Import my ChatGPT or DeepSeek conversations | Bulk Import |
| Import a conversation file (.md with ## User/## Assistant headers) | Single Thread |
| Learn the .md conversation format | Conversation Markdown |
| Import via API | Import API |
| Import via CLI | CLI |
Browsing Threads
The Threads view shows all your imported conversations in one place.
- Search threads by content or title
- Filter by source (Claude Code, ChatGPT, Cursor, etc.)
- Pin important threads to keep them accessible
- View individual messages within any thread
Open Threads from the sidebar or press Cmd + 3 (macOS).
Thread Distillation
Distillation is the key workflow that connects threads to memories. Open any thread and trigger distillation. The system extracts individual memories from the conversation, each with its own title, labels, and importance score. The extracted memories enter your knowledge graph and become searchable alongside everything else.
This is how hours of AI conversation become connected knowledge you can find later.
Also from the browser
The browser extension supports Smart Distill directly from web conversations, so you can capture memories without importing the full thread first.
Auto-Sync
In-App Discovery
Scan your machine for conversations from local coding assistants. No file export needed.
Local-only discovery
This path scans conversation files on the machine running Nowledge Mem. It is excellent for local sync, but it is different from nmem t save --from ..., which reads local session files client-side and can still upload normalized threads to a remote Mem server.
| Client | Sync Mode | Where |
|---|---|---|
| Claude Code | Auto-discovery + incremental sync | Threads → Import → Find AI Conversations |
| Cursor | Auto-discovery + incremental sync | Threads → Import → Find AI Conversations |
| Codex | Auto-discovery + incremental sync | Threads → Import → Find AI Conversations |
| OpenCode | Auto-discovery + incremental sync | Threads → Import → Find AI Conversations |
Native Capture And Save Surfaces
Different integrations expose different thread surfaces. Some import the real session transcript. Some auto-capture on lifecycle events. Cursor currently keeps plugin handoff summaries separate from real thread import.
| Integration | What it saves | How it works | Setup |
|---|---|---|---|
| Claude Code | Real session thread | Stop hook auto-saves the session when you exit. Also supports explicit /save. nmem reads local session files client-side before upload. | Claude Code guide |
| Gemini CLI | Real session thread plus separate handoff summary | save-thread imports the real Gemini session through nmem t save --from gemini-cli. save-handoff stays a separate resumable summary. | Gemini CLI guide |
| Cursor | Handoff summary in the plugin | The Cursor plugin intentionally exposes save-handoff, not save-thread. Use in-app discovery for local Cursor conversation import until a real live session importer exists. | Cursor guide |
| Alma | Real session thread | Optional auto-capture saves the session on app quit. | Alma guide |
| OpenClaw | Real session thread | Captures every agent session automatically at completion, with optional LLM distillation. | OpenClaw guide |
| Codex CLI | Real session thread | Explicit /save imports the real Codex session. | Codex CLI guide |
Real thread save vs handoff summary
If you need exact past conversation history, use a real thread-save or import surface. Handoff summaries are for resumable continuity, not lossless transcript storage.
File Import
Bulk Import
Import all conversations from an export file at once.
| Source | File Format | How to Export |
|---|---|---|
| ChatGPT | chat.html | ChatGPT Settings → Data controls → Export data |
| DeepSeek | deepseek_conversations.json | chat.deepseek.com → Settings → Data → Export data |
| ChatWise | .zip (contains JSON files) | Export all chats from ChatWise app |
Single Thread
Import one conversation from a file.
| Format | File Type | Notes |
|---|---|---|
| Conversation Markdown | .md | ## User / ## Assistant / ## System headers, optional YAML frontmatter |
| Cursor | .md | Cursor's native export format (auto-detected) |
| Generic Markdown | .md | Any markdown file, imported as a document |
Documents vs. conversations
If your .md file is a regular document (no ## User / ## Assistant headers), it belongs in the Library, not Threads. Drag it into the Timeline or import from the Library view.
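A quick pre-check can tell you which view a file is likely to land in. The sketch below is only an illustration of the header rule described on this page, not the importer's own logic; the function names are hypothetical.

```python
import re

# Assumption: a conversation has at least one level-2 "## User" or
# "## Assistant" heading, matched case-insensitively.
_ROLE_HEADER = re.compile(
    r"^##[ \t]+(user|assistant)[ \t]*\r?$",
    re.IGNORECASE | re.MULTILINE,
)


def looks_like_conversation(markdown_text: str) -> bool:
    """Return True if the text contains a User or Assistant role header."""
    return _ROLE_HEADER.search(markdown_text) is not None


def suggested_destination(markdown_text: str) -> str:
    """Hypothetical helper: name the view a file most likely belongs in."""
    return "Threads" if looks_like_conversation(markdown_text) else "Library"
```

For example, `suggested_destination(open("notes.md").read())` would point a plain notes file at the Library.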
Conversation Markdown Format
The portable format for conversation import. Any tool that writes ## User / ## Assistant headers produces a file Nowledge Mem can read.
Minimal Example
The simplest valid file, two turns, no frontmatter:
## User
What is Python's GIL?
## Assistant
The Global Interpreter Lock (GIL) is a mutex in CPython that allows only one thread to execute Python bytecode at a time. This means CPU-bound multi-threaded programs won't see speedups from threading. Use multiprocessing for CPU-bound work, or async I/O when the bottleneck is waiting on I/O.
Full Example
With optional YAML frontmatter and a system message:
---
title: Python Async Patterns
source: chatgpt
date: 2025-06-15
---
## System
You are a senior Python developer who explains concepts clearly.
## User
How does async/await work in Python?
## Assistant
Python's `async`/`await` lets you write concurrent code that doesn't block while waiting for I/O. An `async def` function returns a coroutine, and `await` pauses it until the result is ready — meanwhile other coroutines can run.
## User
When should I use asyncio vs threading?
## Assistant
Use **asyncio** for I/O-bound work (HTTP requests, database queries, file reads); it's lighter and scales better than threads. Use **threading** when you need to call blocking libraries that don't support async. Use **multiprocessing** for CPU-bound work.
Format Rules
- Headers: ## User, ## Assistant, or ## System, level-2 heading, one per message
- Content: Everything between headers is one message. Markdown formatting, code blocks, and lists are preserved as-is
- Frontmatter: Optional YAML block at the top. Supported fields: title, source, date, all optional
- Detection: Files with at least one ## User or ## Assistant header are recognized as conversations automatically
- Fallback: Files without recognized headers are imported as a single document message
- Case: Role names are matched case-insensitively (## user and ## User both work)
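The rules above can be sketched as a small parser. This is one illustrative reading of the rules, not Nowledge Mem's actual importer. The function names, the simplified key: value frontmatter handling (no YAML library), and the "document" role used for the fallback are all assumptions. The sketch also does not special-case role headers that appear inside code blocks.

```python
import re
from typing import Dict, List, Tuple

# Level-2 role headers, matched case-insensitively.
_HEADER = re.compile(r"^##[ \t]+(user|assistant|system)[ \t]*$", re.IGNORECASE)


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split an optional leading '---' block into simple key: value pairs."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            meta: Dict[str, str] = {}
            for line in lines[1:end]:
                key, sep, value = line.partition(":")
                if sep:
                    meta[key.strip()] = value.strip()
            return meta, "\n".join(lines[end + 1:])
    return {}, text  # no closing '---', so treat everything as body


def parse_conversation(text: str) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Return (frontmatter, messages) for a Conversation Markdown string."""
    meta, body = split_frontmatter(text)
    messages: List[Dict[str, str]] = []
    role = None
    buffer: List[str] = []
    for line in body.splitlines():
        match = _HEADER.match(line)
        if match:
            if role is not None:
                messages.append({"role": role, "content": "\n".join(buffer).strip()})
            role = match.group(1).lower()
            buffer = []
        elif role is not None:
            buffer.append(line)
    if role is not None:
        messages.append({"role": role, "content": "\n".join(buffer).strip()})
    if not any(m["role"] in ("user", "assistant") for m in messages):
        # Fallback rule: no User/Assistant header, keep the body as one message.
        # The "document" role name is an assumption of this sketch.
        return meta, [{"role": "document", "content": body.strip()}]
    return meta, messages
```

Running `parse_conversation` on the full example above would yield the three frontmatter fields plus five messages, starting with the system message.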
Import API
POST /threads/import accepts JSON messages or Conversation Markdown, with single and batch modes.
Single Thread (JSON messages)
curl -X POST http://127.0.0.1:14242/threads/import \
  -H "Content-Type: application/json" \
  -d '{
    "title": "My Conversation",
    "source": "chatgpt",
    "messages": [
      {"role": "user", "content": "Hello!"},
      {"role": "assistant", "content": "Hi there! How can I help?"}
    ]
  }'
Single Thread (Markdown)
curl -X POST http://127.0.0.1:14242/threads/import \
  -H "Content-Type: application/json" \
  -d '{
    "markdown_content": "## User\n\nHello!\n\n## Assistant\n\nHi there! How can I help?"
  }'
Batch Import
curl -X POST http://127.0.0.1:14242/threads/import \
  -H "Content-Type: application/json" \
  -d '{
    "threads": [
      {
        "title": "Thread 1",
        "messages": [
          {"role": "user", "content": "Hello"},
          {"role": "assistant", "content": "Hi"}
        ]
      },
      {
        "title": "Thread 2",
        "markdown_content": "## User\n\nGoodbye\n\n## Assistant\n\nSee you!"
      }
    ]
  }'
Thread IDs are auto-generated when omitted. Titles are inferred from markdown frontmatter when available.
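If you are scripting imports rather than using curl, the same request bodies can be built in Python with the standard library. This is a minimal sketch under the assumptions visible in the curl examples: the local address, the JSON field names, and a JSON content type. The helper names are hypothetical, and the response is returned as raw text because its shape is not described here.

```python
import json
import urllib.request
from typing import Dict, List, Optional

# Address taken from the curl examples; adjust if your server differs.
IMPORT_URL = "http://127.0.0.1:14242/threads/import"


def thread_from_messages(title: str, messages: List[Dict[str, str]],
                         source: Optional[str] = None) -> dict:
    """Build a single-thread body from role/content messages."""
    body: dict = {"title": title, "messages": messages}
    if source:
        body["source"] = source
    return body


def thread_from_markdown(markdown_content: str, title: Optional[str] = None) -> dict:
    """Build a single-thread body from Conversation Markdown text."""
    body: dict = {"markdown_content": markdown_content}
    if title:
        body["title"] = title
    return body


def batch_payload(threads: List[dict]) -> dict:
    """Wrap several thread bodies for batch mode."""
    return {"threads": list(threads)}


def build_request(payload: dict, url: str = IMPORT_URL) -> urllib.request.Request:
    """Create the POST request without sending it."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )


def post_import(payload: dict, url: str = IMPORT_URL, timeout: float = 10.0) -> str:
    """Send the request and return the raw response text."""
    with urllib.request.urlopen(build_request(payload, url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A batch call would then look like `post_import(batch_payload([thread_a, thread_b]))`, with the server running locally.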
CLI
The nmem CLI supports thread import from files, JSON, or stdin.
# Import a conversation markdown file
nmem t import --file conversation.md
# Import with explicit title and source
nmem t import --file chat.md --title "Python Async" --source chatgpt
# Import from JSON messages
nmem t import --messages '[{"role":"user","content":"Hello"},{"role":"assistant","content":"Hi"}]'
# Pipe markdown from stdin
cat conversation.md | nmem t import --stdin --title "Piped Conversation"
Run nmem t import --help for all options. See the CLI Reference for the full command list.
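A tool that exports its own chats could write a Conversation Markdown file and hand it to the file import command shown above. The following is a rough sketch: `render_conversation`, `import_command`, and `write_and_import` are hypothetical helpers, and only the `--file` flag from the examples above is used. The command is executed only when `run=True`, which assumes nmem is installed and on your PATH.

```python
import subprocess
from pathlib import Path
from typing import Dict, List, Optional


def render_conversation(messages: List[Dict[str, str]],
                        title: Optional[str] = None,
                        source: Optional[str] = None) -> str:
    """Render role/content messages as Conversation Markdown text."""
    parts: List[str] = []
    if title or source:
        # Optional frontmatter with the documented title and source fields.
        parts.append("---")
        if title:
            parts.append(f"title: {title}")
        if source:
            parts.append(f"source: {source}")
        parts.append("---")
    for message in messages:
        parts.append(f"## {message['role'].capitalize()}")
        parts.append(message["content"].strip())
    return "\n".join(parts) + "\n"


def import_command(path: Path) -> List[str]:
    """Build the argument list for the file import shown above."""
    return ["nmem", "t", "import", "--file", str(path)]


def write_and_import(messages: List[Dict[str, str]], path: Path,
                     run: bool = False) -> List[str]:
    """Write the file, optionally run the import, and return the command."""
    path.write_text(render_conversation(messages), encoding="utf-8")
    command = import_command(path)
    if run:
        subprocess.run(command, check=True)
    return command
```

Building the argument list separately from running it makes the command easy to log or inspect before anything is imported.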
Browser Extension
The Nowledge Mem Exchange browser extension captures conversations from supported web AI chat platforms directly in your browser. No file export needed. It supports auto-capture, manual distill, and full thread backup with incremental sync. See the Browser Extension guide for setup.
MCP Tools
| Tool | What it does |
|---|---|
| thread_persist | Save a coding session as a conversation thread |
| thread_search | Search threads by keywords or list recent threads |
| thread_fetch_messages | Fetch full messages from a specific thread |
Next Steps
- Memories: What happens after distillation: create, search, and organize knowledge
- Library: Import documents alongside your memories
- Browser Extension: Capture conversations from web AI platforms
- Integrations: Connect your AI tools through native integrations, reusable packages, or MCP
- API Reference: Full REST API documentation