Threads are your conversation layer. They keep the original flow of what happened in an AI session: what you asked, what the tool answered, and how the work evolved.

Their real value is not just storage. Threads become useful when you can search past conversations, reopen exact context, and distill the durable parts into memories.

Use threads when you need the full history

If you need the original messages, save or import a thread. If you only need the durable takeaway, distill it into memories and work from there.

Import onboarding map

If you are deciding how to bring existing conversations in (bulk file vs coding-agent scan vs browser vs one markdown file), read Import existing conversations first. This page remains the format and feature reference for Threads.

The First Useful Thread

If you are new, do one of these first:

import one conversation you already care about
let one supported tool capture a full session
capture one web conversation through the browser extension

Then open that thread and distill one useful memory from it. That is the core workflow.

I want to...	Jump to
See every import path in one overview	Import existing conversations
Browse and search threads	Browsing Threads
Distill threads into memories	Thread Distillation
Auto-import from coding agents	Auto-Sync
Capture from web AI chats	Browser Extension
Import my ChatGPT, DeepSeek, or Raycast AI (exporter JSON) conversations	Bulk Import
Import a conversation file (`.md` with `## User`/`## Assistant` headers)	Single Thread
Learn the .md conversation format	Conversation Markdown
Import via API	Import API
Import via CLI	CLI

Browsing Threads

The Threads view shows all your imported conversations in one place.

Search threads by content or title
Filter by source (Claude Code, ChatGPT, Cursor, etc.)
Pin important threads to keep them accessible
View individual messages within any thread

Open Threads from the sidebar or press Cmd + 3 (macOS).

Thread Distillation

Distillation is the key workflow that connects threads to memories. Open any thread and trigger distillation. The system extracts individual memories from the conversation, each with its own title, labels, and importance score. The extracted memories enter your knowledge graph and become searchable alongside everything else.

This is how hours of AI conversation become connected knowledge you can find later.

For normal-sized threads, distillation runs right away. For very large threads, Mem now offers Smart Background Distillation instead: it starts a few seconds later, lets the Knowledge Agent read the thread progressively, and then saves a smaller set of durable memories. This is slower than a short-thread distill, but much safer than forcing one foreground request to read everything at once.

Also from the browser

The browser extension supports Smart Distill directly from web conversations, so you can capture memories without importing the full thread first.

How Threads Reach Mem

Threads can enter Nowledge Mem through several different paths. They are related, but they are not the same:

Dedicated plugins and extensions: tool-specific integrations like Claude Code, Codex, Gemini CLI, OpenClaw, Hermes Agent, Alma, Cursor, Droid, OpenCode, and Copilot CLI
Local auto-sync: in-app discovery that watches supported coding-agent conversations on your machine
Shared skills or prompt packs: reusable setups like npx skills
Browser capture: the Exchange extension works with the focused web session you use in the panel (not your full chat account); see Import existing conversations
Manual import: files, exports, API calls, and CLI imports

The main thing to understand is this:

Full session capture means Mem receives the actual recorded conversation from that tool
Handoff summary means Mem stores a concise continuation note instead of the full session

Shared skills matter here too. They are useful across many agents, but they cannot honestly promise full session capture unless that host runtime exposes readable session files or a stable transcript API.

Most users only need one rule:

if your tool already has a real thread-save path, use it
if it only supports handoff summaries today, keep that mental model clear and use import or auto-sync for full history

Auto-Sync

In-App Discovery

Scan your machine for conversations from local coding assistants. No file export needed.

Local-only discovery

This path scans conversation files on the machine running Nowledge Mem. It is excellent for local sync, but it is different from nmem t sync --from ..., which reads local session files client-side and can still upload normalized threads to a remote Mem server.

What happens after the first import

Importing saves that conversation as a thread right away. If Auto-Sync is on, Mem can then append new messages to that thread, and it can remember the detected project for later sessions when that app exposes a stable project path.

Client	Sync Mode	Where
Claude Code	Auto-discovery + incremental sync	Threads → Import → Find AI Conversations
Cursor	Auto-discovery + incremental sync	Threads → Import → Find AI Conversations
Codex	Auto-discovery + incremental sync	Threads → Import → Find AI Conversations
OpenCode	Auto-discovery + incremental sync	Threads → Import → Find AI Conversations

Native Capture And Save Paths

Different integrations expose different thread-save behavior. Some support full session capture. Some auto-capture on lifecycle events. Droid and Cursor currently keep plugin handoff summaries separate from full conversation import.

Integration	What it saves	How it works	Setup
Claude Code	Full session capture	Stop hook auto-saves the session after each response. Also supports explicit `/save`. `nmem` reads local session files client-side before upload.	Claude Code guide
Gemini CLI	Full session capture plus separate handoff summary	`save-thread` imports the recorded Gemini session through `nmem t save --from gemini-cli`. The extension also imports before compression and at session end. `save-handoff` stays a separate resumable summary.	Gemini CLI guide
Droid	Handoff summary in the plugin	The Droid plugin intentionally exposes `save-handoff`, not `save-thread`. It provides Working Memory, routed recall, and resumable checkpoints now, while leaving transcript-backed thread save for a future real importer.	Droid guide
Cursor	Handoff summary in the plugin	The Cursor plugin intentionally exposes `save-handoff`, not `save-thread`. Use in-app discovery for local Cursor conversation import until a real live session importer exists.	Cursor guide
Alma	Full session capture	Live sync saves conversations after 2 min idle, on thread switch, and on quit (on by default).	Alma guide
OpenClaw	Full session capture	Captures every agent session automatically at completion, with optional LLM distillation.	OpenClaw guide
Hermes Agent	Completed-turn capture with final boundary flush	The native memory provider writes cleaned `user` / `assistant` turns as replies complete, then flushes remaining changes on clean exit, `/new`, and `/reset`.	Hermes guide
Codex	Full session capture	Stop hook captures the recorded Codex session after each turn. Explicit save remains available as a fallback.	Codex guide
Copilot CLI	Full session capture	Capture hooks append newly recorded Copilot conversation content into Mem after each response, before compaction, and at session end. Explicit save/checkpoint requests still create a concise summary thread when needed.	Copilot CLI guide
Generic `npx skills` agents	Handoff summary only	Use `save-handoff`. Shared skills can guide saving, but they do not control the host runtime well enough to promise transcript-backed import everywhere.	Connectors overview

Full session capture vs handoff summary

If you need exact past conversation history, use a full-session capture or import path. Handoff summaries are for resumable continuity, not full conversation storage.

File Import

Bulk Import

Import all conversations from an export file at once.

Source	File Format	How to Export
ChatGPT	`chat.html`	ChatGPT Settings → Data controls → Export data
Claude	`data-…-batch-….zip` (`conversations.json`, `memories.json`)	claude.ai or Claude Desktop: avatar → Settings → Privacy → Export data (Anthropic guide). Not available from Claude mobile apps.
DeepSeek	`deepseek_conversations.json`	chat.deepseek.com → Settings → Data → Export data
ChatWise	`.zip` (contains JSON files)	Export all chats from ChatWise app
Alma	`alma-backup-YYYY-MM-DD.zip` (contains `threads.json`)	Alma Settings → Data → Export all threads
Raycast AI	`.json` (e.g. `raycast_ai_chats.json`)	No vendor export, so use raycast-ai-exporter on macOS (see that README)

Search first, extract when it matters

Bulk Import saves the original conversations as Threads. They are searchable immediately, so you do not need to distill hundreds of imported chats before Mem can find them. Distillation is the second step: use it for conversations that contain decisions, procedures, preferences, or lessons worth keeping as long-term Memories. For a large archive, plan extraction in small batches from the Timeline or by selecting specific threads.

Single Thread

Import one conversation from a file.

Format	File Type	Notes
Conversation Markdown	`.md`	`## User` / `## Assistant` / `## System` headers, optional YAML frontmatter
Cursor	`.md`	Cursor's native export format (auto-detected)
Generic Markdown	`.md`	Any markdown file, imported as a document

Documents vs. conversations

If your .md file is a regular document (no ## User / ## Assistant headers), it belongs in the Library, not Threads. Drag it into the Timeline or import from the Library view.

Conversation Markdown Format

Conversation Markdown is the portable format for conversation import. Any tool that writes ## User / ## Assistant headers produces a file Nowledge Mem can read.
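For a tool that already holds a conversation as role/content pairs, writing this format takes only a few lines. The sketch below is one possible serializer, not part of Nowledge Mem: `to_conversation_markdown` and `ROLE_HEADERS` are hypothetical names, and the frontmatter values are written unquoted on the assumption that they contain no YAML-special characters.

```python
from typing import Iterable, Mapping, Optional

# Role name -> level-2 header used by Conversation Markdown.
ROLE_HEADERS = {
    "user": "## User",
    "assistant": "## Assistant",
    "system": "## System",
}


def to_conversation_markdown(
    messages: Iterable[Mapping[str, str]],
    title: Optional[str] = None,
    source: Optional[str] = None,
    date: Optional[str] = None,
) -> str:
    """Render {"role", "content"} messages as Conversation Markdown text."""
    parts = []

    # Frontmatter is optional, so emit it only when at least one field is set.
    # Assumption: values are plain strings; a title containing ':' or '#'
    # would need YAML quoting, which this sketch does not do.
    fields = [("title", title), ("source", source), ("date", date)]
    present = [(key, value) for key, value in fields if value]
    if present:
        lines = ["---"] + [f"{key}: {value}" for key, value in present] + ["---"]
        parts.append("\n".join(lines))

    for message in messages:
        role = message["role"].lower()
        if role not in ROLE_HEADERS:
            raise ValueError(f"unsupported role: {message['role']!r}")
        parts.append(ROLE_HEADERS[role])
        parts.append(message["content"].strip())

    # Blank lines separate the frontmatter, headers, and message bodies.
    return "\n\n".join(parts) + "\n"
```

Writing the returned string to a `.md` file gives you something you can import as a single thread.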

Minimal Example

The simplest valid file, two turns, no frontmatter:

## User

What is Python's GIL?

## Assistant

The Global Interpreter Lock (GIL) is a mutex in CPython that allows only one thread to execute Python bytecode at a time. This means CPU-bound multi-threaded programs won't see speedups from threading — use multiprocessing or async I/O instead.

Full Example

With optional YAML frontmatter and a system message:

---
title: Python Async Patterns
source: chatgpt
date: 2025-06-15
---

## System

You are a senior Python developer who explains concepts clearly.

## User

How does async/await work in Python?

## Assistant

Python's `async`/`await` lets you write concurrent code that doesn't block while waiting for I/O. An `async def` function returns a coroutine, and `await` pauses it until the result is ready — meanwhile other coroutines can run.

## User

When should I use asyncio vs threading?

## Assistant

Use **asyncio** for I/O-bound work (HTTP requests, database queries, file reads) — it's lighter and scales better than threads. Use **threading** when you need to call blocking libraries that don't support async. Use **multiprocessing** for CPU-bound work.

Format Rules

Headers: ## User, ## Assistant, or ## System, written as a level-2 heading, one per message
Content: Everything between headers is one message. Markdown formatting, code blocks, and lists are preserved as-is
Frontmatter: Optional YAML block at the top. Supported fields: title, source, and date (all optional)
Detection: Files with at least one ## User or ## Assistant header are recognized as conversations automatically
Fallback: Files without recognized headers are imported as a single document message
Case: Role names are matched case-insensitively (## user and ## User both work)
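If you want to check a file against these rules before importing it, the sketch below applies them in plain Python. It illustrates the rules as stated here and is not Nowledge Mem's actual parser: the function names are hypothetical, the frontmatter reader only handles simple `key: value` lines, the `"document"` role used for the fallback is an invented placeholder, and text before the first role header is dropped. A role header that appears inside a fenced code block would also be treated as a message boundary.

```python
import re
from typing import Dict, List, Tuple

# A level-2 heading whose text is exactly a role name, in any letter case.
ROLE_HEADER = re.compile(r"^##\s+(user|assistant|system)\s*$", re.IGNORECASE)


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split an optional leading '---' block into key/value pairs and body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            meta = {}
            for line in lines[1:end]:
                key, sep, value = line.partition(":")
                if sep:
                    meta[key.strip()] = value.strip()
            return meta, "\n".join(lines[end + 1:])
    return {}, text  # no closing '---': treat the whole file as body


def parse_conversation(text: str) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Return (frontmatter, messages) following the format rules above."""
    meta, body = split_frontmatter(text)

    # Each role header starts a new message; following lines belong to it.
    collected: List[Tuple[str, List[str]]] = []
    for line in body.splitlines():
        match = ROLE_HEADER.match(line)
        if match:
            collected.append((match.group(1).lower(), []))
        elif collected:
            collected[-1][1].append(line)

    # Detection: at least one User or Assistant header marks a conversation.
    if any(role in ("user", "assistant") for role, _ in collected):
        messages = [
            {"role": role, "content": "\n".join(lines).strip()}
            for role, lines in collected
        ]
        return meta, messages

    # Fallback: a single document message ("document" is a placeholder role).
    return meta, [{"role": "document", "content": body.strip()}]
```

Calling `parse_conversation` on the Full Example above would yield the three frontmatter fields and five messages, starting with the system message.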

Conversation Markdown

A complete example with all supported features

Import API

POST /threads/import accepts JSON messages or Conversation Markdown, with single and batch modes.

Single Thread (JSON messages)

curl -X POST http://127.0.0.1:14242/threads/import \
  -H "Content-Type: application/json" \
  -d '{
    "title": "My Conversation",
    "source": "chatgpt",
    "messages": [
      {"role": "user", "content": "Hello!"},
      {"role": "assistant", "content": "Hi there! How can I help?"}
    ]
  }'

Single Thread (Markdown)

curl -X POST http://127.0.0.1:14242/threads/import \
  -H "Content-Type: application/json" \
  -d '{
    "markdown_content": "## User\n\nHello!\n\n## Assistant\n\nHi there! How can I help?"
  }'

Batch Import

curl -X POST http://127.0.0.1:14242/threads/import \
  -H "Content-Type: application/json" \
  -d '{
    "threads": [
      {
        "title": "Thread 1",
        "messages": [
          {"role": "user", "content": "Hello"},
          {"role": "assistant", "content": "Hi"}
        ]
      },
      {
        "title": "Thread 2",
        "markdown_content": "## User\n\nGoodbye\n\n## Assistant\n\nSee you!"
      }
    ]
  }'

Thread IDs are auto-generated when omitted. Titles are inferred from markdown frontmatter when available.
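The same requests can be built from a script. The sketch below uses only the Python standard library and only the request fields shown in the curl examples (`title`, `source`, `messages`, `markdown_content`, `threads`). The helper names are hypothetical, and because the response schema is not described on this page, `post_import` returns the raw status code and body text rather than assuming a structure.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional, Tuple

# Local endpoint as shown in the curl examples; adjust for your own setup.
IMPORT_URL = "http://127.0.0.1:14242/threads/import"


def thread_from_messages(
    title: str,
    messages: List[Dict[str, str]],
    source: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a single-thread payload from JSON messages."""
    thread: Dict[str, Any] = {"title": title}
    if source:
        thread["source"] = source
    thread["messages"] = list(messages)
    return thread


def thread_from_markdown(
    markdown_content: str,
    title: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a single-thread payload from Conversation Markdown text."""
    thread: Dict[str, Any] = {}
    if title:
        thread["title"] = title
    thread["markdown_content"] = markdown_content
    return thread


def batch_payload(threads: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap several thread payloads for a batch import."""
    return {"threads": list(threads)}


def post_import(payload: Dict[str, Any], url: str = IMPORT_URL) -> Tuple[int, str]:
    """POST a payload and return (HTTP status, response body as text)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status, response.read().decode("utf-8")
```

With Mem running locally, `post_import(batch_payload([thread_a, thread_b]))` sends the same kind of request as the Batch Import curl example. `urlopen` raises `urllib.error.HTTPError` on a non-2xx response, so wrap the call if you want to handle rejected imports.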

Import API Reference

Full request/response schema and field descriptions

CLI

The nmem CLI supports thread import from files, JSON, or stdin.

# Import a conversation markdown file
nmem t import --file conversation.md

# Import with explicit title and source
nmem t import --file chat.md --title "Python Async" --source chatgpt

# Import from JSON messages
nmem t import --messages '[{"role":"user","content":"Hello"},{"role":"assistant","content":"Hi"}]'

# Bulk JSON from Raycast AI (raycast-ai-exporter output)
nmem t import --file ~/Desktop/raycast_ai_chats.json

# Pipe markdown from stdin
cat conversation.md | nmem t import --stdin --title "Piped Conversation"

Run nmem t import --help for all options. See the CLI Reference for the full command list.
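To import a whole folder of conversation files, you can wrap the CLI in a short script. The sketch below uses only the flags shown above (`--file`, `--title`, `--source`). The helper names are hypothetical, it assumes `nmem` is on your PATH, and it defaults to a dry run that prints the commands instead of executing them.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List, Optional


def import_command(
    path,
    title: Optional[str] = None,
    source: Optional[str] = None,
) -> List[str]:
    """Build the argv for one `nmem t import --file ...` call."""
    command = ["nmem", "t", "import", "--file", str(path)]
    if title:
        command += ["--title", title]
    if source:
        command += ["--source", source]
    return command


def import_folder(
    folder,
    source: Optional[str] = None,
    dry_run: bool = True,
) -> List[List[str]]:
    """Import every .md file in a folder, in name order."""
    commands = []
    for path in sorted(Path(folder).glob("*.md")):
        command = import_command(path, source=source)
        commands.append(command)
        if dry_run:
            # Print a copy-pasteable shell line without running anything.
            print(shlex.join(command))
        else:
            # check=True stops at the first failed import.
            subprocess.run(command, check=True)
    return commands
```

Run `import_folder("exports", source="chatgpt")` first to review the commands, then pass `dry_run=False` to execute them. Keep regular documents out of the folder, since files without role headers belong in the Library rather than Threads.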

Browser Extension

The Nowledge Mem Exchange extension runs on supported AI chat pages and works with the session you are actually using (for example, the tab you open in the side panel). It is not a bulk downloader for your entire web chat history; for that, use each vendor’s export plus Bulk Import in Mem, or see Import existing conversations. Auto-capture, manual distill, and backing up the current thread are covered in the Browser Extension guide.

MCP Tools

Tool	What it does
`thread_search`	Search threads by keywords or list recent threads
`thread_fetch_messages`	Fetch full messages from a specific thread
`search_thread_messages`	Search within a thread for messages matching keywords

MCP thread tools search and read threads that already exist in Mem. To capture local coding-agent transcripts, use the native connector for that tool. To backfill older sessions, run nmem t sync --from <source> on the client machine running the agent, where <source> is claude-code, codex, gemini-cli, opencode, or pi. That keeps transcript discovery local and uploads the normalized thread to your Mem server.

Next Steps

Memories: What happens after distillation: create, search, and organize knowledge
Library: Import documents alongside your memories
Browser Extension: Capture conversations from web AI platforms
Connectors: Connect your AI tools through native connectors, reusable packages, or MCP
API Reference: Full REST API documentation

