Memory and Ultra Mode
Understand how memory works in Astell chat and what Ultra mode is
Memory
Memory controls how much of your past conversation history Astell can carry into a new chat. The goal is continuity: Astell picks up the right context so you don't have to restate background every time. Memory works across threads, so when you start a new conversation, Astell can still use relevant context from previous ones.
How Memory works
When you send a message, Astell does not load your entire chat history into the model. It runs a context-selection step:
- Astell searches across your conversation history, including other threads.
- It ranks prior messages by relevance to your current prompt.
- It includes the most relevant prior messages in the model's context window for the new response.
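The three steps above can be sketched as a small ranking function. This is an illustration only, not Astell's implementation: the `Message` type, the `relevance` function, and the word-overlap scoring are all hypothetical stand-ins, since the article does not say how relevance is computed.

```python
from dataclasses import dataclass


@dataclass
class Message:
    thread_id: str  # hypothetical field: which conversation the message came from
    text: str


def relevance(prompt: str, message: Message) -> float:
    """Toy score (an assumption): fraction of prompt words found in the message."""
    prompt_words = set(prompt.lower().split())
    message_words = set(message.text.lower().split())
    if not prompt_words:
        return 0.0
    return len(prompt_words & message_words) / len(prompt_words)


def select_context(prompt: str, history: list[Message], limit: int) -> list[Message]:
    """Search all threads, rank by relevance, and keep at most `limit` messages."""
    scored = [(relevance(prompt, m), m) for m in history]
    relevant = [(score, m) for score, m in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in relevant[:limit]]


history = [
    Message("thread-a", "We agreed the onboarding plan ships in March"),
    Message("thread-b", "Lunch order for Friday"),
    Message("thread-c", "Onboarding plan needs a security review step"),
]
selected = select_context("continue the onboarding plan", history, limit=2)
print([m.thread_id for m in selected])
```

The point of the sketch is the shape of the step: the whole history is searched, but only the top-ranked messages reach the model's context.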
Standard vs. Expanded Memory
Both Standard and Expanded Memory search your conversation history across threads. The difference is how many prior messages can be included in the model's context for a single response:
- Standard: carries forward up to 10 prior messages
- Expanded: carries forward up to 50 prior messages, which suits complex workstreams
Memory availability by plan
Memory is set by your plan; you don't choose a setting manually.
- Sapling: Standard Memory
- Tree, Grove, and Forest (Enterprise): Expanded Memory
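As a quick reference, the plan-to-limit relationship can be written as a lookup. The dictionary and function names here are hypothetical; only the plan names and the 10 and 50 message limits come from this article.

```python
# Message limits per Memory tier, as stated in this article.
MEMORY_LIMITS = {"standard": 10, "expanded": 50}

# Memory tier per plan, as stated in this article.
PLAN_MEMORY = {
    "Sapling": "standard",
    "Tree": "expanded",
    "Grove": "expanded",
    "Forest": "expanded",
}


def memory_limit_for_plan(plan: str) -> int:
    """Return the maximum number of prior messages carried into one response."""
    return MEMORY_LIMITS[PLAN_MEMORY[plan]]


print(memory_limit_for_plan("Sapling"))
print(memory_limit_for_plan("Grove"))
```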
When Memory helps most
Memory helps most when the right answer depends on what you already discussed: long-running workstreams where decisions and constraints carry over; iterative planning where you continue from prior steps; "what did we decide last time?" questions; and ongoing drafting where intent and constraints matter across sessions.
How to get better results with Memory
- Reference the project or decision explicitly ("Continue the onboarding plan we discussed last week.")
- Ask for deltas ("What changed since our last decision?")
- Reassert constraints when needed ("Same scope, new timeline.")
What Memory does not do
Memory does not expand permissions or override access rules. It also doesn't replace ingestion. Memory is conversation-history continuity, not a mechanism for turning documents into searchable workspace knowledge.
Ultra mode
Ultra mode is a maximum-effort mode for the hardest queries: Astell spends more compute on a single request (more reasoning passes and more parallel work) to reason across your entire workspace context at once and produce one thorough final answer. It's available on Grove and above and works with the Advanced models. You toggle it per chat, with a visible indicator when it's active. Because it does more work, an Ultra-mode request consumes substantially more tokens than a normal request. (Exact behavior is still being finalized.)
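The availability rule can be expressed as a simple check. This sketch rests on two assumptions: the plan order is inferred from the order in which plans are listed in this article, and "works with the Advanced models" is treated as a requirement. The function name and the `"advanced"` tier label are hypothetical, and the real behavior may differ while the feature is being finalized.

```python
# Assumed plan order, lowest to highest (inferred from this article's plan list).
PLAN_ORDER = ["Sapling", "Tree", "Grove", "Forest"]


def ultra_available(plan: str, model_tier: str) -> bool:
    """True when the plan is Grove or above and an Advanced model is selected."""
    on_eligible_plan = PLAN_ORDER.index(plan) >= PLAN_ORDER.index("Grove")
    return on_eligible_plan and model_tier == "advanced"


print(ultra_available("Grove", "advanced"))
print(ultra_available("Tree", "advanced"))
```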
What Ultra mode changes
Ultra mode raises the effort and context budget for a request, so the model can incorporate more workspace context across multiple sources in one response. Common use cases: pulling together material across docs, threads, tickets, and pull requests; summarizing a long time range (for example, the last 90 days); and reconciling conflicting notes across sources into one recommendation.
When Ultra mode is worth it (and when it isn't)
Use Ultra mode when you need cross-system synthesis in one pass, the scope is broad (many artifacts, long timeline), you want a final structured output (plan, memo, risks), or a normal response feels like it's missing context. Avoid Ultra mode for quick lookups, short clarifications, basic drafting, and simple follow-ups.
How to prompt in Ultra mode
Because Ultra mode spends extra tokens for extra effort, treat it like a single high-quality request. The strongest prompts include an objective (what "done" looks like), scope (which systems and timeframe), and structure (for example, summary → evidence → risks → recommendation → next steps).
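One way to keep prompts consistent is to assemble them from the three parts named above. This is a minimal sketch: the function name, parameters, and output layout are hypothetical, and Astell does not require any particular format.

```python
def build_ultra_prompt(
    objective: str,
    systems: list[str],
    timeframe: str,
    sections: list[str],
) -> str:
    """Combine objective, scope, and structure into a single prompt string."""
    lines = [
        f"Objective: {objective}",
        f"Scope: {', '.join(systems)}; {timeframe}",
        f"Structure: {' -> '.join(sections)}",
    ]
    return "\n".join(lines)


prompt = build_ultra_prompt(
    objective="Recommend whether to ship the onboarding redesign this quarter",
    systems=["docs", "threads", "tickets", "pull requests"],
    timeframe="last 90 days",
    sections=["summary", "evidence", "risks", "recommendation", "next steps"],
)
print(prompt)
```

Stating all three parts up front makes it more likely that a single Ultra-mode request returns a usable final answer, so you avoid spending extra tokens on follow-ups.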