NOFXi Agent Memory And Planning Design
Purpose
This document explains how the current NOFXi agent handles:
- short-term conversation memory
- durable task memory
- durable execution / planning state
- planner execution and replanning
- state reset and resume behavior
The implementation described here is primarily in:
- agent/history.go
- agent/memory.go
- agent/execution_state.go
- agent/planner_runtime.go
- agent/agent.go
High-Level Model
The current agent uses three different layers of state:
- chatHistory: recent in-memory user/assistant turns for the live conversation.
- TaskState: durable summarized context that should survive beyond recent turns.
- ExecutionState: durable workflow state for the currently running or recently blocked plan.
These three layers serve different purposes and should not be treated as the same thing.
State Layers
1. chatHistory
Defined in agent/history.go.
Role:
- stores recent user/assistant messages in memory
- keyed by userID
- used as short-term conversational context
- acts as the source material for later compression into TaskState
Characteristics:
- in-memory only
- capped by maxTurns
- cleared by /clear
- not suitable as durable truth
Typical contents:
- the last few user questions
- the last few assistant replies
- temporary conversational wording
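A minimal sketch of a store with these characteristics follows. It is not the code in agent/history.go: the type names, the int64 user ID, and the choice to count maxTurns in individual messages are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// message is one chat turn; the field names are assumptions for this sketch.
type message struct {
	Role    string // "user" or "assistant"
	Content string
}

// historyStore keeps recent turns per user in memory only.
type historyStore struct {
	mu       sync.Mutex
	maxTurns int                 // cap, counted in messages here (assumption)
	turns    map[int64][]message // keyed by userID (int64 is an assumption)
}

func newHistoryStore(maxTurns int) *historyStore {
	return &historyStore{maxTurns: maxTurns, turns: make(map[int64][]message)}
}

// Add appends a turn and drops the oldest turns beyond maxTurns.
func (h *historyStore) Add(userID int64, role, content string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	msgs := append(h.turns[userID], message{Role: role, Content: content})
	if len(msgs) > h.maxTurns {
		msgs = msgs[len(msgs)-h.maxTurns:]
	}
	h.turns[userID] = msgs
}

// Get returns a copy so callers cannot mutate the stored slice.
func (h *historyStore) Get(userID int64) []message {
	h.mu.Lock()
	defer h.mu.Unlock()
	return append([]message(nil), h.turns[userID]...)
}

// Clear drops all short-term history for one user, as /clear would.
func (h *historyStore) Clear(userID int64) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.turns, userID)
}

func main() {
	h := newHistoryStore(2)
	h.Add(1, "user", "first question")
	h.Add(1, "assistant", "first answer")
	h.Add(1, "user", "second question")
	fmt.Println(len(h.Get(1)), "messages kept")
}
```

Because nothing here is persisted, a process restart loses the history, which is why it is not suitable as durable truth.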
2. TaskState
Defined in agent/memory.go.
Role:
- stores durable, structured, non-derivable context
- persisted through system_config
- injected into planning and reasoning prompts
Storage key:
agent_task_state_<userID>
Fields:
- CurrentGoal
- ActiveFlow
- OpenLoops
- ImportantFacts
- LastDecision
- UpdatedAt
Intended contents:
- user goal that still matters across turns
- high-level unresolved issues that still matter across turns
- facts that tools cannot cheaply re-fetch
- latest important decision summary
Explicitly not intended for:
- step-level pending items such as "wait for API key"
- execution actions such as "call get_exchange_configs"
- live balances
- current positions
- current market prices
- mutable configuration availability
Those should be checked from tools at planning time instead of being trusted from old summaries.
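The shape of TaskState and its storage key can be sketched as below. Only the field names and the key format come from this document; the Go types, JSON tags, the int64 user ID, and the map-backed configStore standing in for system_config are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// TaskState uses the documented field names; types and JSON tags are assumptions.
type TaskState struct {
	CurrentGoal    string    `json:"current_goal"`
	ActiveFlow     string    `json:"active_flow"`
	OpenLoops      []string  `json:"open_loops"`
	ImportantFacts []string  `json:"important_facts"`
	LastDecision   string    `json:"last_decision"`
	UpdatedAt      time.Time `json:"updated_at"`
}

// taskStateKey builds the documented key agent_task_state_<userID>.
func taskStateKey(userID int64) string {
	return fmt.Sprintf("agent_task_state_%d", userID)
}

// configStore is a hypothetical in-memory stand-in for system_config.
type configStore map[string]string

// saveTaskState serializes the state as JSON under the per-user key.
func saveTaskState(store configStore, userID int64, ts TaskState) error {
	raw, err := json.Marshal(ts)
	if err != nil {
		return err
	}
	store[taskStateKey(userID)] = string(raw)
	return nil
}

// loadTaskState returns a zero TaskState when nothing has been stored yet.
func loadTaskState(store configStore, userID int64) (TaskState, error) {
	var ts TaskState
	raw, ok := store[taskStateKey(userID)]
	if !ok {
		return ts, nil
	}
	err := json.Unmarshal([]byte(raw), &ts)
	return ts, err
}

func main() {
	store := configStore{}
	ts := TaskState{
		CurrentGoal: "set up a trader",
		OpenLoops:   []string{"choose an exchange"},
		UpdatedAt:   time.Now(),
	}
	if err := saveTaskState(store, 42, ts); err != nil {
		fmt.Println("save failed:", err)
		return
	}
	loaded, _ := loadTaskState(store, 42)
	fmt.Println(taskStateKey(42), loaded.CurrentGoal)
}
```

Note that the struct has no field for balances, positions, or prices, which matches the rule that live operational state is fetched from tools rather than remembered.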
3. ExecutionState
Defined in agent/execution_state.go.
Role:
- stores the current execution workflow
- allows the agent to resume after ask_user
- persists plan steps, observations, and completion status
Storage key:
agent_execution_state_<userID>
Fields:
- SessionID
- UserID
- Goal
- Status
- PlanID
- Steps
- CurrentStepID
- Observations
- FinalAnswer
- LastError
- UpdatedAt
This is the planner's working state, not a general memory store.
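For orientation, the fields above could be laid out roughly as follows. The field names and key format are taken from this document; every Go type, the PlanStep and Observation shapes, and the CurrentStep helper are assumptions rather than the definitions in agent/execution_state.go.

```go
package main

import (
	"fmt"
	"time"
)

// PlanStep and Observation are hypothetical shapes for this sketch.
type PlanStep struct {
	ID   string
	Type string // tool, reason, ask_user, or respond
}

type Observation struct {
	StepID  string
	Kind    string
	Content string
}

// ExecutionState uses the documented field names; all types are assumptions.
type ExecutionState struct {
	SessionID     string
	UserID        int64
	Goal          string
	Status        string // planning, running, waiting_user, completed, failed
	PlanID        string
	Steps         []PlanStep
	CurrentStepID string
	Observations  []Observation
	FinalAnswer   string
	LastError     string
	UpdatedAt     time.Time
}

// executionStateKey builds the documented key agent_execution_state_<userID>.
func executionStateKey(userID int64) string {
	return fmt.Sprintf("agent_execution_state_%d", userID)
}

// CurrentStep looks up the step that CurrentStepID points at.
func (s *ExecutionState) CurrentStep() (PlanStep, bool) {
	for _, step := range s.Steps {
		if step.ID == s.CurrentStepID {
			return step, true
		}
	}
	return PlanStep{}, false
}

func main() {
	st := ExecutionState{
		UserID:        42,
		Goal:          "create trader",
		Status:        "waiting_user",
		Steps:         []PlanStep{{ID: "s1", Type: "tool"}, {ID: "s2", Type: "ask_user"}},
		CurrentStepID: "s2",
		UpdatedAt:     time.Now(),
	}
	if step, ok := st.CurrentStep(); ok {
		fmt.Println(executionStateKey(st.UserID), "blocked on", step.Type)
	}
}
```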
Data Flow
Request Entry
Entry points:
- HandleMessage(...)
- HandleMessageStream(...)
Flow:
- user message enters agent
- slash commands and explicit direct branches are handled first
- all other requests go into planner flow via thinkAndAct(...) / thinkAndActStream(...)
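The ordering in this flow can be illustrated with a small dispatcher. This is a sketch only: the handler map, the planner stub standing in for thinkAndAct(...), and the reply for unknown commands are hypothetical, and explicit direct branches are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// commandHandlers maps slash commands to handlers; only /clear is named in this document.
var commandHandlers = map[string]func(userID int64) string{
	"/clear": func(userID int64) string { return "state cleared" },
}

// planner is a hypothetical stand-in for thinkAndAct(...).
func planner(userID int64, text string) string {
	return "planned: " + text
}

// handle runs slash commands first and sends everything else to the planner.
func handle(userID int64, text string) string {
	trimmed := strings.TrimSpace(text)
	if strings.HasPrefix(trimmed, "/") {
		name := strings.Fields(trimmed)[0]
		if h, ok := commandHandlers[name]; ok {
			return h(userID)
		}
		// How unknown commands are treated is an assumption of this sketch.
		return "unknown command: " + name
	}
	return planner(userID, trimmed)
}

func main() {
	fmt.Println(handle(1, "/clear"))
	fmt.Println(handle(1, "create a trader on my exchange"))
}
```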
Planner Flow
The planner pipeline in agent/planner_runtime.go is:
- append user message into chatHistory
- emit planning SSE event
- load ExecutionState
- optionally reset stale ExecutionState
- optionally refresh dynamic configuration snapshots
- create a fresh execution plan with the LLM
- execute steps one by one
- persist ExecutionState after important transitions
- append assistant answer into chatHistory
- maybe compress old conversation into TaskState
Short-Term vs Durable Memory
What lives in chatHistory
Good fits:
- raw recent messages
- conversational wording
- latest assistant phrasing
Bad fits:
- long-lived truths
- current external system state
What lives in TaskState
Good fits:
- durable goal
- high-level unfinished work that remains relevant across turns
- important facts the user stated
- previous decisions and why they were made
Bad fits:
- pending steps inside the current plan
- execution-level reminders such as "wait for a field" or "call a tool"
- old conclusions about whether tools exist
- old conclusions about whether model/exchange config is present
- live operational state that can change outside the chat
What lives in ExecutionState
Good fits:
- current plan steps
- observations from tool calls
- blocked-on-user-input status
- exact current workflow state
- step-level pending work and block reasons
Bad fits:
- evergreen user profile
- long-term semantic memory
Planning Logic
Plan Creation
createExecutionPlan(...) sends the following into the planner model:
- available tool definitions
- persistent preferences
- TaskState context
- ExecutionState JSON
- current user request
The planner must return JSON only, using these step types:
- tool
- reason
- ask_user
- respond
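Since the planner output is model-generated, it is worth validating before execution. The sketch below assumes a JSON shape of {"steps":[{"id":...,"type":...}]}; that shape and the names parsePlan, planStep, and executionPlan are illustrative, and only the four step types come from this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// planStep and executionPlan model an assumed planner JSON shape.
type planStep struct {
	ID   string `json:"id"`
	Type string `json:"type"`
}

type executionPlan struct {
	Steps []planStep `json:"steps"`
}

// allowedStepTypes lists the four documented step types.
var allowedStepTypes = map[string]bool{
	"tool": true, "reason": true, "ask_user": true, "respond": true,
}

// parsePlan rejects non-JSON output, empty plans, and unknown step types.
func parsePlan(raw string) (executionPlan, error) {
	var plan executionPlan
	if err := json.Unmarshal([]byte(raw), &plan); err != nil {
		return plan, fmt.Errorf("planner output is not valid JSON: %w", err)
	}
	if len(plan.Steps) == 0 {
		return plan, fmt.Errorf("plan has no steps")
	}
	for i, s := range plan.Steps {
		if !allowedStepTypes[s.Type] {
			return plan, fmt.Errorf("step %d has unknown type %q", i, s.Type)
		}
	}
	return plan, nil
}

func main() {
	raw := `{"steps":[{"id":"s1","type":"tool"},{"id":"s2","type":"respond"}]}`
	plan, err := parsePlan(raw)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("accepted plan with", len(plan.Steps), "steps")
}
```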
Step Execution
executePlan(...) executes the plan loop:
- tool: call tool and append observation
- reason: run reasoning sub-call and append observation
- ask_user: save waiting_user state and return question
- respond: generate final answer and mark completed
After each completed step, replanAfterStep(...) may:
- continue
- replace remaining steps
- ask user
- finish
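The loop and the replanning hook can be sketched as follows. This is a simplified illustration, not the code in agent/planner_runtime.go: the step and runResult types, the callback signatures, and the treatment of a plan that ends without a respond step are assumptions. The replan callback models "continue" by returning the remaining steps unchanged and "replace", "ask user", or "finish" by returning a different remainder.

```go
package main

import "fmt"

// step is a simplified plan step; Arg carries the tool name, prompt, or text.
type step struct {
	Type string
	Arg  string
}

// runResult is a reduced view of what would be persisted in ExecutionState.
type runResult struct {
	Status       string
	Observations []string
	Output       string
}

// replanFunc may return the remaining steps unchanged or a replacement list.
type replanFunc func(remaining []step, observations []string) []step

// runPlan executes steps in order and consults replan after tool and reason steps.
func runPlan(steps []step, callTool, reason func(string) string, replan replanFunc) runResult {
	res := runResult{Status: "running"}
	queue := steps
	for len(queue) > 0 {
		s := queue[0]
		queue = queue[1:]
		switch s.Type {
		case "tool":
			res.Observations = append(res.Observations, "tool:"+callTool(s.Arg))
		case "reason":
			res.Observations = append(res.Observations, "reason:"+reason(s.Arg))
		case "ask_user":
			res.Status, res.Output = "waiting_user", s.Arg
			return res
		case "respond":
			res.Status, res.Output = "completed", s.Arg
			return res
		default:
			res.Status, res.Output = "failed", "unknown step type: "+s.Type
			return res
		}
		if replan != nil {
			queue = replan(queue, res.Observations)
		}
	}
	// Treating a plan without a respond step as failed is an assumption.
	res.Status, res.Output = "failed", "plan ended without a respond step"
	return res
}

func main() {
	plan := []step{{Type: "tool", Arg: "list_traders"}, {Type: "respond", Arg: "You have no traders yet."}}
	echo := func(arg string) string { return arg + " ok" }
	res := runPlan(plan, echo, echo, nil)
	fmt.Println(res.Status, res.Observations, res.Output)
}
```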
Resume Behavior
When ExecutionState.Status == waiting_user, the next user turn is treated as a reply to the pending question.
Current safeguards:
- latest asked question is extracted from the stored plan
- the user reply is appended as a user_reply observation
- planner prompt receives explicit Resume context
This makes short replies such as 是 ("yes") less likely to be misread as unrelated fresh intents.
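A rough sketch of the resume step is shown below. The execState shape, the PendingQuestion field, and the wording of the resume context string are assumptions; the user_reply observation kind and the waiting_user to planning transition come from this document.

```go
package main

import "fmt"

// observation and execState are reduced, hypothetical shapes for this sketch.
type observation struct {
	Kind    string
	Content string
}

type execState struct {
	Status          string
	PendingQuestion string // latest asked question, extracted from the stored plan
	Observations    []observation
}

// applyUserReply treats the next user turn as the answer to the pending question.
// It returns the resume context for the planner prompt, or false when the
// state is not waiting for the user.
func applyUserReply(st *execState, reply string) (string, bool) {
	if st.Status != "waiting_user" {
		return "", false
	}
	st.Observations = append(st.Observations, observation{Kind: "user_reply", Content: reply})
	st.Status = "planning"
	// The exact wording of the resume context is an assumption.
	ctx := fmt.Sprintf("Resume context: the assistant asked %q and the user replied %q.",
		st.PendingQuestion, reply)
	return ctx, true
}

func main() {
	st := &execState{Status: "waiting_user", PendingQuestion: "Create the trader now?"}
	if ctx, ok := applyUserReply(st, "是"); ok {
		fmt.Println(ctx)
		fmt.Println("next status:", st.Status)
	}
}
```

With the question and the reply placed side by side in the prompt, a one-character reply carries enough context for the planner to continue the blocked flow.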
Dynamic State Refresh
Configuration and trader management requests are dynamic by nature. Their truth can change outside the current chat, for example:
- user configures exchange in the UI
- user adds model in another tab
- user creates trader elsewhere
Because of that, configuration/trader requests should not trust stale model conclusions.
Current protection in planner_runtime.go:
- detects config / trader intent with isConfigOrTraderIntent(...)
- clears TaskState context from the planner prompt for these requests
- refreshes ExecutionState.Observations with fresh snapshots from:
  - toolGetModelConfigs(...)
  - toolGetExchangeConfigs(...)
  - toolListTraders(...)
This makes the planner rely more on current system state and less on older narrative memory.
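The protection can be sketched as below, assuming a simple keyword heuristic. The keyword list, the function and type names, and the observation string format are hypothetical; the real isConfigOrTraderIntent(...) and snapshot tools may differ. The Fetch callbacks stand in for toolGetModelConfigs(...), toolGetExchangeConfigs(...), and toolListTraders(...).

```go
package main

import (
	"fmt"
	"strings"
)

// configTraderKeywords is an illustrative list, not the real one.
var configTraderKeywords = []string{"exchange", "model", "trader", "api key", "config"}

// looksLikeConfigOrTraderIntent is a keyword heuristic over the user text.
func looksLikeConfigOrTraderIntent(text string) bool {
	lower := strings.ToLower(text)
	for _, kw := range configTraderKeywords {
		if strings.Contains(lower, kw) {
			return true
		}
	}
	return false
}

// snapshotSource pairs a label with a fetcher for live state.
type snapshotSource struct {
	Name  string
	Fetch func() (string, error)
}

// plannerInput is the part of the planner prompt affected by the refresh.
type plannerInput struct {
	TaskContext  string
	Observations []string
}

// preparePlannerInput drops TaskState context and attaches fresh snapshots
// when the request looks config or trader related.
func preparePlannerInput(userText, taskContext string, sources []snapshotSource) plannerInput {
	in := plannerInput{TaskContext: taskContext}
	if !looksLikeConfigOrTraderIntent(userText) {
		return in
	}
	in.TaskContext = "" // ignore older narrative memory for this request
	for _, src := range sources {
		data, err := src.Fetch()
		if err != nil {
			in.Observations = append(in.Observations, fmt.Sprintf("%s: error: %v", src.Name, err))
			continue
		}
		in.Observations = append(in.Observations, fmt.Sprintf("%s: %s", src.Name, data))
	}
	return in
}

func main() {
	sources := []snapshotSource{
		{Name: "model_configs", Fetch: func() (string, error) { return "[]", nil }},
		{Name: "traders", Fetch: func() (string, error) { return "[]", nil }},
	}
	in := preparePlannerInput("create a trader for me", "user has no exchange configured", sources)
	fmt.Printf("%q %v\n", in.TaskContext, in.Observations)
}
```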
Reset Strategy
The system currently resets or weakens stale execution state when:
- user says retry-like phrases such as 再试 ("try again"), 继续 ("continue"), try again, continue
- request is config / trader related and old execution state is failed / completed / waiting
Reset scope:
- ExecutionState may be cleared
- TaskState is not globally deleted, but it is intentionally ignored for config/trader planning
Manual reset:
/clear
This clears:
- short-term chat history
- task state
- execution state
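The automatic reset conditions can be expressed as a small predicate. This sketch makes several assumptions: retry phrases are matched by substring, "waiting" is read as the waiting_user status, and a match always means a full reset, whereas the real code may only weaken the old state.

```go
package main

import (
	"fmt"
	"strings"
)

// retryPhrases are the examples listed in this document.
var retryPhrases = []string{"再试", "继续", "try again", "continue"}

// isRetryLike reports whether the text contains a retry-like phrase.
// Substring matching is an assumption of this sketch.
func isRetryLike(text string) bool {
	lower := strings.ToLower(text)
	for _, p := range retryPhrases {
		if strings.Contains(lower, p) {
			return true
		}
	}
	return false
}

// shouldResetExecution decides whether old ExecutionState should be discarded.
func shouldResetExecution(text string, isConfigOrTrader bool, status string) bool {
	if isRetryLike(text) {
		return true
	}
	if isConfigOrTrader {
		switch status {
		case "failed", "completed", "waiting_user":
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldResetExecution("再试", false, "failed"))
	fmt.Println(shouldResetExecution("add my exchange", true, "completed"))
	fmt.Println(shouldResetExecution("what changed today?", false, "completed"))
}
```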
Compression Design
maybeCompressHistory(...) moves older short-term chat content into TaskState when:
- recent message count exceeds the configured window
- estimated token count exceeds the threshold
Compression strategy:
- keep recent conversation in chatHistory
- summarize older turns into structured TaskState
- persist new TaskState
- replace chatHistory with recent slice
Important design rule:
- TaskState should keep durable context only
- it should not become a stale copy of mutable operational state
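The trigger and the split between old and recent turns can be sketched as follows. Several details are assumptions: either condition alone triggers compression, tokens are estimated at roughly one per four characters, and keepRecent is a separate parameter. The summarization call that turns the old slice into TaskState is omitted.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// turn is one chat message in short-term history.
type turn struct {
	Role    string
	Content string
}

// estimateTokens uses a rough characters-per-token heuristic (assumption).
func estimateTokens(turns []turn) int {
	chars := 0
	for _, t := range turns {
		chars += utf8.RuneCountInString(t.Content)
	}
	return (chars + 3) / 4
}

// splitForCompression returns the older turns to summarize and the recent
// turns to keep. compress is false when neither threshold is exceeded or
// when there is nothing older than the recent slice.
func splitForCompression(history []turn, window, tokenThreshold, keepRecent int) (old, recent []turn, compress bool) {
	if len(history) <= window && estimateTokens(history) <= tokenThreshold {
		return nil, history, false
	}
	if keepRecent > len(history) {
		keepRecent = len(history)
	}
	cut := len(history) - keepRecent
	if cut <= 0 {
		return nil, history, false
	}
	return history[:cut], history[cut:], true
}

func main() {
	history := []turn{
		{"user", "I want a conservative BTC strategy"},
		{"assistant", "Noted, low leverage only"},
		{"user", "Which exchanges do I have?"},
		{"assistant", "Let me check"},
		{"user", "ok"},
	}
	old, recent, compress := splitForCompression(history, 4, 1000, 2)
	fmt.Println(compress, len(old), "turns to summarize,", len(recent), "turns kept")
}
```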
Current Architecture Diagram
flowchart TD
U[User Message] --> A[HandleMessage / HandleMessageStream]
A --> B{Direct command?}
B -->|Yes| C[Direct branch or slash command]
B -->|No| D[thinkAndAct / thinkAndActStream]
D --> E[Append user turn to chatHistory]
D --> F[Load ExecutionState]
F --> G{waiting_user?}
G -->|Yes| H[Attach user_reply observation]
G -->|No| I[Create fresh ExecutionState]
H --> J[Refresh dynamic snapshots if config/trader intent]
I --> J
J --> K[createExecutionPlan via LLM]
K --> L[Execution plan]
L --> M[executePlan loop]
M --> N[tool step]
M --> O[reason step]
M --> P[ask_user step]
M --> Q[respond step]
N --> R[Append Observation]
O --> R
R --> S[replanAfterStep]
S --> M
P --> T[Persist waiting_user ExecutionState]
T --> UQ[Return question to user]
Q --> V[Persist completed ExecutionState]
V --> W[Append assistant turn to chatHistory]
W --> X[maybeCompressHistory]
X --> Y[Persist TaskState]
Y --> Z[Final response]
Memory Relationship Diagram
flowchart LR
CH[chatHistory\nin-memory\nrecent turns]
TS[TaskState\npersisted summary\nsystem_config]
ES[ExecutionState\npersisted workflow\nsystem_config]
PL[Planner Prompt]
CH -->|recent raw turns| PL
ES -->|current workflow JSON| PL
TS -->|durable structured context| PL
CH -->|old turns compressed| TS
PL -->|plan / observations / status| ES
State Transition Diagram
stateDiagram-v2
[*] --> planning
planning --> running: plan created
running --> waiting_user: ask_user step
waiting_user --> planning: user replies
running --> completed: respond step finished
running --> failed: step error
failed --> planning: retry / continue / config-trader reset
completed --> planning: new relevant request or retry flow
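The transitions in the state diagram above can be encoded as a lookup table, which is one way to catch illegal status changes before persisting ExecutionState. The table mirrors the diagram; the canTransition helper is hypothetical and not part of the described code.

```go
package main

import "fmt"

// allowedTransitions mirrors the edges of the state diagram.
var allowedTransitions = map[string][]string{
	"planning":     {"running"},
	"running":      {"waiting_user", "completed", "failed"},
	"waiting_user": {"planning"},
	"failed":       {"planning"},
	"completed":    {"planning"},
}

// canTransition reports whether the diagram contains an edge from -> to.
func canTransition(from, to string) bool {
	for _, next := range allowedTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition("running", "waiting_user"))
	fmt.Println(canTransition("waiting_user", "completed"))
}
```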
Known Design Tradeoffs
Strengths
- separates short-term chat from durable task summary
- allows blocked flows to resume
- supports replanning after every meaningful step
- can recover from stale assumptions better for dynamic config/trader requests
Weaknesses
- TaskState is still summary-driven, so summarization quality matters
- planner still depends on model compliance for some transitions
- ExecutionState is single-track per user, not multiple concurrent workflows
- config/trader intent detection is heuristic and keyword-based
Practical Guidance
When to trust TaskState
Trust it for:
- user intent continuity
- open loops
- durable facts
Do not trust it for:
- whether current exchange/model/trader config exists now
- whether a specific operational action is currently possible
When to trust ExecutionState
Trust it for:
- current plan continuity
- exact blocked step
- latest observation chain
Do not trust it blindly when:
- user has changed configuration outside the chat
- the system capabilities changed after deployment
When to fetch live state again
Always prefer fresh tool snapshots before answering about:
- existing model configs
- existing exchange configs
- existing traders
- whether trader creation can proceed
Suggested Future Improvements
- add workflow versioning so capability changes invalidate stale ExecutionState
- separate waiting_user_confirmation from generic waiting_user
- introduce code-level handling for short confirmations such as 是 ("yes"), 好 ("ok"), 继续 ("continue")
- move dynamic state refresh from heuristic to explicit planner preflight stage
- support multiple concurrent execution sessions per user if needed