mirror of https://github.com/laoxong/nofx.git synced 2026-06-04 01:48:22 +08:00

lky-spec 3ca95b294d feat: port NOFXi agent module onto latest dev base (#1485 )

* feat: integrate NOFXi agent into dev

* Enhance NOFXi agent workflow and diagnostics

2026-04-21 23:47:55 +08:00

NOFXi Agent Memory And Planning Design

Purpose

This document explains how the current NOFXi agent handles:

short-term conversation memory
durable task memory
durable execution / planning state
planner execution and replanning
state reset and resume behavior

The implementation described here is primarily in:

agent/history.go
agent/memory.go
agent/execution_state.go
agent/planner_runtime.go
agent/agent.go

High-Level Model

The current agent uses three different layers of state:

chatHistory: recent in-memory user/assistant turns for the live conversation.
TaskState: durable summarized context that should survive beyond recent turns.
ExecutionState: durable workflow state for the currently running or recently blocked plan.

These three layers serve different purposes and should not be treated as the same thing.

State Layers

1. `chatHistory`

Defined in agent/history.go.

Role:

stores recent user / assistant messages in memory
keyed by userID
used as short-term conversational context
acts as the source material for later compression into TaskState

Characteristics:

in-memory only
capped by maxTurns
cleared by /clear
not suitable as durable truth

Typical contents:

the last few user questions
the last few assistant replies
temporary conversational wording
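
The characteristics above can be illustrated with a minimal sketch of a per-user, capped, in-memory store. This is not the code in agent/history.go: the `Message` type, the string `userID`, the method names, and the choice to count the cap in messages rather than in user/assistant pairs are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is one user or assistant turn (field names are assumptions).
type Message struct {
	Role    string // "user" or "assistant"
	Content string
}

// chatHistory keeps recent messages per user, in memory only.
// maxTurns is assumed to be positive and is counted in messages here.
type chatHistory struct {
	mu       sync.Mutex
	maxTurns int
	byUser   map[string][]Message
}

func newChatHistory(maxTurns int) *chatHistory {
	return &chatHistory{maxTurns: maxTurns, byUser: make(map[string][]Message)}
}

// Append adds a message and drops the oldest entries beyond the cap.
func (h *chatHistory) Append(userID string, m Message) {
	h.mu.Lock()
	defer h.mu.Unlock()
	msgs := append(h.byUser[userID], m)
	if len(msgs) > h.maxTurns {
		msgs = msgs[len(msgs)-h.maxTurns:]
	}
	h.byUser[userID] = msgs
}

// Get returns a copy so callers cannot mutate the stored slice.
func (h *chatHistory) Get(userID string) []Message {
	h.mu.Lock()
	defer h.mu.Unlock()
	return append([]Message(nil), h.byUser[userID]...)
}

// Clear drops everything for one user, as /clear would for this layer.
func (h *chatHistory) Clear(userID string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.byUser, userID)
}

func main() {
	h := newChatHistory(2)
	h.Append("u1", Message{Role: "user", Content: "first question"})
	h.Append("u1", Message{Role: "assistant", Content: "first answer"})
	h.Append("u1", Message{Role: "user", Content: "second question"})
	for _, m := range h.Get("u1") {
		fmt.Println(m.Role+":", m.Content)
	}
}
```

Because nothing here is persisted, a process restart loses the history, which is why this layer is not suitable as durable truth.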

2. `TaskState`

Defined in agent/memory.go.

Role:

stores durable, structured, non-derivable context
persisted through system_config
injected into planning and reasoning prompts

Storage key:

agent_task_state_<userID>

Fields:

CurrentGoal
ActiveFlow
OpenLoops
ImportantFacts
LastDecision
UpdatedAt

Intended contents:

user goal that still matters across turns
high-level unresolved issues that still matter across turns
facts that tools cannot cheaply re-fetch
latest important decision summary

Explicitly not intended for:

step-level pending items such as "wait for API key"
execution actions such as "call get_exchange_configs"
live balances
current positions
current market prices
mutable configuration availability

Those should be fetched through tools at planning time rather than trusted from old summaries.
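
As a rough illustration of the field list and storage key above, the following sketch persists a `TaskState` as JSON under `agent_task_state_<userID>`. The Go types, the JSON tags, the string `userID`, and the `configStore` map (a hypothetical stand-in for system_config) are assumptions; only the field names and the key prefix come from this document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// TaskState mirrors the documented field names; types and tags are assumed.
type TaskState struct {
	CurrentGoal    string    `json:"current_goal"`
	ActiveFlow     string    `json:"active_flow"`
	OpenLoops      []string  `json:"open_loops"`
	ImportantFacts []string  `json:"important_facts"`
	LastDecision   string    `json:"last_decision"`
	UpdatedAt      time.Time `json:"updated_at"`
}

// taskStateKey builds the documented storage key.
func taskStateKey(userID string) string {
	return "agent_task_state_" + userID
}

// configStore is a hypothetical in-memory stand-in for system_config.
type configStore map[string]string

func saveTaskState(store configStore, userID string, ts TaskState) error {
	data, err := json.Marshal(ts)
	if err != nil {
		return err
	}
	store[taskStateKey(userID)] = string(data)
	return nil
}

// loadTaskState returns the zero value when nothing has been stored yet.
func loadTaskState(store configStore, userID string) (TaskState, error) {
	var ts TaskState
	raw, ok := store[taskStateKey(userID)]
	if !ok {
		return ts, nil
	}
	err := json.Unmarshal([]byte(raw), &ts)
	return ts, err
}

func main() {
	store := configStore{}
	ts := TaskState{
		CurrentGoal: "set up a BTC trader",
		OpenLoops:   []string{"choose an exchange"},
		UpdatedAt:   time.Now(),
	}
	if err := saveTaskState(store, "42", ts); err != nil {
		fmt.Println("save failed:", err)
		return
	}
	loaded, err := loadTaskState(store, "42")
	fmt.Println(taskStateKey("42"), loaded.CurrentGoal, err)
}
```

Note that the struct has no field for balances, positions, or prices, in line with the rule that live operational state stays out of this layer.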

3. `ExecutionState`

Defined in agent/execution_state.go.

Role:

stores the current execution workflow
allows the agent to resume after ask_user
persists plan steps, observations, and completion status

Storage key:

agent_execution_state_<userID>

Fields:

SessionID
UserID
Goal
Status
PlanID
Steps
CurrentStepID
Observations
FinalAnswer
LastError
UpdatedAt

This is the planner's working state, not a general memory store.
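
A possible shape for this state is sketched below. The field names and the key prefix follow the lists above; the Go types, the `PlanStep` and `Observation` structs, and the `currentStep` helper are hypothetical and may differ from agent/execution_state.go.

```go
package main

import (
	"fmt"
	"time"
)

// PlanStep and Observation are assumed shapes for illustration only.
type PlanStep struct {
	ID          string
	Type        string // tool, reason, ask_user, or respond
	Instruction string
}

type Observation struct {
	StepID  string
	Kind    string
	Summary string
}

// ExecutionState uses the documented field names with assumed types.
type ExecutionState struct {
	SessionID     string
	UserID        string
	Goal          string
	Status        string
	PlanID        string
	Steps         []PlanStep
	CurrentStepID string
	Observations  []Observation
	FinalAnswer   string
	LastError     string
	UpdatedAt     time.Time
}

// executionStateKey builds the documented storage key.
func executionStateKey(userID string) string {
	return "agent_execution_state_" + userID
}

// currentStep looks up the step that CurrentStepID points at.
func (s *ExecutionState) currentStep() (PlanStep, bool) {
	for _, st := range s.Steps {
		if st.ID == s.CurrentStepID {
			return st, true
		}
	}
	return PlanStep{}, false
}

func main() {
	st := ExecutionState{
		UserID: "42",
		Goal:   "create a trader",
		Status: "waiting_user",
		Steps: []PlanStep{
			{ID: "s1", Type: "tool", Instruction: "get_exchange_configs"},
			{ID: "s2", Type: "ask_user", Instruction: "Which exchange should be used?"},
		},
		CurrentStepID: "s2",
		UpdatedAt:     time.Now(),
	}
	step, ok := st.currentStep()
	fmt.Println(executionStateKey(st.UserID), step.Type, ok)
}
```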

Data Flow

Request Entry

Entry points:

HandleMessage(...)
HandleMessageStream(...)

Flow:

user message enters agent
slash commands and explicit direct branches are handled first
all other requests go into planner flow via thinkAndAct(...) / thinkAndActStream(...)

Planner Flow

The planner pipeline in agent/planner_runtime.go is:

append user message into chatHistory
emit planning SSE event
load ExecutionState
optionally reset stale ExecutionState
optionally refresh dynamic configuration snapshots
create a fresh execution plan with the LLM
execute steps one by one
persist ExecutionState after important transitions
append assistant answer into chatHistory
maybe compress old conversation into TaskState

Short-Term vs Durable Memory

What lives in `chatHistory`

Good fits:

raw recent messages
conversational wording
latest assistant phrasing

Bad fits:

long-lived truths
current external system state

What lives in `TaskState`

Good fits:

durable goal
high-level unfinished work that remains relevant across turns
important facts the user stated
previous decisions and why they were made

Bad fits:

pending steps inside the current plan
execution-level reminders such as "wait for a field" or "call a tool"
old conclusions about whether tools exist
old conclusions about whether model/exchange config is present
live operational state that can change outside the chat

What lives in `ExecutionState`

Good fits:

current plan steps
observations from tool calls
blocked-on-user-input status
exact current workflow state
step-level pending work and block reasons

Bad fits:

evergreen user profile
long-term semantic memory

Planning Logic

Plan Creation

createExecutionPlan(...) sends the following into the planner model:

available tool definitions
persistent preferences
TaskState context
ExecutionState JSON
current user request

The planner must return JSON only, and every step must use one of these types:

tool
reason
ask_user
respond
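
Since the planner output is model-generated, it is worth validating before execution. The sketch below parses a plan and rejects unknown step types. The JSON schema (`steps`, `id`, `type`, `instruction`) and the `parsePlan` helper are assumptions for illustration; this document only specifies the four step types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// planStep and executionPlan use an assumed JSON schema.
type planStep struct {
	ID          string `json:"id"`
	Type        string `json:"type"`
	Instruction string `json:"instruction"`
}

type executionPlan struct {
	Steps []planStep `json:"steps"`
}

// allowedStepTypes lists the four step types named in this document.
var allowedStepTypes = map[string]bool{
	"tool":     true,
	"reason":   true,
	"ask_user": true,
	"respond":  true,
}

// parsePlan decodes the planner reply and validates every step type.
func parsePlan(raw string) (executionPlan, error) {
	var p executionPlan
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return p, fmt.Errorf("planner did not return valid JSON: %w", err)
	}
	if len(p.Steps) == 0 {
		return p, fmt.Errorf("plan has no steps")
	}
	for i, s := range p.Steps {
		if !allowedStepTypes[s.Type] {
			return p, fmt.Errorf("step %d has unknown type %q", i, s.Type)
		}
	}
	return p, nil
}

func main() {
	raw := `{"steps":[
		{"id":"s1","type":"tool","instruction":"get_exchange_configs"},
		{"id":"s2","type":"respond","instruction":"summarize the configs"}
	]}`
	plan, err := parsePlan(raw)
	fmt.Println(len(plan.Steps), err)
}
```

Rejecting a malformed plan early gives the caller a clear point at which to retry planning or record an error, rather than failing midway through execution.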

Step Execution

executePlan(...) executes the plan loop:

tool: call the tool and append an observation
reason: run a reasoning sub-call and append an observation
ask_user: save the waiting_user state and return the question
respond: generate the final answer and mark the plan completed

After each completed step, replanAfterStep(...) may:

continue
replace remaining steps
ask user
finish
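
The loop and the replanning hook can be sketched as follows. This is a simplified illustration, not the code in agent/planner_runtime.go: the `step` and `execState` types, the `callTool` callback, and the `replanFn` signature are hypothetical, the reasoning sub-call is reduced to recording its input, and the replanner is modeled only as "return the remaining steps, possibly replaced".

```go
package main

import "fmt"

type step struct {
	Type  string // tool, reason, ask_user, or respond
	Input string
}

type execState struct {
	Status       string
	Observations []string
	FinalAnswer  string
}

// replanFn may return the remaining steps unchanged or replace them.
type replanFn func(st *execState, remaining []step) []step

// executePlanSketch runs steps in order and returns the text for the user.
func executePlanSketch(st *execState, steps []step, callTool func(string) string, replan replanFn) string {
	st.Status = "running"
	for len(steps) > 0 {
		s := steps[0]
		steps = steps[1:]
		switch s.Type {
		case "tool":
			st.Observations = append(st.Observations, "tool: "+callTool(s.Input))
		case "reason":
			st.Observations = append(st.Observations, "reason: "+s.Input)
		case "ask_user":
			st.Status = "waiting_user" // persisted so the next turn can resume
			return s.Input
		case "respond":
			st.Status = "completed"
			st.FinalAnswer = s.Input
			return st.FinalAnswer
		default:
			st.Status = "failed"
			return "unknown step type: " + s.Type
		}
		if replan != nil {
			steps = replan(st, steps) // only reached after tool and reason steps
		}
	}
	st.Status = "failed"
	return "plan ended without a respond step"
}

func main() {
	st := &execState{}
	plan := []step{
		{Type: "tool", Input: "get_exchange_configs"},
		{Type: "ask_user", Input: "Which exchange should the trader use?"},
	}
	lookup := func(name string) string { return name + " returned no entries" }
	out := executePlanSketch(st, plan, lookup, nil)
	fmt.Println(st.Status, "|", out)
}
```

A real loop would also bound the total number of steps, since a replanner that keeps adding work would otherwise never terminate.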

Resume Behavior

When ExecutionState.Status == waiting_user, the next user turn is treated as a reply to the pending question.

Current safeguards:

latest asked question is extracted from the stored plan
the user reply is appended as a user_reply observation
planner prompt receives explicit Resume context

This makes short replies such as 是 ("yes") less likely to be misread as unrelated new intents.
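
A minimal sketch of the three safeguards, under assumed types: `planStep.Question`, the `observation` struct, the `resumeFromReply` helper, and the wording of the resume context string are all hypothetical. Taking the last `ask_user` step in the stored plan as "the latest asked question" is also an assumption.

```go
package main

import "fmt"

type planStep struct {
	Type     string
	Question string
}

type observation struct {
	Kind    string
	Content string
}

type execState struct {
	Status       string
	Steps        []planStep
	Observations []observation
}

// resumeFromReply treats the user turn as an answer to the pending question.
// It returns the resume context for the planner prompt, or false when the
// state is not waiting_user and the turn should be planned as a new request.
func resumeFromReply(st *execState, reply string) (string, bool) {
	if st.Status != "waiting_user" {
		return "", false
	}
	question := ""
	for i := len(st.Steps) - 1; i >= 0; i-- {
		if st.Steps[i].Type == "ask_user" {
			question = st.Steps[i].Question
			break
		}
	}
	st.Observations = append(st.Observations, observation{Kind: "user_reply", Content: reply})
	st.Status = "planning"
	ctx := fmt.Sprintf("Resume context: the pending question was %q; the user replied %q.", question, reply)
	return ctx, true
}

func main() {
	st := &execState{
		Status: "waiting_user",
		Steps:  []planStep{{Type: "ask_user", Question: "Create the trader now?"}},
	}
	ctx, ok := resumeFromReply(st, "是")
	fmt.Println(ok, ctx)
}
```

Pairing the reply with the question it answers is what gives the planner enough context to interpret a one-word confirmation.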

Dynamic State Refresh

Configuration and trader management requests are dynamic by nature. Their truth can change outside the current chat, for example:

user configures exchange in the UI
user adds model in another tab
user creates trader elsewhere

Because of that, configuration/trader requests should not trust stale model conclusions.

Current protection in planner_runtime.go:

detects config / trader intent with isConfigOrTraderIntent(...)
clears TaskState context from the planner prompt for these requests
refreshes ExecutionState.Observations with fresh snapshots from:
- toolGetModelConfigs(...)
- toolGetExchangeConfigs(...)
- toolListTraders(...)

This makes the planner rely more on current system state and less on older narrative memory.
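
The protection described above might look roughly like the sketch below. The keyword list, the `looksLikeConfigOrTraderIntent` helper (a hypothetical stand-in for `isConfigOrTraderIntent(...)`), the `namedSnapshot` type, and the decision to append snapshots rather than replace existing observations are assumptions; the real tool functions are represented by plain callbacks.

```go
package main

import (
	"fmt"
	"strings"
)

// configTraderKeywords is an illustrative list, not the project's actual one.
var configTraderKeywords = []string{"exchange", "model", "trader", "api key", "交易所", "模型", "交易员"}

// looksLikeConfigOrTraderIntent is a keyword heuristic over the user text.
func looksLikeConfigOrTraderIntent(text string) bool {
	lower := strings.ToLower(text)
	for _, kw := range configTraderKeywords {
		if strings.Contains(lower, kw) {
			return true
		}
	}
	return false
}

type observation struct {
	Kind    string
	Content string
}

// namedSnapshot wraps a callback standing in for a live-state tool call.
type namedSnapshot struct {
	Name  string
	Fetch func() string
}

// preparePlannerInputs drops the TaskState context and adds fresh snapshots
// for config/trader requests; other requests pass through unchanged.
func preparePlannerInputs(text, taskContext string, obs []observation, snaps []namedSnapshot) (string, []observation) {
	if !looksLikeConfigOrTraderIntent(text) {
		return taskContext, obs
	}
	for _, s := range snaps {
		obs = append(obs, observation{Kind: s.Name, Content: s.Fetch()})
	}
	return "", obs
}

func main() {
	snaps := []namedSnapshot{
		{Name: "model_configs", Fetch: func() string { return "[]" }},
		{Name: "exchange_configs", Fetch: func() string { return "[]" }},
		{Name: "traders", Fetch: func() string { return "[]" }},
	}
	ctx, obs := preparePlannerInputs("create a trader", "older summary", nil, snaps)
	fmt.Printf("context=%q observations=%d\n", ctx, len(obs))
}
```

A keyword heuristic like this will both miss some config requests and match unrelated ones, which is the limitation noted under Weaknesses.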

Reset Strategy

The system currently resets or weakens stale execution state when:

user says retry-like phrases such as 再试 ("try again"), 继续 ("continue"), try again, continue
request is config / trader related and the old execution state is failed, completed, or waiting_user

Reset scope:

ExecutionState may be cleared
TaskState is not globally deleted, but it is intentionally ignored for config/trader planning
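
Read literally, the two reset conditions amount to a small predicate. The sketch below is one such reading: the `shouldResetExecution` helper and substring matching are assumptions, and the phrase list simply repeats the examples given above.

```go
package main

import (
	"fmt"
	"strings"
)

// retryPhrases repeats the example phrases from this document.
var retryPhrases = []string{"再试", "继续", "try again", "continue"}

// shouldResetExecution reports whether stale ExecutionState should be dropped
// before planning. configOrTraderIntent is computed elsewhere.
func shouldResetExecution(text, status string, configOrTraderIntent bool) bool {
	lower := strings.ToLower(text)
	for _, p := range retryPhrases {
		if strings.Contains(lower, p) {
			return true
		}
	}
	if configOrTraderIntent {
		switch status {
		case "failed", "completed", "waiting_user":
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldResetExecution("Try again please", "failed", false))
	fmt.Println(shouldResetExecution("show my traders", "running", true))
}
```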

Manual reset:

/clear

This clears:

short-term chat history
task state
execution state

Compression Design

maybeCompressHistory(...) moves older short-term chat content into TaskState when:

recent message count exceeds the configured window
estimated token count exceeds the threshold

Compression strategy:

keep recent conversation in chatHistory
summarize older turns into structured TaskState
persist new TaskState
replace chatHistory with recent slice

Important design rule:

TaskState should keep durable context only
it should not become a stale copy of mutable operational state
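
The trigger and the split between old and recent turns can be sketched as below. Several points are assumptions: the two conditions are treated as alternatives (either one triggers compression), tokens are estimated at roughly four bytes each, and `splitForCompression`, `keepRecent`, and the `message` type are hypothetical. The LLM summarization of the old turns into TaskState is not shown.

```go
package main

import "fmt"

type message struct {
	Role    string
	Content string
}

// estimateTokens is a crude heuristic: about four bytes per token.
func estimateTokens(msgs []message) int {
	n := 0
	for _, m := range msgs {
		n += len(m.Content)/4 + 1
	}
	return n
}

// splitForCompression returns the older turns to summarize and the recent
// turns to keep. ok is false when no compression is needed.
func splitForCompression(msgs []message, window, tokenLimit, keepRecent int) (old, recent []message, ok bool) {
	if len(msgs) <= window && estimateTokens(msgs) <= tokenLimit {
		return nil, msgs, false
	}
	if keepRecent > len(msgs) {
		keepRecent = len(msgs)
	}
	cut := len(msgs) - keepRecent
	if cut == 0 {
		return nil, msgs, false // nothing old enough to summarize
	}
	return msgs[:cut], msgs[cut:], true
}

func main() {
	var msgs []message
	for i := 0; i < 6; i++ {
		msgs = append(msgs, message{Role: "user", Content: fmt.Sprintf("turn %d", i)})
	}
	old, recent, ok := splitForCompression(msgs, 4, 1000, 2)
	fmt.Println(ok, len(old), len(recent))
}
```

After a successful split, the caller would summarize `old` into TaskState, persist it, and replace chatHistory with `recent`.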

Current Architecture Diagram

flowchart TD
    U[User Message] --> A[HandleMessage / HandleMessageStream]
    A --> B{Direct command?}
    B -->|Yes| C[Direct branch or slash command]
    B -->|No| D[thinkAndAct / thinkAndActStream]

    D --> E[Append user turn to chatHistory]
    D --> F[Load ExecutionState]
    F --> G{waiting_user?}
    G -->|Yes| H[Attach user_reply observation]
    G -->|No| I[Create fresh ExecutionState]

    H --> J[Refresh dynamic snapshots if config/trader intent]
    I --> J
    J --> K[createExecutionPlan via LLM]
    K --> L[Execution plan]
    L --> M[executePlan loop]

    M --> N[tool step]
    M --> O[reason step]
    M --> P[ask_user step]
    M --> Q[respond step]

    N --> R[Append Observation]
    O --> R
    R --> S[replanAfterStep]
    S --> M

    P --> T[Persist waiting_user ExecutionState]
    T --> UQ[Return question to user]

    Q --> V[Persist completed ExecutionState]
    V --> W[Append assistant turn to chatHistory]
    W --> X[maybeCompressHistory]
    X --> Y[Persist TaskState]
    Y --> Z[Final response]

Memory Relationship Diagram

flowchart LR
    CH[chatHistory\nin-memory\nrecent turns]
    TS[TaskState\npersisted summary\nsystem_config]
    ES[ExecutionState\npersisted workflow\nsystem_config]
    PL[Planner Prompt]

    CH -->|recent raw turns| PL
    ES -->|current workflow JSON| PL
    TS -->|durable structured context| PL

    CH -->|old turns compressed| TS
    PL -->|plan / observations / status| ES

State Transition Diagram

stateDiagram-v2
    [*] --> planning
    planning --> running: plan created
    running --> waiting_user: ask_user step
    waiting_user --> planning: user replies
    running --> completed: respond step finished
    running --> failed: step error
    failed --> planning: retry / continue / config-trader reset
    completed --> planning: new relevant request or retry flow
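
The diagram above can also be encoded as a transition table, which makes illegal status changes easy to detect in code. This is a sketch derived only from the diagram; the `transitions` map and `canTransition` helper are hypothetical and are not described as part of the current implementation.

```go
package main

import "fmt"

// transitions lists, for each status, the statuses the diagram allows next.
var transitions = map[string][]string{
	"planning":     {"running"},
	"running":      {"waiting_user", "completed", "failed"},
	"waiting_user": {"planning"},
	"failed":       {"planning"},
	"completed":    {"planning"},
}

// canTransition reports whether the diagram contains an edge from -> to.
func canTransition(from, to string) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition("running", "waiting_user"))
	fmt.Println(canTransition("waiting_user", "completed"))
}
```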

Known Design Tradeoffs

Strengths

separates short-term chat from durable task summary
allows blocked flows to resume
supports replanning after every meaningful step
can recover from stale assumptions better for dynamic config/trader requests

Weaknesses

TaskState is still summary-driven, so summarization quality matters
planner still depends on model compliance for some transitions
ExecutionState is single-track per user, not multiple concurrent workflows
config/trader intent detection is heuristic and keyword-based

Practical Guidance

When to trust `TaskState`

Trust it for:

user intent continuity
open loops
durable facts

Do not trust it for:

whether current exchange/model/trader config exists now
whether a specific operational action is currently possible

When to trust `ExecutionState`

Trust it for:

current plan continuity
exact blocked step
latest observation chain

Do not trust it blindly when:

user has changed configuration outside the chat
the system capabilities changed after deployment

When to fetch live state again

Always prefer fresh tool snapshots before answering about:

existing model configs
existing exchange configs
existing traders
whether trader creation can proceed

Suggested Future Improvements

add workflow versioning so capability changes invalidate stale ExecutionState
separate waiting_user_confirmation from generic waiting_user
introduce code-level handling for short confirmations such as 是 ("yes"), 好 ("OK"), 继续 ("continue")
move dynamic state refresh from heuristic to explicit planner preflight stage
support multiple concurrent execution sessions per user if needed
