Agent
The stateful façade you drive: infer_states to perceive, sample_action to act.
Agent
Agent(
    model: LinearGaussianModel,
    objective: ObservationGoal | StateGoal | None = None,
    *,
    selector: ActionSelector | None = None,
    backend: InferenceBackend | None = None,
)
A continuous active-inference agent, the continuous sibling of pymdp's Agent.
The backend and action selector underneath are pure: belief in, belief out;
belief + preference in, action out. The Agent is the one stateful piece: it
owns the current belief (the continuous analog of pymdp's qs) and carries it
forward across calls, so you drive it in the same perceive → act loop pymdp
users already know:

    agent = Agent(model, StateGoal(target))
    belief = agent.infer_states(observation)  # perceive
    action = agent.sample_action()            # act

The agent remembers the action it last sampled and feeds it to its own
predict step on the next infer_states — so you never thread actions back
in by hand, and a perceive-only model (no control matrix) needs no action at
all. This mirrors the LQG loop exactly: the filter predicts with the action
that was actually applied to the plant between observations.
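The split between pure pieces and one stateful owner can be illustrated with a small toy. This is a minimal sketch, not the library's implementation: `ToyBelief`, `ToyAgent`, `toy_filter_step`, and `toy_selector` are hypothetical names invented here, and the scalar "filter" is deliberately trivial. It only shows how the last sampled action is remembered and fed to the next predict step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyBelief:
    """Scalar Gaussian belief (hypothetical stand-in, not the library's Belief)."""
    mean: float
    var: float


def toy_filter_step(belief, action, observation):
    # Pure: belief in, belief out. Predict x' = x + action, then move
    # halfway toward the observation (a fixed gain, for illustration only).
    predicted = belief.mean + action
    return ToyBelief(mean=predicted + 0.5 * (observation - predicted), var=belief.var)


def toy_selector(belief, goal):
    # Pure and deterministic: belief + goal in, action out.
    return goal - belief.mean


class ToyAgent:
    """The one stateful piece: owns the belief and the last sampled action."""

    def __init__(self, prior, filter_step, selector=None, goal=None):
        self.belief = prior
        self._filter_step = filter_step
        self._selector = selector
        self._goal = goal
        self._last_action = 0.0  # zero before the first sample_action

    def infer_states(self, observation):
        # Predict with the action that was actually applied, then update.
        self.belief = self._filter_step(self.belief, self._last_action, observation)
        return self.belief

    def sample_action(self):
        if self._selector is None or self._goal is None:
            raise ValueError("perceive-only agent: nothing to act toward")
        action = self._selector(self.belief, self._goal)
        self._last_action = action  # remembered for the next infer_states
        return action


agent = ToyAgent(ToyBelief(0.0, 1.0), toy_filter_step, toy_selector, goal=1.0)
first = agent.infer_states(0.0)   # predicts with action 0.0
action = agent.sample_action()    # remembered internally
second = agent.infer_states(1.0)  # predicts with the remembered action
```

The caller never passes an action back in; the second `infer_states` call already knows what was applied.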
The vocabulary maps onto pymdp's like this:
================== ============================ ==============================
pymdp (discrete)   cpomdp (continuous)          role
================== ============================ ==============================
Agent              Agent                        the stateful façade
qs                 belief                       posterior over the state
infer_states       infer_states                 fold an observation in
sample_action      sample_action                choose an action
C                  objective                    a StateGoal or ObservationGoal
D                  model.prior                  belief before any observation
================== ============================ ==============================
One honest difference from pymdp: sample_action is deterministic here,
not a draw from a policy posterior. For a fixed linear-Gaussian sensor the
EFE-minimising action is the LQR optimum (ADR-003), so it returns the single
best action. The name keeps the pymdp muscle-memory; the behaviour is exact.
An agent built without an objective is a pure tracker: infer_states
works, but sample_action raises — there is nothing to act toward.
Build an agent over model, optionally one that can act.
The objective's type selects the regime: a StateGoal steers in state
space via LQR (it needs a fixed sensor); an ObservationGoal seeks a
preferred observation via one-step EFE (it needs a control matrix, and a
state-dependent sensor unless an explicit selector is given). Omit the
objective for a perceive-only tracker.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | LinearGaussianModel | The linear-Gaussian generative model the agent perceives and (with an objective) acts under. Its … | required |
| objective | ObservationGoal \| StateGoal \| None | What the agent pursues: a … | None |
| selector | ActionSelector \| None | An explicit … | None |
| backend | InferenceBackend \| None | The inference engine. Defaults to a per-step … | None |

Raises:
| Type | Description |
|---|---|
| ValueError | If the objective and model are incompatible: a … |
| TypeError | If … |
Source code in src/cpomdp/agent.py
qs (property)
Read-only alias for belief, kept for pymdp muscle memory (agent.qs).
The name is pymdp's, carried over for familiarity; the object is a Gaussian
Belief, not a categorical posterior. Prefer belief.
infer_states
Fold one observation into the belief and return the updated belief.
The agent's current belief goes in as the prior and the posterior
comes back out and is stored — that reassignment is the recursive
filter, advanced one step. The belief is never mutated in place; each call
replaces it with a fresh Belief.
No action is passed in: the agent supplies its own last sampled action to
the predict step (zero before the first sample_action), and a
perceive-only model carries no action at all. This is the action actually
applied to the plant since the previous observation — exactly what the
Kalman predict step needs.
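For a concrete picture of "prior in, posterior out", here is a minimal scalar sketch of one Kalman predict-plus-update step. It is an illustration under stated assumptions, not the library's backend: the names `ScalarBelief` and `kalman_step` and the parameters `a`, `b`, `c`, `q`, `r` are hypothetical, standing for the model x' = a·x + b·u + w (noise variance q) and y = c·x + v (noise variance r).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalarBelief:
    """Scalar Gaussian belief (hypothetical; frozen, so it cannot be mutated)."""
    mean: float
    var: float


def kalman_step(belief, action, observation, *, a=1.0, b=1.0, c=1.0, q=0.1, r=0.5):
    """One predict + update for x' = a*x + b*u + w, y = c*x + v."""
    # Predict: push the belief through the dynamics using the applied action.
    mean_pred = a * belief.mean + b * action
    var_pred = a * a * belief.var + q
    # Update: correct the prediction with the new observation.
    gain = var_pred * c / (c * c * var_pred + r)
    mean_post = mean_pred + gain * (observation - c * mean_pred)
    var_post = (1.0 - gain * c) * var_pred
    # A fresh belief is returned; the input belief is left untouched.
    return ScalarBelief(mean_post, var_post)


prior = ScalarBelief(mean=0.0, var=1.0)
posterior = kalman_step(prior, action=0.0, observation=1.0)
```

Storing `posterior` as the next call's prior is what advances the recursive filter by one step.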
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| observation | ArrayLike | The latest sensor reading, shape … | required |

Returns:
| Type | Description |
|---|---|
| Belief | The updated belief (also stored on … |

Raises:
| Type | Description |
|---|---|
| ValueError | On a shape mismatch in … |
Source code in src/cpomdp/agent.py
sample_action
The action the agent's selector chooses for the current belief.
Delegates to the agent's ActionSelector, handing it the current belief
and the objective's Preference. For a StateGoal under a fixed
sensor that selection is exactly the LQR optimum, -L∞·(mean − goal) —
one matrix-vector product, front-loaded at construction (ADR-003); for an
ObservationGoal it is the EFE-minimising action over the front-loaded
candidate grid. Deterministic, not a sample, and it takes no rng_key (see
the class docstring). There is no separate infer_policies step — policy
evaluation is folded in here, its per-cycle cost exposed as
EFESelector.cost_per_cycle. The chosen action is remembered so the next
infer_states predicts with it.
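The StateGoal case can be sketched in scalar form: solve the Riccati recursion once up front, then each action is a single multiply. This is a hedged illustration, not the library's selector. The functions `lqr_gain` and `lqr_action` and the parameters `a`, `b`, `q`, `r` are hypothetical, for dynamics x' = a·x + b·u with stage cost q·x² + r·u², and the sketch assumes the goal is an equilibrium of the dynamics.

```python
def lqr_gain(a, b, q, r, iters=500):
    """Approximate the infinite-horizon scalar LQR gain by Riccati iteration."""
    p = q
    for _ in range(iters):
        # Discrete-time Riccati recursion, scalar form.
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def lqr_action(gain, mean, goal):
    """Deterministic action: minus the gain times the belief-mean error."""
    return -gain * (mean - goal)


# "Front-loaded": the gain is computed once, before any action is requested.
gain = lqr_gain(a=1.0, b=1.0, q=1.0, r=1.0)

# Per cycle, the action is one product on the current belief mean.
action = lqr_action(gain, mean=2.0, goal=0.0)
```

The same belief mean and goal always give the same action, which is why no random key is needed.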
Returns:
| Type | Description |
|---|---|
| Float64[Array, p] | The action, shape … |

Raises:
| Type | Description |
|---|---|
| ValueError | If this is a perceive-only agent (built without an … |