The entity has a continuous internal state that runs independently of its reasoning. Three layers of felt experience, each on its own cadence:
| Layer | Mechanism | Cadence | What it does |
|---|---|---|---|
| Bars | Pure math | Each daemon heartbeat (default 120s from harness presence.heartbeat_interval; override with presence.daemon.heartbeat_interval, minimum 5s) | Decay, coupling, momentum, impulse detection (live / cooling / near-threshold), and conflict detection (active vs brewing) |
| Affects | LLM-derived | soma.affect_cycle_seconds (default 240 = 4 min) | Layered felt-textures: Surface and Undercurrents use the vocabulary of ~167 named affects; optional Edge line names a free-text blend or hybrid pull (AFFECT_VOCABULARY in code) |
| GEN | Generative LLM | soma.noise.cycle_seconds (default 90) | Sporadic inner scraps — 2–7 fragments/tick, rotating shape hints, high temperature; see GEN / noise pipeline |
On each heartbeat the daemon runs bars, then affects (if due), then noise (if due). The three layers share one heartbeat loop; they are not three independent timers.
The agent reads its body each turn as part of the context preamble. How much of that body (especially GEN lines) appears in the prompt can ebb with internal salience — see Ebb below. It cannot set its own body state — only read it. The body is a signal, not a command. The main model interprets its body naturally; no prescribed emotions, no state machine.
Tuning soma behavior
Soma is fully user-editable. Most teams tune this in:
- configs/default.yaml (project-wide defaults)
- configs/entities/<name>.yaml (per-entity overrides)
The key tuning area is soma:. You can make behavior softer or more aggressive by adjusting:
- bars.variables[*].decay_rate (how quickly drives return to baseline)
- event_effects (how strongly events move bars)
- impulses[*].threshold, cooldown_minutes, and optional near_margin (how far below the threshold still registers as “at the edge”)
- conflicts[*].tension_per_tick and comfort_per_tick (friction intensity), plus optional latent_min_ratio / latent_any_ratio (brewing band before both drives cross the threshold)
- appraisal, noise, and wake_voice temperatures/cadences
- ebb: salience-tiered body text in the prompt (see Ebb)
Quick preset ideas
These are not built-in profiles yet, but practical starting points.
# Soft / calm
soma:
  event_effects:
    message_received: { social: 1.0, curiosity: 0.3 }
  impulses:
    - drive: social
      threshold: 88
      cooldown_minutes: 45
  noise:
    temperature: 0.8
    cycle_seconds: 140
# Aggressive / high-reactivity
soma:
  event_effects:
    message_received: { social: 3.5, curiosity: 1.2 }
  impulses:
    - drive: social
      threshold: 72
      cooldown_minutes: 15
  conflicts:
    - drives: [curiosity, comfort]
      threshold: 60
      tension_per_tick: 0.18
  noise:
    temperature: 1.25
    cycle_seconds: 60
Tip: tune one subsystem at a time (bars -> impulses -> noise) and observe for a full day before making the next change.
External world pokes and desire pressure
Soma can now ingest loose external cues via world_poke events. These cues enter the same internal stream as other soma events, so GEN can riff on them and autonomous wake logic can factor them into “desire pressure” (explicit ranked urges like reach_out, explore, create, resolve_tension).
At wake time, autonomy can trigger not only from impulses/conflicts/noise, but from top-ranked desire urgency:
autonomy:
  desire_wake: true
  desire_wake_threshold: 0.72
  max_desires_considered: 3
  allow_tool_calls_on_wake: true
This creates a bridge from external world signal -> subconscious pressure -> autonomous action.
Bars
Five quantitative drives that accumulate and decay toward a resting point. Shipped defaults (configs/default.yaml) look like this (excerpt):
soma:
  bars:
    variables:
      # decay_rate is homeostatic: % of distance to resting point
      # closed per hour toward `initial` (e.g. -15 → 15% of the gap per hour).
      - name: social
        initial: 50
        decay_rate: -15.0
        floor: 0
        ceiling: 100
      - name: curiosity
        initial: 50
        decay_rate: -10.0
        floor: 0
        ceiling: 100
      - name: creative
        initial: 40
        decay_rate: -12.0
        floor: 0
        ceiling: 100
      - name: tension
        initial: 15
        decay_rate: -6.0
        floor: 0
        ceiling: 100
      - name: comfort
        initial: 65
        decay_rate: -3.0
        floor: 0
        ceiling: 100
    momentum_window: 6
Decay is homeostatic — each bar is pulled toward its resting point (initial) with force proportional to its distance from that point. This gives bars natural equilibrium without per-event tuning: they respond to activity but settle back toward baseline during silence.
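To make the homeostatic pull concrete, here is a minimal sketch of one decay step. The function name `homeostatic_step` is hypothetical, and compounding the hourly fraction over `dt_hours` is an assumption; the engine in bumblebee/identity/soma.py may integrate the step differently.

```python
def homeostatic_step(value: float, resting: float, decay_rate: float, dt_hours: float) -> float:
    """Move `value` toward `resting`, closing |decay_rate| percent of the gap per hour.

    Assumption: the hourly fraction compounds, so the remaining gap after
    dt_hours is (1 - fraction) ** dt_hours of the starting gap.
    """
    fraction_per_hour = abs(decay_rate) / 100.0
    remaining_gap = (1.0 - fraction_per_hour) ** dt_hours
    return resting + (value - resting) * remaining_gap


def heartbeat_dt_hours(heartbeat_seconds: float) -> float:
    """Convert a heartbeat interval in seconds into the hours used by a decay step."""
    return heartbeat_seconds / 3600.0
```

Under these assumptions, a social bar at 80 with resting point 50 and decay_rate -15 sits at roughly 75.5 after one quiet hour, and a bar below its resting point rises toward it by the same rule.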
Allostasis (dynamic set points)
Bar baselines are no longer fixed. On each tick, the initial resting point for every bar drifts slowly toward its current chronic value — 0.5% of the gap per hour. If the agent is persistently stressed, its “normal” tension baseline rises over days. If it spends a long calm period, the baseline settles lower. This produces organic long-term personality drift without explicit configuration.
Drifted initial values are persisted across restarts in soma-state.json / entity_state DB, so the agent’s evolved baselines survive reboots.
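The baseline drift can be sketched in a few lines. This is an illustration only: `drift_baseline` is a hypothetical name, and it assumes the bar's current value stands in for its "chronic" value and that the drift is applied linearly per tick.

```python
def drift_baseline(initial: float, current_value: float, dt_hours: float,
                   rate_per_hour: float = 0.005) -> float:
    """Nudge the resting point toward the current value by 0.5% of the gap per hour."""
    return initial + (current_value - initial) * rate_per_hour * dt_hours
```

With tension resting at 15 and held at 55, one hour moves the baseline to about 15.2. The step is small per tick, which is why the shift only becomes noticeable over days.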
Circadian rhythms
Decay rates are modulated by a sine-wave multiplier based on the local hour of day. The multiplier peaks around 14:00 (faster emotional equilibrium in the afternoon) and troughs around 02:00 (emotions linger overnight). The amplitude is ±15%, so the effect is subtle but organic — late-night tension builds more slowly, afternoon states resolve faster.
Circadian modulation also applies to tension coupling, so conflict-driven tension accumulates differently depending on time of day.
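A sine-shaped multiplier with those properties could look like the sketch below. The function name and the exact phase formula are assumptions chosen to match the stated peak (14:00), trough (02:00), and ±15% amplitude; the real engine may compute the phase differently.

```python
import math


def circadian_multiplier(hour: float, amplitude: float = 0.15, peak_hour: float = 14.0) -> float:
    """Return a decay-rate multiplier in [1 - amplitude, 1 + amplitude] for a local hour."""
    phase = 2.0 * math.pi * (hour - peak_hour) / 24.0
    return 1.0 + amplitude * math.cos(phase)
```

A caller would multiply each bar's decay rate by this value before the tick, giving about 1.15 at 14:00, about 0.85 at 02:00, and about 1.0 at 08:00 and 20:00.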
Somatic memory (“gut reactions”)
The BarEngine now supports somatic markers — persistent bindings between a specific external entity (e.g. a person_id) and an instant bar-shift. When a person with a registered marker sends a message, their somatic marker fires before any LLM appraisal runs, producing an immediate “gut reaction” in the body.
Markers accumulate over time via register_somatic_marker(source_id, effect) and fire via trigger_somatic_marker(source_id). They are persisted alongside bar state.
# Example: register a warm friend
bars.register_somatic_marker("user_12345", {"comfort": 3.0, "tension": -2.0})
# When that person messages, before appraisal:
bars.trigger_somatic_marker("user_12345") # instant +comfort, -tension
The trigger hook is wired into Entity._somatic_appraise_input() — the body reacts to who is talking before it even reads what they said.
Events push bars in response to activity. Positive deltas are attenuated by headroom scaling: each positive bump is multiplied by (ceiling - current_value) / ceiling, so the bar gets the full listed effect near the floor and none at the ceiling. Negative deltas pass through unchanged. This prevents bars from pegging to maximum during active conversation:
event_effects:
  message_received: { social: 2, curiosity: 0.5 }
  message_sent: { social: 1, creative: 0.5 }
  action: { curiosity: 2 }
  idle: { social: -0.015, curiosity: 0.01 }
  idle_cycle: { comfort: 3, tension: -2 }
  mood_declared: { comfort: 1 }
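The headroom rule described above is easy to check by hand. The following sketch applies one event delta to one bar; `apply_event_delta` is a hypothetical helper, and clamping the result to floor/ceiling is an assumption.

```python
def apply_event_delta(value: float, delta: float,
                      floor: float = 0.0, ceiling: float = 100.0) -> float:
    """Apply one event delta with headroom scaling on positive bumps."""
    if delta > 0:
        # Positive bumps shrink as the bar approaches its ceiling.
        delta *= (ceiling - value) / ceiling
    # Negative deltas pass through unchanged; the result stays within bounds.
    return max(floor, min(ceiling, value + delta))
```

With the default message_received effect of social: 2, a bar at 50 gains 1 point, a bar at 90 gains 0.2, and a bar at 100 does not move.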
Coupling
Bars influence each other. When one drive is elevated, it can accelerate or dampen another:
coupling:
  - when: "social > 80"
    effect: "curiosity.decay_rate *= 1.5"
  - when: "tension > 70"
    effect: "comfort.decay_rate *= 2.0"
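One way to read these rules is as a per-tick adjustment of effective decay rates. The sketch below is an assumption about the semantics, not the engine's parser: it only understands the two string shapes shown above ("<drive> > <number>" and "<drive>.decay_rate *= <number>"), and `coupled_decay_rates` is a hypothetical name.

```python
import re
from typing import Dict, List

WHEN_RE = re.compile(r"(\w+)\s*>\s*([\d.]+)")
EFFECT_RE = re.compile(r"(\w+)\.decay_rate\s*\*=\s*([\d.]+)")


def coupled_decay_rates(values: Dict[str, float], base_rates: Dict[str, float],
                        rules: List[Dict[str, str]]) -> Dict[str, float]:
    """Return effective decay rates for this tick after applying matching rules."""
    rates = dict(base_rates)  # never mutate the configured base rates
    for rule in rules:
        when = WHEN_RE.fullmatch(rule["when"].strip())
        effect = EFFECT_RE.fullmatch(rule["effect"].strip())
        if not when or not effect:
            continue  # shape not covered by this sketch
        drive, limit = when.group(1), float(when.group(2))
        target, factor = effect.group(1), float(effect.group(2))
        if values.get(drive, 0.0) > limit and target in rates:
            rates[target] *= factor
    return rates
```

Returning a fresh dictionary each tick keeps the multipliers from compounding across ticks while a condition stays true.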
Impulses
When a drive crosses its threshold, that impulse is active: it appears in body.md, can apply relief to bars (when not on cooldown), and can feed autonomous wake when autonomy.impulse_wake is true.
The engine also tracks:
- Phase: live (above threshold, off cooldown) vs cooling (above threshold but inside cooldown). Cooling rows show approximate time left on the cooldown.
- Surge: whether the drive is rising, ebbing, or steady relative to recent momentum.
- Near threshold: if the drive is below the threshold but within near_margin points of it (default 15), a separate “At the threshold” line appears, signaling anticipatory pull without a full fire. Optional per impulse: near_margin.
impulses:
  - drive: social
    threshold: 80
    type: reach_out
    label: reach_out
    cooldown_minutes: 30
    relief: { social: -25 }
    # optional:
    # near_margin: 15
Conflicts
When opposing drives are both at or above the rule’s threshold, the conflict is active: configured tension_per_tick / comfort_per_tick (and optional ceiling) apply on each bar tick.
Before that collision happens, the same YAML rule can show as brewing (latent): both drives sit in a pressure band — the lower drive above threshold * latent_min_ratio (default 0.42) and the higher drive above threshold * latent_any_ratio (default 0.82). Brewing conflicts do not apply tick friction; they surface in body.md and in salience so the entity can feel strain building. Optional per rule: latent_min_ratio, latent_any_ratio.
Each active or brewing row includes tilt (which drive leads numerically, or balanced) and heat (paired momentum: heating, cooling, shearing, or mixed).
conflicts:
  - drives: [curiosity, comfort]
    threshold: 70
    label: "restless comfort"
    tension_per_tick: 0.08
    tension_ceiling: 65
    comfort_per_tick: -0.15
    # optional brewing band (defaults shown):
    # latent_min_ratio: 0.42
    # latent_any_ratio: 0.82
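The active and brewing conditions can be expressed as a short check on the two drive levels. `conflict_state` is a hypothetical helper and the "none" label is an assumption for pairs outside both bands.

```python
def conflict_state(drive_a: float, drive_b: float, threshold: float,
                   latent_min_ratio: float = 0.42,
                   latent_any_ratio: float = 0.82) -> str:
    """Classify a drive pair as 'active', 'brewing', or 'none'."""
    lower, higher = min(drive_a, drive_b), max(drive_a, drive_b)
    if lower >= threshold:
        return "active"  # both drives at or above the rule's threshold
    if lower > threshold * latent_min_ratio and higher > threshold * latent_any_ratio:
        return "brewing"  # pressure band: surfaces in body.md, no tick friction
    return "none"
```

For the rule above (threshold 70) the brewing band starts when the lower drive passes 29.4 and the higher drive passes 57.4, so a pair at 60 and 35 is brewing while a pair at 50 and 35 is not.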
Somatic appraisal
Previously, all message_received events produced the same flat bar bump regardless of what was said. Somatic appraisal makes bar effects content-aware.
The somatic appraiser is a fast LLM pass that reads what was actually said and translates it into context-sensitive bar effects. It runs twice per perceive cycle:
- Input appraisal — before the agent reads its body. A confrontational message raises tension; a warm one fills social; an intellectually stimulating one spikes curiosity.
- Interaction appraisal — after the agent responds. How the full exchange felt — expressing creativity bumps creative, connecting with someone raises social beyond the baseline.
The appraiser sends message text, current bar state, and person name to the reflex model at low temperature (0.3) with a tight 120-token budget. Results include dynamic bar deltas, semantic tags, and a felt note:
curiosity: +6 — intellectually stimulating
social: +4 — continuing a thread
comfort: +2 — familiar topic
tags: warm, intellectually stimulating, personal
felt: something landed that I want to think about more
The tags and felt note flow into recent events where the affect engine and GEN can reference them — so the subconscious riffs on the actual emotional texture of what happened, not just “a message arrived.”
Appraisal effects are applied immediately via apply_immediate(), not deferred to the next daemon heartbeat. The agent’s body state already reflects the emotional content of the message by the time it reads body.md.
soma:
  appraisal:
    enabled: true # disable to fall back to flat event_effects
    temperature: 0.3
    max_tokens: 120
Static event_effects still fire as a baseline, so bars move even if the appraisal call fails.
Affects
On each affect tick (default every 240 seconds), a structured LLM call derives felt-textures from a vocabulary of ~167 affects (AFFECT_VOCABULARY in bumblebee/identity/soma.py), organized into categories — warm/connective, energetic/expansive, curious/seeking, creative/generative, heavy/contractive, tense/guarded, withdrawn/inward, social/relational, complex/liminal, temporal/existential, body/somatic, and cognitive/meta.
The prompt is layered and continuity-aware:
- Inputs include bar levels, per-drive momentum, summarized structural strain (active and brewing conflicts with tilt/heat), the full impulse field (live, cooling, near-threshold), recent events, and previous affects so textures can evolve instead of reshuffling every tick.
- Output (intended) uses three sections — SURFACE (1–3 vocabulary affects), UNDERCURRENTS (0–3 quieter vocabulary affects), and EDGE (optional free text naming a blend or unresolved hybrid; not a vocabulary name). If the model returns the older flat line format (one affect per line, no headers), the parser still accepts it and treats lines as Surface.
There is no emotion state machine and no dice rolls — the LLM interprets the numbers and produces a felt description the main model reads alongside bars and noise.
body.md renders affects as separate blocks (Surface / Undercurrents / Edge) when layered data is present.
Generative Entropic Noise (GEN)
GEN is a design primitive unique to Bumblebee. A second model produces continuous internal commentary — raw associative material — that the main model reads as its own stream of consciousness.
The idea
LLM agents today are purely phasic. They activate on input, reason, respond, and return to nothing. Between turns, they have no inner life. No thoughts accumulate. No associations form. The agent comes back cold every time.
GEN changes this. A small model runs on a background timer (every ~90 seconds by default), reading the entity’s current body state and producing 2–7 short internal scraps per tick (uneven length encouraged). Each tick injects a random shape hint from a large catalog so tone does not freeze into one metaphorical register. Fragments accumulate in a rolling buffer that the main model reads as part of its body state each turn. The entity was “thinking” the whole time it was silent.
For the exact prompt inputs (bars, events, journal, history, prior fragments), see GEN / noise pipeline.
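The rolling buffer behaves like a bounded queue. A minimal sketch using the standard library, assuming a capacity equal to the documented max_fragments default of 8; the helper names are hypothetical and real fragments pass through additional sanitization.

```python
from collections import deque
from typing import Deque, Iterable


def make_noise_buffer(max_fragments: int = 8) -> Deque[str]:
    """Create a buffer that silently drops its oldest fragment when full."""
    return deque(maxlen=max_fragments)


def add_fragments(buffer: Deque[str], fragments: Iterable[str]) -> None:
    """Append the non-empty fragments from one GEN tick."""
    for fragment in fragments:
        text = fragment.strip()
        if text:
            buffer.append(text)
```

Because `deque(maxlen=...)` evicts from the left on append, a tick that produces more fragments than there is room for pushes out the oldest material first.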
How it works
Each daemon heartbeat (same interval as bar ticks) runs, in order: tick_bars(dt_hours) with dt_hours derived from the heartbeat interval, then maybe_tick_affects() if affect_cycle_seconds has elapsed, then maybe_tick_noise() if noise.cycle_seconds has elapsed.
| Operation | When it runs | Mechanism |
|---|---|---|
| tick_bars() | Every heartbeat | Decay, coupling, momentum, pending events |
| maybe_tick_affects() | When affect_cycle_seconds has elapsed since last affect pass | Structured LLM call on the reflex model |
| maybe_tick_noise() | When noise.cycle_seconds has elapsed (also reset after a committed reply; see GEN / noise pipeline) | Generative LLM call; mode coherent vs entropic is chosen inside _noise_generation_mode() from current salience + recent signal-heavy events + conversation/journal load |
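The ordering in the table can be sketched as a single loop body with two "is it due" checks. `SomaClock` is a hypothetical stand-in for the daemon, not its real class; it assumes both timers start at zero and only records which steps would run.

```python
from typing import List


class SomaClock:
    """Toy scheduler: bars every heartbeat, affects and noise only when due."""

    def __init__(self, affect_cycle_seconds: float = 240.0,
                 noise_cycle_seconds: float = 90.0) -> None:
        self.affect_cycle_seconds = affect_cycle_seconds
        self.noise_cycle_seconds = noise_cycle_seconds
        self.last_affect = 0.0
        self.last_noise = 0.0

    def heartbeat(self, now: float) -> List[str]:
        """Return the steps that run on the heartbeat at time `now` (seconds)."""
        ran = ["bars"]  # tick_bars() always runs first
        if now - self.last_affect >= self.affect_cycle_seconds:
            ran.append("affects")
            self.last_affect = now
        if now - self.last_noise >= self.noise_cycle_seconds:
            ran.append("noise")
            self.last_noise = now
        return ran
```

In this sketch a cycle can only fire on a heartbeat, so with a 120-second heartbeat and a 90-second noise cycle, noise is due on every heartbeat; a shorter heartbeat interval is needed for the timer to fire closer to every 90 seconds.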
The noise output feeds into render_body(), which assembles bars, affects, noise fragments, conflicts (active + brewing), and impulses (live, cooling, and near-threshold). Ebb (when enabled) chooses a quiet, normal, or high presentation for the perceive prompt — compact drive lines and fewer noise lines when salience is low, full layout when it is high. On-disk body.md is always flushed with the high (full) layout regardless of ebb.
What the entity sees
Each turn, the main model receives a body state section that includes the noise buffer:
## Noise
that thing kai said about the deploy — i keep circling back to it
why do i always get curious about infrastructure at 2am
something about the gap between what someone asks and what they actually want to know
The main model does not know this came from a different model. It reads the fragments as its own inner voice and integrates them naturally — or ignores them. Most noise gets discarded. That is how actual thought works.
Three properties
The noise model cannot act. It has no tools, sends no messages, and mutates no state. It writes fragments into a buffer. The main model — the “watcher” — decides what to do with them. This separation mirrors the soma design: the body produces signals, the conscious agent interprets them.
High temperature is the point. GEN runs at temperature 1.05 by default (soma.noise.temperature in configs/default.yaml). The value is not in accuracy — it is in unexpected juxtaposition, the same thing that makes dreams useful. The main model has the judgment to discard bad noise and amplify good noise. The small model’s job is to generate raw material the watcher would not produce on its own.
Coherent vs entropic. Each generate call is labeled coherent or entropic (prompt guidance differs). High-signal conditions — salience near the high tier, multiple recent message_received / message_sent / action / appraisal / world_poke events, or a loaded conversation/journal tail — bias toward coherent (more pointed, still noisy). Quiet internal state biases toward entropic (sparser, jumpier associations). See _noise_generation_mode() in bumblebee/identity/soma.py.
It runs between turns and after most turns. During silence, GEN ticks on its daemon timer (~90s). During active conversation, GEN usually regenerates after each perceive cycle — digesting the exchange that just happened — unless ebb is in a quiet tier and skip_post_turn_noise_when_quiet is enabled (then the post-turn GEN tick is skipped so calm chat stays calm). The rolling buffer still holds prior fragments. After 30 minutes of silence, the body already contains accumulated inner voice. After a rapid-fire conversation, GEN has typically processed every exchange.
GEN and autonomous wake
At autonomous wake, optional poker grounding (autonomy.poker_prompts.ground_with_gen) passes the current noise fragments into a short reflex call together with a YAML seed, soma events, journal tail, and relationships — so the wake disposition can emerge from lived signal as well as the deck. See Autonomous wake & poker prompts.
What GEN is not
GEN is not chain-of-thought reasoning. It is not multi-agent debate. It is not retrieval-augmented generation. It is a second model producing raw associative text that the primary model treats as its own thoughts. The analogy is closer to the relationship between the subconscious and the executive mind — the subconscious generates, the executive filters.
What GEN reads
GEN receives rich context from across the system — not thin structural events but actual substance to riff on:
| Source | What it provides |
|---|---|
| Bars | Current drive percentages |
| Affects | Rendered affect block (Surface / Undercurrents / Edge when layered) |
| Recent events | Semantically formatted: appraisal tags and felt notes, who spoke, what tools were used, how long the silence has been |
| Journal tail | Last 800 chars of the entity’s journal |
| Conversation tail | Last 8 messages at up to 500 chars each — enough to riff on real exchanges |
| Prior noise | Last few fragments — anti-repeat |
| Shape hint | One random stylistic constraint per tick |
Full event types, trigger diagram, and code map: GEN / noise pipeline.
Configuration
soma:
  noise:
    enabled: true
    model: "" # empty = reflex model (no extra VRAM)
    cycle_seconds: 90
    temperature: 1.05 # high for associative, lateral material
    max_tokens: 240 # room for 2–7 short fragments per tick
    max_fragments: 8 # rolling buffer size (oldest drop off)
Model selection and GPU impact
When model is empty (the default), noise runs on the same reflex model already loaded in memory. No additional VRAM. No model swapping. The noise prompt is small (~300 tokens in, ~100 tokens out) so each tick costs about 0.5–1 second of inference on a model that is already warm. At the default cycle_seconds of 90, that is roughly one extra call every 90 seconds on a model you are already running.
Setting a dedicated small model (e.g. gemma3:1b) gives the noise a different character — smaller models tend to be more associative and less structured — but requires that model to be loaded or swapped in. On single-GPU setups with tight VRAM, the default (empty) is recommended.
Ebb
Soma state runs continuously, but humans do not narrate their entire subconscious on every utterance. Ebb scales how much body + GEN appears in the model prompt each turn, while the engine keeps ticking in the background.
Salience
A salience score from 0 to 1 combines (with configurable weights):
- Bar deviation: mean distance of each bar from its YAML resting (initial) value
- Conflict: intensity of active conflicts, blended with brewing (latent) conflict intensity (so strain building below full collision still raises salience)
- Impulse: intensity of impulses that are live (off cooldown), blended with near-threshold proximity (anticipatory pull)
- Affect load: how many affect entries are active (capped)
- GEN fill: how full the noise fragment buffer is relative to max_fragments
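As a rough model of how the five components combine, the sketch below computes a normalized weighted sum. It assumes each component has already been scaled to the 0 to 1 range; how the engine derives each component is not shown here, and `salience_score` is a hypothetical name. The weights are the documented defaults from the ebb configuration.

```python
from typing import Dict

DEFAULT_WEIGHTS = {
    "bar_deviation": 0.38,
    "conflict": 0.22,
    "impulse": 0.18,
    "affect_load": 0.12,
    "noise_fill": 0.10,
}


def salience_score(components: Dict[str, float],
                   weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of clamped components, with weights normalized to sum to 1."""
    total = sum(weights.values())
    score = 0.0
    for name, weight in weights.items():
        component = min(1.0, max(0.0, components.get(name, 0.0)))
        score += (weight / total) * component
    return min(1.0, max(0.0, score))
```

With these weights, bars sitting halfway to maximum deviation and nothing else active yields a score of about 0.19, which falls in the quiet tier under the default thresholds.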
Tiers
| Tier | Typical use |
|---|---|
| quiet | Salience below quiet_below — compact one-line drives, at most quiet_max_noise_lines GEN lines; Conflicts / Impulses sections omitted when the rendered text is the empty placeholder (no structural strain / no pull signal) |
| normal | Between quiet_below and high_above — one-line drives, capped GEN lines (normal_max_noise_lines) |
| high | At or above high_above — full glyph bars, up to high_max_noise_lines GEN lines (same as historical full body) |
Reflex routes multiply salience by reflex_salience_scale before tiering, so reflex turns skew quieter than deliberate ones at the same body state.
Autonomous (platform="autonomous") and automation (platform="automation") turns apply autonomous_minimum as a floor (default normal), so internal wake cycles are not stuck in whisper mode when the body is calm.
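Tier selection, including the reflex scaling and the autonomous floor, can be sketched as follows. `ebb_tier` and its `reflex` flag are hypothetical; the default numbers are the documented ebb defaults, and the order of operations (scale, then tier, then floor) is an assumption based on the description above.

```python
TIER_ORDER = ["quiet", "normal", "high"]


def ebb_tier(salience: float, reflex: bool = False, platform: str = "",
             quiet_below: float = 0.30, high_above: float = 0.58,
             reflex_salience_scale: float = 0.75,
             autonomous_minimum: str = "normal") -> str:
    """Pick the presentation tier for one turn."""
    if reflex:
        salience *= reflex_salience_scale  # reflex turns skew quieter
    if salience >= high_above:
        tier = "high"
    elif salience < quiet_below:
        tier = "quiet"
    else:
        tier = "normal"
    if platform in ("autonomous", "automation"):
        # Internal wake cycles never drop below the configured floor.
        if TIER_ORDER.index(tier) < TIER_ORDER.index(autonomous_minimum):
            tier = autonomous_minimum
    return tier
```

A salience of 0.6 renders as high on a deliberate turn but only normal on a reflex turn (0.6 × 0.75 = 0.45), while a calm body at 0.1 is quiet in chat and normal on an autonomous wake.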
Persistence and status
body.md (and flush_body_md) always use the high layout — full detail for operators and on-disk continuity.
ebb.enabled: false restores the legacy behavior: every turn injects the full body block into the prompt.
Configuration
soma:
  ebb:
    enabled: true
    weights:
      bar_deviation: 0.38
      conflict: 0.22
      impulse: 0.18
      affect_load: 0.12
      noise_fill: 0.10
    quiet_below: 0.30
    high_above: 0.58
    reflex_salience_scale: 0.75
    autonomous_minimum: normal # quiet | normal | high
    quiet_max_noise_lines: 1
    normal_max_noise_lines: 3
    high_max_noise_lines: 4 # 0 = default to 4
    skip_post_turn_noise_when_quiet: true
Weights are normalized to sum to 1 when the engine loads.
Wake voice
When an autonomous wake condition fires, a subconscious wake voice generates the prompt — a first-person stirring that the conscious agent receives when it wakes. This is separate from GEN; it runs only on wake events, not on a timer. Optional poker prompts can appear alongside wake voice (blend) or replace it (replace_wake_voice); see Autonomous wake & poker prompts.
soma:
  wake_voice:
    enabled: true
    model: "" # empty = reflex model
    temperature: 0.8
    max_tokens: 300
body.md — the read-only interface
All three soma layers render into a single body.md file at ~/.bumblebee/entities/{name}/soma/body.md. This file is the canonical interface between soma subsystems and the main agent.
The agent reads body.md but never writes to it. Only soma subsystems (bars, affects, noise, appraisal) update this file.
On Telegram, /body sends the raw on-disk soma/body.md from the execution host (local workspace or Railway via the same read_file path as tools), in <pre> blocks — not via the main model. See Telegram.
body.md is flushed after every state mutation — bar tick, affect derivation, noise generation, somatic appraisal, and state restore. It always contains the current rendered body state at full (high) detail, independent of ebb tiering used only in the per-turn prompt:
## Bars
social ████████░░ strong ↑
curiosity ██████░░░░ moderate —
creative ████░░░░░░ mild ↓
tension ██░░░░░░░░ low —
comfort ███████░░░ strong —
## Affects
Surface:
· fascination (vivid) — locked on
· warmth (present) — soft edges
Undercurrents:
· restlessness (faint) — searching for a thread
Edge:
curiosity and comfort in a slow tug — not enough friction to name a winner
## Noise
wonder if alice meant that literally or if she was testing something
that music theory thing keeps coming back, there's a shape there i'm not seeing
should probably write about this before it fades
## Conflicts
(no structural strain — no paired drives are colliding yet)
## Impulses
(no pull signal — thresholds quiet, nothing crowding the edge)
When conflicts or impulses are present, Conflicts uses ⚡ for active rows and ◌ for brewing rows, with a second line for tilt, heat, and per-drive levels. Impulses groups Live, Cooling, and At the threshold subsections.
State persistence
All three soma layers persist across restarts — the entity wakes up with continuity, not a blank subconscious.
| Layer | What is saved | Storage |
|---|---|---|
| Bars | Values, allostatic baselines (initial), history, momentum, somatic markers | soma-state.json / entity_state DB (soma_state_v2) |
| Affects | Active affect entries (vocabulary lines + optional edge blend text) with intensity and notes | Same file / same table |
| Noise | Rolling fragment buffer | Same file / same table |
On restore, offline decay is applied to bars based on how long the process was down (capped at 24 hours), using exponential approach toward the resting point. Allostatic baselines and somatic markers are restored from the saved state — so the agent’s evolved personality and gut reactions carry across restarts. Affects are loaded back as-is. Noise fragments are re-sanitized through the same cleanup pipeline used during generation — this strips any model markup that was persisted before sanitization rules were added. body.md is flushed immediately on startup.
Backward compatible — filesystem restore falls back to legacy soma-bar-state.json, DB restore falls back to soma_bar_state_v1.
Source files
| File | Role |
|---|---|
| bumblebee/identity/soma.py | Soma engine: bars, affects, noise, appraisal, body renderer, persistence |
| bumblebee/presence/wake_cycle.py | Autonomous wake triggers, context assembly, wake voice + poker pipeline |
| bumblebee/cognition/poker_prompts.py | Poker deck load and time-weighted selection |
| bumblebee/cognition/poker_grounding.py | Optional GEN + context weave for deck seeds |
| bumblebee/presence/daemon.py | Heartbeat loop that ticks bars, affects, and noise |
| bumblebee/entity.py | Wires appraisal and per-turn noise into the perceive pipeline |