GEN / noise pipeline

GEN (Generative Entropic Noise) is the small-model stream that fills the ## Noise section of the body the entity reads. In code: NoiseEngine inside TonicBody (bumblebee/identity/soma.py). This page is the data-flow companion to Soma. Read that page for tuning and philosophy; read this one to see exactly what hits the noise LLM.

GEN is not the main model’s chain-of-thought. It does not receive raw tool outputs, full system prompts, or the entire chat log. It does not read body.md as input — it writes fragments that appear when the body is rendered and flushed.

When noise runs

TonicBody.maybe_tick_noise calls NoiseEngine.generate only if noise is enabled and noise.should_tick() is true (wall time since the last tick ≥ soma.noise.cycle_seconds). Call sites:

Presence daemon heartbeat (bumblebee/presence/daemon.py) — builds journal_tail + conversation_tail from the live entity.
After each committed turn (Entity._tick_noise_post_turn in bumblebee/entity.py) — the noise clock is reset so GEN can refresh during active chat.

Ebb: If soma.ebb is on with skip_post_turn_noise_when_quiet: true, post-turn regeneration may be skipped when the presentation tier is quiet (low salience).

What is passed into each `generate` call

Input	Source
Bars summary	One line of current bar percentages (`snapshot_pct()`).
Affects summary	Rendered current affects (from the separate affect LLM pass, not from noise).
Recent events	Last 5 entries on `TonicBody._recent_events`.
Journal tail	Last ~800 characters of `journal.md` if present.
Conversation tail	Last 8 non-`system` messages from `Entity._history`, ~500 chars each (`user:` / `assistant:`).
Entity name	Config name.
Prior noise	Last 4 fragments — “do not repeat” context.
Shape hint	One random line from `_NOISE_SHAPE_HINTS` (large catalog: mundane pings, sensory beats, schedule thoughts, explicit anti-metaphor nudges, etc.).

Model: soma.noise.model if set, else reflex/deliberate fallback. Temperature and max_tokens come from soma.noise (default max_tokens is sized for 2–7 short fragments per tick).

Exogenous seed sources (`NoiseSeeder`)

Before GEN generates fragments, the NoiseSeeder (bumblebee/identity/noise_seeder.py) selects one exogenous seed per tick from a weighted source pool. These seeds bias the noise toward specific associative domains:

Source key	What it produces
`episodic_random`	A random old episode summary — memory resurfacing
`belief_random`	A stored belief with confidence score
`knowledge_random`	A section from the entity’s knowledge file
`world_discovery`	A concept from the concept corpus or Wikipedia, with associative chaining
`relationship_echo`	A relational document tail — thinking about someone
`journal_echo`	A snippet from the entity’s journal
`temporal`	Time-of-day or day-of-week contextual seed
`counterfactual_simulation`	New. Picks a past episode and frames it as “what if I had acted differently?” — self-reflection on past actions
`dream_state`	New. Surreal cross-domain association (e.g. “how does cooking relate to physics?”) — activates during dormant periods
`web_venturing`	New. Explicit seed urging the agent to follow a curiosity onto the internet — search, read articles, learn something new

Mood-congruent daydreaming

Seed source weights are dynamically biased by the current Soma state. The NoiseSeeder._suppress_weights() method reads bar percentages from tonic.bars.snapshot_pct() and adjusts weights:

High tension (>75%) → 2× episodic_random (rumination), 0.2× world_discovery (agent turns inward)
Low social / high loneliness → 2.5× relationship_echo (agent thinks about people)
High curiosity (>70%) → 2× web_venturing, 1.5× world_discovery (agent looks outward)

This creates a tight feedback loop: the body’s felt state directly shapes what the agent daydreams about.

Associative chaining

The world_discovery source now implements associative chaining instead of pure random concept selection. After picking a concept, it stores the concept in _last_concept_thread. On the next tick (70% of the time), it scores candidate concepts by lexical overlap with the previous concept and preferentially picks from the top-5 most related. This creates daydream-like threads where one thought naturally leads to another, rather than disconnected jumps.

Web venturing and autonomous exploration

When web_venturing fires as a seed source, and the wake cycle detects this in the GEN fragment buffer, the wake engine auto-escalates to wide mode — giving the agent more rounds and tool budget to actually follow through on the curiosity. The salience bias block also injects explicit encouragement to use search_web, fetch_url, and other tools to explore the real internet. This is the primary mechanism by which the agent’s internal state drives it to autonomously venture out into the world. Intuition: noise riffs on how the body feels + a thin event strip + diary scrap + chat tail — so recent themes (lots of web/tools) show up via event names and history, while raw API prose usually only appears if it is already in chat or journal.

Recent events (what `_recent_events` contains)

Formatted for the prompt by _format_event_for_noise:

Event	Typical meaning
`message_received`	Turn start — who, length, platform/channel.
`message_sent`	After commit — recipient, length, platform.
`action`	Each tool call — tool name + `ok` / `error` (not full output).
`idle`	Daemon — long silence, minutes.
`mood_declared`	End-turn mood from tool state, if any.
`world_poke`	External cue via `poke_world` (TTL’d).

Appraisal-shaped texture (tags, felt notes) flows into events where applicable — see Soma → Somatic appraisal.

Generation behavior (prompting)

Batch size: each completion yields 2–7 parsed fragments (newlines / blank lines), then capped before merging into the rolling deque (max_fragments still limits total buffer size).
Voice: prompts push uneven subconscious scraps and discourage one long metaphorical monologue (including repeated “tech spirituality” clichés) unless a shape hint steers otherwise.
Short lines: fragments can be very short (minimum length after sanitization is low) so spikes like ok or hm can survive if the model emits them.
Shape pressure: one random instruction per tick steers form (e.g. very short blunt line, no questions, sensory-only, “avoid API/map/ink imagery this batch”). See _NOISE_SHAPE_HINTS in bumblebee/identity/soma.py.

Ebb is orthogonal: it only changes how much rendered noise appears in the main perceive prompt (quiet / normal / high) and whether post-turn regeneration is skipped — not the generation rules above.

Code pointers

Piece	Location
Noise LLM + shape hints	`NoiseEngine.generate`, `_NOISE_SHAPE_HINTS` — `bumblebee/identity/soma.py`
Gate + bars/affects/events assembly	`TonicBody.maybe_tick_noise` — same file
Event log	`emit`, `apply_immediate`, `_recent_events` — same file
Exogenous seeds	`NoiseSeeder` — `bumblebee/identity/noise_seeder.py`
Daemon	`PresenceDaemon` heartbeat — `bumblebee/presence/daemon.py` (`_build_conversation_tail`)
Post-turn	`Entity._tick_noise_post_turn` — `bumblebee/entity.py`
Ebb skip	`should_skip_post_turn_noise` — `bumblebee/identity/soma.py`
Wake auto-escalation	`_gen_has_web_venturing`, `_soma_curiosity_high` — `bumblebee/presence/wake_cycle.py`

Soma — bars, affects, GEN overview, ebb, body.md, configuration.
Dream consolidation — offline memory recombination during idle; outputs [dream]-tagged fragments into the same GEN buffer.
Telegram guide — busy indicator (harness UX, separate from GEN).

​When noise runs

​What is passed into each generate call

​Exogenous seed sources (NoiseSeeder)

​Mood-congruent daydreaming

​Associative chaining

​Web venturing and autonomous exploration

​Recent events (what _recent_events contains)

​Generation behavior (prompting)

​Code pointers

​Related

When noise runs

What is passed into each `generate` call

Exogenous seed sources (`NoiseSeeder`)

Mood-congruent daydreaming

Associative chaining

Web venturing and autonomous exploration

Recent events (what `_recent_events` contains)

Generation behavior (prompting)

Code pointers

Related