Configuration - Bumblebee

Configuration lives in two places:

| File | Scope |
| --- | --- |
| `configs/default.yaml` | Harness defaults; apply to all entities |
| `configs/entities/<name>.yaml` | Per-entity overrides |

Where both files define the same key, the entity YAML takes precedence over the harness default. The autonomy: block is deep-merged onto the harness autonomy block (including the nested summon and poker_prompts blocks); it does not replace it wholesale.
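
As an illustration of the deep merge, consider a hypothetical entity file that sets only two autonomy keys. The inherited values in the comments are the harness defaults listed on this page; the override values are invented for the example.

```yaml
# configs/entities/<name>.yaml (hypothetical entity override)
autonomy:
  max_cycles_per_hour: 2
  summon:
    timeout_seconds: 60

# Effective autonomy after the deep merge (other harness keys omitted):
# autonomy:
#   enabled: true              # inherited from configs/default.yaml
#   max_cycles_per_hour: 2     # overridden by the entity (harness: 4)
#   summon:
#     enabled: true            # inherited; the nested block is merged, not replaced
#     timeout_seconds: 60      # overridden by the entity (harness: 30)
```

Because the merge is deep, the entity file does not need to restate summon.enabled to keep it.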

Harness defaults

deployment

deployment:
  mode: local   # local | hybrid_railway

Overridden by BUMBLEBEE_DEPLOYMENT_MODE when set.

inference

Brain endpoint selection. Env vars such as BUMBLEBEE_INFERENCE_PROVIDER override YAML when set (see Environment variables).

inference:
  provider: ""          # local | remote_gateway | openrouter | venice — empty derives from deployment.mode
  base_url: ""          # tunnel root (gateway) or OpenAI-compat root before /v1; empty uses defaults
  api_key_env: BUMBLEBEE_INFERENCE_GATEWAY_TOKEN
  model: ""
  timeout: 120
  pass_num_ctx: true    # set false (or BUMBLEBEE_INFERENCE_PASS_NUM_CTX=false) for strict OpenAI-compat hosts

openrouter and venice are optional presets for hosted APIs used in harness / product testing; defaults and keys are described in Hosted inference (testing).
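
A minimal sketch of selecting a hosted preset in YAML. The choice of openrouter and the decision to turn off pass_num_ctx are assumptions for illustration; whether a given host needs pass_num_ctx: false depends on how strictly it validates OpenAI-compatible requests.

```yaml
inference:
  provider: openrouter     # one of: local | remote_gateway | openrouter | venice
  base_url: ""             # empty: use the default root
  model: ""                # left empty here; fill in a model id your provider offers
  timeout: 120
  pass_num_ctx: false      # assumption: this host is a strict OpenAI-compat host
  # api_key_env is omitted on purpose; see Hosted inference (testing) for the key each preset expects
```

If BUMBLEBEE_INFERENCE_PROVIDER is set in the environment, it takes precedence over the provider value shown here.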

models

models:
  reflex: "gemma4:26b"
  deliberate: "gemma4:26b"
  embedding: "nomic-embed-text"

ollama

ollama:
  base_url: "http://localhost:11434"
  timeout: 120
  retry_attempts: 3
  retry_delay: 2.0

cognition

cognition:
  thinking_mode: true
  temperature: 0.75
  reflex_max_tokens: 1024
  deliberate_max_tokens: 16384
  thinking_budget: 4096
  max_context_tokens: 32768
  escalation_threshold: 0.4
  image_token_budget: 280
  tool_continuation_rounds: 21

identity

identity:
  emotion_decay_rate: 0.001
  drive_tick_interval: 120
  evolution_interval: 100
  narrative_interval: 500

memory

memory:
  database_path: "~/.bumblebee/entities/{entity_name}/memory.db"
  database_url: ""
  episode_significance_threshold: 0.3
  consolidation_interval: 7200
  memory_decay_rate: 0.0001
  max_recall_results: 10
  embedding_dimensions: 768
  narrative_every_n_consolidations: 3
  imprint_decay_half_life_seconds: 2592000
  imprint_recall_weight: 0.35

presence

presence:
  heartbeat_interval: 120
  initiative_cooldown: 1800
  typing_speed_base: 30
  typing_speed_variance: 0.3
  message_chunk_max: 400
  chunk_delay: 2.0

message_chunk_max and chunk_delay control human-paced multi-bubble delivery on Telegram (and chunk sizing on Discord). typing_speed_base and typing_speed_variance feed both the initial typing delay and the jittered, length-scaled gaps between bubbles on Telegram. See Telegram and Presence.
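
For example, an entity that should send shorter bubbles with less waiting between them might override the presence block along these lines. The values are illustrative, and this page does not state the units, so the comments describe assumed effects rather than documented behavior.

```yaml
presence:
  message_chunk_max: 240        # smaller cap per bubble (harness default: 400)
  chunk_delay: 1.0              # assumed to shorten the gap between bubbles (harness default: 2.0)
  typing_speed_base: 30         # unchanged from the harness default
  typing_speed_variance: 0.15   # assumed to narrow the jitter (harness default: 0.3)
```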

tools

tools:
  shell:
    enabled: true
    deny: ["rm -rf /", "sudo rm", "shutdown", "reboot", "mkfs", "dd if="]
    timeout: 30
  browser:
    enabled: false
  code:
    enabled: true
    timeout: 30
  imagegen:
    enabled: false
  voice:
    enabled: true
    voice_id: "en-US-GuyNeural"
  execution:
    base_url: ""
    allow_local: false
    require_railway: false
    workspace_dir: ""

soma

See Soma architecture for full soma configuration including bars, coupling, events, impulses (optional near_margin per rule), conflicts (optional latent_min_ratio / latent_any_ratio for brewing strain), affects, noise, wake voice, and ebb (salience-tiered body rendering in the prompt). Salience incorporates brewing conflicts and near-threshold impulses as well as active ones.

soma:
  ebb:
    enabled: true
    quiet_below: 0.30
    high_above: 0.58
    reflex_salience_scale: 0.75
    autonomous_minimum: normal
    skip_post_turn_noise_when_quiet: true

The full set of ebb keys (including salience weights) is listed in configs/default.yaml. Dreams (offline memory recombination during idle) are configured under soma.dreams. See Dream consolidation.

soma:
  dreams:
    enabled: false
    min_silence_seconds: 3600
    min_gap_seconds: 14400
    dream_hours: [0, 1, 2, 3, 4, 5]
    temperature: 1.15
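
To turn dreams on for a single entity, a sketch like the following could go in its entity file. The values are illustrative, and the comments give assumed meanings inferred from the key names, which this page does not define; Dream consolidation is the authority.

```yaml
soma:
  dreams:
    enabled: true                # harness default: false
    min_silence_seconds: 7200    # assumed: idle time required before a dream can start
    min_gap_seconds: 21600       # assumed: minimum spacing between dreams
    dream_hours: [1, 2, 3, 4]    # assumed: hours of the day in which dreams may run
```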

autonomy

autonomy:
  enabled: true
  min_cycle_gap_seconds: 600
  max_cycles_per_hour: 4
  messages_per_cycle: 2
  base_wake_interval_min: 20
  base_wake_interval_max: 45
  silence_threshold_seconds: 120
  impulse_wake: true
  drive_wake: true
  conflict_wake: true
  noise_wake: false
  desire_wake: true
  desire_wake_threshold: 0.72
  max_desires_considered: 3
  allow_tool_calls_on_wake: true
  # Sustained multi-round wake (single trigger → multiple perceive rounds)
  wake_session_max_rounds: 1
  wake_session_wall_seconds: 1200
  wake_session_say_budget_per_round: 6
  wake_session_pause_seconds: 2.0
  wake_session_extra_tool_steps: 10
  wake_wide_mode: false
  wake_wide_bonus_steps: 16
  wake_user_visible_status: true
  wake_verbose_worker_log: true
  transcript_enabled: true
  transcript_filename: autonomy_transcript.md
  transcript_path: ""           # optional; absolute or relative to workspace / entity dir
  wake_chat_tool_activity: false      # mirror per-tool lines to Telegram during autonomous wake
  poker_prompts:
    enabled: false
    time_weighted: true
    mode: blend              # blend | replace_wake_voice
    prompts_path: ""       # default: configs/poker_prompts/default.yaml
    ground_with_gen: true
    grounding_model: ""
    grounding_temperature: 0.72
    grounding_max_tokens: 300
  summon:
    enabled: true
    timeout_seconds: 30

Set autonomy.enabled: false to turn off autonomous wake (no timer or full wake perceive cycles). The daemon then skips wake evaluation. Legacy drive-based initiative may still send occasional proactive messages unless you tune cooldowns; see Autonomous wake → Disabling autonomous wake and Presence → Initiative and wake cycles. For sustained sessions, wide mode, visibility, logging, poker decks, and GEN grounding, see Autonomous wake & poker prompts.
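
A sketch of an entity override that disables autonomous wake and also makes legacy initiative rarer. Treating presence.initiative_cooldown as one of the cooldowns to tune is an assumption based on the harness keys listed on this page; the value is illustrative.

```yaml
autonomy:
  enabled: false                # no timer, no full wake perceive cycles
presence:
  initiative_cooldown: 86400    # harness default: 1800; assumed to govern legacy initiative
```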

Entity YAML

Entity files can override any harness default and add entity-specific configuration:

Unique to entities

| Key | Purpose |
| --- | --- |
| `name` | Entity display name |
| `created` | Creation timestamp |
| `personality` | Core traits, behavioral patterns, voice, backstory |
| `drives` | Curiosity topics, attachment threshold, initiative cooldown |
| `presence.platforms` | Platform list (CLI, Telegram, Discord) |
| `automations` | Scheduled routines, emergence, journal |
| `mcp_servers` | MCP stdio server definitions |

Overridable from harness

| Key | Example |
| --- | --- |
| `cognition.reflex_model` | Use a different reflex model |
| `cognition.deliberate_model` | Use a different deliberate model |
| `cognition.thinking_mode` | Enable/disable thinking per entity |
| `cognition.max_context_tokens` | Per-entity context window |
| `cognition.history_compression` | Per-entity compaction settings |
| `tools.*` | Enable/disable tools per entity |
| `firecrawl` | Per-entity Firecrawl settings |
| `autonomy` | Merge onto harness `autonomy` (wake sessions, poker, wide mode, visibility); see Autonomous wake |

The harness file (configs/default.yaml) does not list every autonomy key in the table above; defaults in code match the repository. Prefer copying from configs/default.yaml or configs/entities/canary.example.yaml for up-to-date snippets. See configs/entities/example.yaml for a full template with comments.
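
As a rough sketch, a small entity file combining a few of the overridable keys might look like this. The display name and all values are hypothetical, and entity-only blocks such as personality, drives, and automations are left out because their schemas are not shown on this page.

```yaml
# configs/entities/<name>.yaml
name: "Example"                  # hypothetical display name
cognition:
  thinking_mode: false           # harness default: true
  max_context_tokens: 16384      # harness default: 32768
tools:
  browser:
    enabled: true                # harness default: false
autonomy:                        # deep-merged onto the harness autonomy block
  wake_wide_mode: true           # harness default: false
```

Keys the entity file does not mention keep their harness defaults.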
