Configuration lives in two places:
| File | Scope |
| --- | --- |
| configs/default.yaml | Harness defaults that apply to all entities |
| configs/entities/<name>.yaml | Per-entity overrides |

Entity YAML overrides harness defaults where both define the same key. The autonomy: block is a deep merge onto harness autonomy (including nested summon / poker_prompts), not a wholesale replacement.
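The difference between a deep merge and a wholesale replacement is easiest to see on a nested key. The following is a minimal sketch, not the harness's actual loader: `deep_merge` is a hypothetical helper, and the entity override is invented for illustration. The harness values are the defaults listed on this page.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override wins and nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Subset of the harness autonomy defaults shown on this page.
harness_autonomy = {
    "enabled": True,
    "max_cycles_per_hour": 4,
    "summon": {"enabled": True, "timeout_seconds": 30},
}

# Hypothetical entity override that touches a single nested key.
entity_autonomy = {"summon": {"timeout_seconds": 60}}

merged_autonomy = deep_merge(harness_autonomy, entity_autonomy)
print(merged_autonomy)
```

With a deep merge, `summon.enabled` survives even though the entity file only mentions `summon.timeout_seconds`. A wholesale replacement would have dropped it.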

Harness defaults

deployment

deployment:
  mode: local   # local | hybrid_railway
Overridden by BUMBLEBEE_DEPLOYMENT_MODE when set.

inference

Selects the brain endpoint. Environment variables such as BUMBLEBEE_INFERENCE_PROVIDER override the YAML values when set (see Environment variables).
inference:
  provider: ""          # local | remote_gateway | openrouter | venice — empty derives from deployment.mode
  base_url: ""          # tunnel root (gateway) or OpenAI-compat root before /v1; empty uses defaults
  api_key_env: BUMBLEBEE_INFERENCE_GATEWAY_TOKEN
  model: ""
  timeout: 120
  pass_num_ctx: true    # set false (or BUMBLEBEE_INFERENCE_PASS_NUM_CTX=false) for strict OpenAI-compat hosts
openrouter and venice are optional presets for hosted APIs used in harness / product testing; defaults and keys are described in Hosted inference (testing).
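The precedence described in this section (environment variable, then YAML, then a value derived from deployment.mode) can be sketched as below. This is an illustration under stated assumptions, not the harness's code: `resolve_inference` is a hypothetical function, the mode-to-provider mapping is a guess, and the set of strings treated as false is an assumption. Only the two environment variable names come from this page.

```python
import os

# ASSUMPTION: illustrative mapping only; the real derivation may differ.
MODE_TO_PROVIDER = {"local": "local", "hybrid_railway": "remote_gateway"}

# ASSUMPTION: which strings count as "false" is not specified on this page.
FALSE_STRINGS = {"0", "false", "no", "off"}


def resolve_inference(yaml_cfg: dict, deployment_mode: str, env=None) -> dict:
    """Resolve provider and pass_num_ctx with env > YAML > derived precedence."""
    env = os.environ if env is None else env

    # An empty string counts as "not set" at every level.
    provider = (
        env.get("BUMBLEBEE_INFERENCE_PROVIDER")
        or yaml_cfg.get("provider")
        or MODE_TO_PROVIDER.get(deployment_mode, "local")
    )

    raw = env.get("BUMBLEBEE_INFERENCE_PASS_NUM_CTX")
    if raw is None:
        pass_num_ctx = bool(yaml_cfg.get("pass_num_ctx", True))
    else:
        pass_num_ctx = raw.strip().lower() not in FALSE_STRINGS

    return {"provider": provider, "pass_num_ctx": pass_num_ctx}


# Empty YAML provider, nothing in the environment: derived from the mode.
print(resolve_inference({"provider": ""}, "hybrid_railway", env={}))

# Environment variables win over YAML.
print(resolve_inference(
    {"provider": "openrouter", "pass_num_ctx": True},
    "local",
    env={
        "BUMBLEBEE_INFERENCE_PROVIDER": "venice",
        "BUMBLEBEE_INFERENCE_PASS_NUM_CTX": "false",
    },
))
```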

models

models:
  reflex: "gemma4:26b"
  deliberate: "gemma4:26b"
  embedding: "nomic-embed-text"

ollama

ollama:
  base_url: "http://localhost:11434"
  timeout: 120
  retry_attempts: 3
  retry_delay: 2.0

cognition

cognition:
  thinking_mode: true
  temperature: 0.75
  reflex_max_tokens: 1024
  deliberate_max_tokens: 16384
  thinking_budget: 4096
  max_context_tokens: 32768
  escalation_threshold: 0.4
  image_token_budget: 280
  tool_continuation_rounds: 21

identity

identity:
  emotion_decay_rate: 0.001
  drive_tick_interval: 120
  evolution_interval: 100
  narrative_interval: 500

memory

memory:
  database_path: "~/.bumblebee/entities/{entity_name}/memory.db"
  database_url: ""
  episode_significance_threshold: 0.3
  consolidation_interval: 7200
  memory_decay_rate: 0.0001
  max_recall_results: 10
  embedding_dimensions: 768
  narrative_every_n_consolidations: 3
  imprint_decay_half_life_seconds: 2592000
  imprint_recall_weight: 0.35
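The default imprint_decay_half_life_seconds of 2592000 is 30 days. The sketch below shows what a half-life means numerically, assuming the conventional exponential form. How the harness actually applies the decay, and how imprint_recall_weight enters recall scoring, is not specified on this page. `imprint_strength` is a hypothetical helper.

```python
HALF_LIFE_SECONDS = 2_592_000  # imprint_decay_half_life_seconds from the defaults
SECONDS_PER_DAY = 86_400


def imprint_strength(initial: float, age_seconds: float,
                     half_life: float = HALF_LIFE_SECONDS) -> float:
    """Conventional half-life decay: the value halves every half_life seconds."""
    return initial * 0.5 ** (age_seconds / half_life)


print(HALF_LIFE_SECONDS / SECONDS_PER_DAY)             # 30.0 (days)
print(imprint_strength(1.0, HALF_LIFE_SECONDS))        # 0.5 after one half-life
print(imprint_strength(1.0, 2 * HALF_LIFE_SECONDS))    # 0.25 after two
```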

presence

presence:
  heartbeat_interval: 120
  initiative_cooldown: 1800
  typing_speed_base: 30
  typing_speed_variance: 0.3
  message_chunk_max: 400
  chunk_delay: 2.0
message_chunk_max and chunk_delay control human-paced multi-bubble delivery on Telegram (and chunk sizing on Discord). typing_speed_base / typing_speed_variance feed both the initial typing delay and the jitter + length-scaled gaps between bubbles on Telegram. See Telegram and Presence.
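A rough sketch of how these four keys could interact is shown below. It is not the harness's delivery code. It assumes typing_speed_base is measured in characters per second and that the gap before each bubble is chunk_delay plus a jittered, length-scaled typing time. `plan_bubbles` is a hypothetical name.

```python
import random
import textwrap


def plan_bubbles(text, message_chunk_max=400, chunk_delay=2.0,
                 typing_speed_base=30, typing_speed_variance=0.3, rng=None):
    """Split text into bubbles and pair each with a delay in seconds."""
    rng = rng or random.Random()
    # textwrap.wrap breaks on whitespace, keeps each chunk within the width,
    # and replaces newlines with spaces by default.
    chunks = textwrap.wrap(text, width=message_chunk_max)
    plan = []
    for chunk in chunks:
        # ASSUMPTION: variance is a symmetric fractional jitter on typing time.
        jitter = 1.0 + rng.uniform(-typing_speed_variance, typing_speed_variance)
        typing_seconds = len(chunk) / typing_speed_base * jitter
        plan.append((chunk, chunk_delay + typing_seconds))
    return plan


message = "word " * 200  # 1000 characters of sample text
for chunk, delay in plan_bubbles(message, rng=random.Random(0)):
    print(len(chunk), round(delay, 1))
```

Passing a seeded `random.Random` keeps the jitter reproducible in tests. Production code would normally leave it unseeded.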

tools

tools:
  shell:
    enabled: true
    deny: ["rm -rf /", "sudo rm", "shutdown", "reboot", "mkfs", "dd if="]
    timeout: 30
  browser:
    enabled: false
  code:
    enabled: true
    timeout: 30
  imagegen:
    enabled: false
  voice:
    enabled: true
    voice_id: "en-US-GuyNeural"
  execution:
    base_url: ""
    allow_local: false
    require_railway: false
    workspace_dir: ""
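One plausible reading of tools.shell.deny is a substring check against the command string. The sketch below assumes exactly that; the harness's real matching rules may differ. `is_denied` is a hypothetical helper.

```python
# Default deny entries from tools.shell.deny on this page.
DENY = ["rm -rf /", "sudo rm", "shutdown", "reboot", "mkfs", "dd if="]


def is_denied(command: str, deny=DENY) -> bool:
    """Return True if any deny entry appears in the whitespace-normalized command."""
    normalized = " ".join(command.split())
    return any(pattern in normalized for pattern in deny)


print(is_denied("ls -la"))              # False
print(is_denied("sudo   rm -r build"))  # True: matches "sudo rm" after normalization
print(is_denied("rm -rf /tmp/cache"))   # True: "rm -rf /" is a prefix of the path form
```

Under this assumption the list is a coarse guard rather than a sandbox. It blocks some harmless commands, such as the `/tmp` case above, and simple rewording would get past it.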

soma

See Soma architecture for the full soma configuration: bars, coupling, events, impulses (optional near_margin per rule), conflicts (optional latent_min_ratio / latent_any_ratio for brewing strain), affects, noise, wake voice, and ebb (salience-tiered body rendering in the prompt). Salience incorporates brewing conflicts and near-threshold impulses as well as active ones.
soma:
  ebb:
    enabled: true
    quiet_below: 0.30
    high_above: 0.58
    reflex_salience_scale: 0.75
    autonomous_minimum: normal
    skip_post_turn_noise_when_quiet: true
The full set of keys, including the salience weights, is in configs/default.yaml.
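The ebb thresholds suggest a three-tier classification of salience. The sketch below is a guess at that shape, not the documented algorithm. The tier names "quiet" and "high" are inferred from the key names, and "normal" appears only as the value of autonomous_minimum. Applying reflex_salience_scale as a multiplier on reflex turns is also an assumption. `ebb_tier` is a hypothetical function.

```python
def ebb_tier(salience: float, quiet_below: float = 0.30, high_above: float = 0.58,
             reflex: bool = False, reflex_salience_scale: float = 0.75) -> str:
    """Map a salience score to a tier using the ebb thresholds."""
    if reflex:
        # ASSUMPTION: reflex turns scale salience down before tiering.
        salience *= reflex_salience_scale
    if salience < quiet_below:
        return "quiet"
    if salience > high_above:
        return "high"
    return "normal"


print(ebb_tier(0.20))               # quiet
print(ebb_tier(0.45))               # normal
print(ebb_tier(0.70))               # high
print(ebb_tier(0.70, reflex=True))  # normal, since 0.70 * 0.75 is below 0.58
```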

autonomy

autonomy:
  enabled: true
  min_cycle_gap_seconds: 600
  max_cycles_per_hour: 4
  messages_per_cycle: 2
  base_wake_interval_min: 20
  base_wake_interval_max: 45
  silence_threshold_seconds: 120
  impulse_wake: true
  drive_wake: true
  conflict_wake: true
  noise_wake: false
  desire_wake: true
  desire_wake_threshold: 0.72
  max_desires_considered: 3
  allow_tool_calls_on_wake: true
  # Sustained multi-round wake (single trigger → multiple perceive rounds)
  wake_session_max_rounds: 1
  wake_session_wall_seconds: 1200
  wake_session_say_budget_per_round: 6
  wake_session_pause_seconds: 2.0
  wake_session_extra_tool_steps: 10
  wake_wide_mode: false
  wake_wide_bonus_steps: 16
  wake_user_visible_status: true
  wake_verbose_worker_log: true
  transcript_enabled: true
  transcript_filename: autonomy_transcript.md
  transcript_path: ""           # optional; absolute or relative to workspace / entity dir
  wake_chat_tool_activity: false      # mirror per-tool lines to Telegram during autonomous wake
  poker_prompts:
    enabled: false
    time_weighted: true
    mode: blend              # blend | replace_wake_voice
    prompts_path: ""         # default: configs/poker_prompts/default.yaml
    ground_with_gen: true
    grounding_model: ""
    grounding_temperature: 0.72
    grounding_max_tokens: 300
  summon:
    enabled: true
    timeout_seconds: 30
Set autonomy.enabled: false to turn off autonomous wake: no timer and no full wake perceive cycles. The daemon then skips wake evaluation. Legacy drive-based initiative may still send occasional proactive messages unless you tune cooldowns; see Autonomous wake → Disabling autonomous wake and Presence → Initiative and wake cycles. For sustained sessions, wide mode, visibility, logging, poker decks, and GEN grounding, see Autonomous wake & poker prompts.
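The two pacing keys, min_cycle_gap_seconds and max_cycles_per_hour, can be read as a pair of gates on starting a wake cycle. The sketch below assumes a minimum gap since the latest cycle plus a sliding one-hour window. The daemon's actual accounting may differ. `may_start_cycle` is a hypothetical helper.

```python
def may_start_cycle(now: float, previous_starts: list,
                    min_cycle_gap_seconds: float = 600,
                    max_cycles_per_hour: int = 4) -> bool:
    """Return True if a new wake cycle may start at time `now` (seconds)."""
    # Gate 1: enforce the minimum gap since the most recent cycle.
    if previous_starts and now - max(previous_starts) < min_cycle_gap_seconds:
        return False
    # Gate 2: ASSUMPTION, a sliding 3600-second window for the hourly cap.
    recent = [t for t in previous_starts if now - t < 3600]
    return len(recent) < max_cycles_per_hour


starts = [700, 1500, 2300, 3100]
print(may_start_cycle(1000, [500]))   # False: only 500 s since the last cycle
print(may_start_cycle(4000, starts))  # False: four cycles in the last hour
print(may_start_cycle(4400, starts))  # True: the cycle at 700 has aged out
```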

Entity YAML

Entity files can override any harness default and add entity-specific configuration:

Unique to entities

| Key | Purpose |
| --- | --- |
| name | Entity display name |
| created | Creation timestamp |
| personality | Core traits, behavioral patterns, voice, backstory |
| drives | Curiosity topics, attachment threshold, initiative cooldown |
| presence.platforms | Platform list (CLI, Telegram, Discord) |
| automations | Scheduled routines, emergence, journal |
| mcp_servers | MCP stdio server definitions |

Overridable from harness

| Key | Example |
| --- | --- |
| cognition.reflex_model | Use a different reflex model |
| cognition.deliberate_model | Use a different deliberate model |
| cognition.thinking_mode | Enable/disable thinking per entity |
| cognition.max_context_tokens | Per-entity context window |
| cognition.history_compression | Per-entity compaction settings |
| tools.* | Enable/disable tools per entity |
| firecrawl | Per-entity Firecrawl settings |
| autonomy | Merge onto harness autonomy (wake sessions, poker, wide mode, visibility); see Autonomous wake |

The harness file (configs/default.yaml) does not list every autonomy key shown above; unlisted keys use the defaults defined in code, which match the repository. For up-to-date snippets, prefer copying from configs/default.yaml or configs/entities/canary.example.yaml. See configs/entities/example.yaml for a full template with comments.