Documentation Index
Fetch the complete documentation index at: https://docs.bumbleagi.com/llms.txt
Use this file to discover all available pages before exploring further.

Who it’s for
Assistant-first
You want a harness purpose-built for Gemma 4, not a generic framework with the model bolted on. Use Bumblebee as a serious local agent: multi-step tools, retrieval, and extensibility—whether or not you lean into “digital entity” framing.
Local inference, not API bills
If you are tired of large monthly hosted API spend (for example high-volume calls to frontier inference APIs), you can run open weights on a consumer GPU and get agentic behavior in the same league as strong open-source stacks—without per-token metering for your default path.
Persistent presence
You want the entitative path: memory that accrues, a voice that evolves, soma and GEN, proactive wake cycles—the “one self across platforms” design. That is Bumblebee’s center of gravity; the assistant use case rides the same harness.
Why Bumblebee
- For builders
You configure an entity in YAML (traits, voice quirks, backstory, drives, platforms) and Bumblebee runs it as a persistent being across CLI, Telegram, and Discord. The entity develops opinions, relationships, and habits over time. It remembers across sessions and platforms. On the default local stack, it runs without per-token API fees.
Architecture
Five pillars, one design question: how does this entity exist more fully?
Cognition
A phased perceive pipeline decomposes each turn into discrete stages: input processing, memory retrieval, prompt assembly, context compaction, a bounded agent loop with parallel tool execution, and reply delivery. Both reflex and deliberate reasoning profiles share the same tool registry and model weights.
Soma — body state and GEN
A tonic body state engine provides continuous internal experience independent of conversation. Three layers: quantitative drive bars with decay and momentum (plus impulses and conflicts, including near-threshold and brewing strain), layered LLM-derived affects (surface, undercurrents, optional edge blends), and Generative Entropic Noise (GEN), a second model producing raw associative inner voice at high temperature between turns.
Ebb scales how much body + GEN appears in each turn’s prompt from a salience score (quiet / normal / high), while state keeps updating in the background; body.md retains full detail.
The entity reads its own body. It cannot control it. The body is a signal, not a command.
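To make "drive bars with decay and momentum" and the Ebb salience tiers more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Drive` class, the update rule, the tier thresholds (0.3 and 0.7), and the per-tier line budgets are hypothetical and not taken from Bumblebee's code. In particular, which tier gets more body detail is left to the caller here.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    level: float = 0.5     # current bar value, clamped to [0, 1]
    velocity: float = 0.0  # momentum carried from earlier ticks
    decay: float = 0.1     # fraction of the level lost each tick (assumed)
    momentum: float = 0.5  # fraction of velocity kept each tick (assumed)

    def tick(self, impulse: float = 0.0) -> float:
        """Advance one step: decay the level, then apply the carried velocity."""
        self.velocity = self.momentum * self.velocity + impulse
        self.level = self.level * (1.0 - self.decay) + self.velocity
        self.level = min(1.0, max(0.0, self.level))
        return self.level


def ebb_tier(salience: float) -> str:
    """Map a salience score to a tier name; thresholds are assumptions."""
    if salience < 0.3:
        return "quiet"
    if salience < 0.7:
        return "normal"
    return "high"


def body_excerpt(lines: list[str], tier: str, budgets: dict[str, int]) -> list[str]:
    """Keep only as many body lines as the caller's budget for this tier allows."""
    return lines[: budgets[tier]]
```

Note that `tick` keeps running regardless of what `body_excerpt` returns, mirroring the idea that state updates in the background while only a scaled portion reaches the prompt.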
Identity
A layered personality engine composes a first-person system prompt from core traits, behavioral patterns, voice configuration, and backstory. Trait evolution applies small adjustments over many interactions so character drifts naturally through experience.
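The two ideas in this paragraph, layered prompt composition and slow trait drift, can be sketched in a few lines of Python. This is an illustrative sketch only: `compose_system_prompt`, `evolve_trait`, the layer wording, and the default `rate` of 0.01 are hypothetical choices, not Bumblebee's actual engine.

```python
def compose_system_prompt(
    name: str,
    traits: dict[str, float],
    patterns: list[str],
    voice: str,
    backstory: str,
) -> str:
    """Stack the layers into one first-person prompt, strongest traits first."""
    strongest = sorted(traits, key=traits.get, reverse=True)
    layers = [
        f"I am {name}.",
        "My strongest traits, in order: " + ", ".join(strongest) + ".",
        "How I tend to act: " + "; ".join(patterns) + ".",
        f"How I speak: {voice}",
        f"Where I come from: {backstory}",
    ]
    return "\n\n".join(layers)


def evolve_trait(value: float, signal: float, rate: float = 0.01) -> float:
    """Move a trait a small step toward an observed signal, clamped to [0, 1]."""
    nudged = value + rate * (signal - value)
    return min(1.0, max(0.0, nudged))
```

With a small `rate`, a single interaction barely moves a trait, while a consistent signal over many interactions shifts it noticeably, which is one simple way to get gradual drift.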
Memory
Episodic narratives, per-person relationship models, world beliefs, emotional imprints, and self-narrative synthesis. Memory reads like biography, not chat logs. SQLite locally, Postgres for hybrid deployments.
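As a hedged illustration of episodic narratives stored in SQLite and recalled per person, here is a small sketch using Python's standard `sqlite3` module. The `episodes` table, its columns, and the `open_memory`, `remember`, and `recall` helpers are hypothetical; Bumblebee's real schema is not shown in this page.

```python
import sqlite3


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store with one hypothetical table of episodes."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS episodes (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            person TEXT NOT NULL,
            narrative TEXT NOT NULL
        )
        """
    )
    return conn


def remember(conn: sqlite3.Connection, person: str, narrative: str) -> int:
    """Store one narrative about a person and return its row id."""
    cur = conn.execute(
        "INSERT INTO episodes (person, narrative) VALUES (?, ?)",
        (person, narrative),
    )
    conn.commit()
    return cur.lastrowid


def recall(conn: sqlite3.Connection, person: str, limit: int = 5) -> list[str]:
    """Return the most recent narratives about a person, newest first."""
    rows = conn.execute(
        "SELECT narrative FROM episodes WHERE person = ? ORDER BY id DESC LIMIT ?",
        (person, limit),
    ).fetchall()
    return [row[0] for row in rows]
```

Storing short narratives rather than raw message transcripts is what lets recall read "like biography, not chat logs" in a sketch like this.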
Presence
An always-on daemon drives body state, memory consolidation, proactive initiative, and scheduled automations across CLI, Telegram, and Discord simultaneously. The same entity persists everywhere you wire it.
Inference
Local by default
Purpose-built for the Gemma 4 family running through Ollama. The default stack uses gemma4:26b for both reflex and deliberate reasoning and nomic-embed-text for vector memory. No external API calls are made unless you explicitly configure them, so your baseline is not a metered hosted chat API.
Hybrid option
Keep inference on your home GPU behind a gateway and Cloudflare Tunnel. An always-on worker runs on Railway with Postgres: persistent, reachable, and with inference kept off third-party model providers.
Optional hosted evaluation (testing)
The project stays local-first and Apache 2.0; there is no “cloud edition.” If you want to stress-test the harness as a product (same cognition, memory, tools, and platforms) against frontier hosted models, you can opt in to OpenRouter or Venice AI (OpenAI-compatible APIs, Bearer keys). That path is documented in Hosted inference (testing); it is an evaluation lever, not a divergence from open-source ethos.
Tools and extensibility
60+ native tools
Web search, shell, filesystem, code execution, voice synthesis, browser automation, messaging, reminders, automations, and more. Toggle categories in config; optional extras install via pip.
MCP
Attach external tools via Model Context Protocol. Declare stdio servers in entity YAML; tools register dynamically at startup alongside native ones. This covers Zapier, GitHub, and anything else that speaks MCP.
See the complete tool reference for every built-in tool.
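One way to picture external tools registering "alongside native ones" is a single lookup table built at startup. The Python sketch below is an assumption-heavy illustration: `build_registry`, the `ToolFn` type, and the `server.tool` naming scheme are hypothetical, and it does not speak the MCP protocol itself. It only shows the merge step once each server's tools are known.

```python
from typing import Callable

# Hypothetical type: a tool takes a string argument and returns a string.
ToolFn = Callable[[str], str]


def build_registry(
    native: dict[str, ToolFn],
    external: dict[str, dict[str, ToolFn]],
) -> dict[str, ToolFn]:
    """Merge native tools with tools discovered from external servers.

    External tools are stored as "<server>.<tool>" (an assumed convention)
    so they cannot silently shadow a native tool of the same name.
    """
    registry = dict(native)
    for server, tools in external.items():
        for tool_name, fn in tools.items():
            key = f"{server}.{tool_name}"
            if key in registry:
                raise ValueError(f"duplicate tool name: {key}")
            registry[key] = fn
    return registry


demo_registry = build_registry(
    native={"shell": lambda arg: f"ran {arg}"},
    external={"github": {"search": lambda arg: f"searched {arg}"}},
)
```

Namespacing by server is only one possible design; a flat namespace with conflict errors would work too, at the cost of stricter naming across servers.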
Get started
Quickstart
Install to first conversation in five minutes.
Setup & onboarding
Guided bumblebee setup, hybrid stack, tunnel, and Railway.
Create an entity
Define personality, voice, and drives in YAML.
Telegram
Connect to a bot.
Hybrid deploy
Home brain, cloud worker.
CLI reference
Every command and flag.