Broker Configuration Reference

Complete reference for danube_broker.yml configuration options. This guide explains each parameter, its impact, and when to adjust it.

Basic Configuration

Cluster Identity

cluster_name: "MY_CLUSTER"

cluster_name

Type: String
Default: None (required)
Impact: Identifies your Danube cluster in metrics, logs, and multi-cluster setups
When to change: Set a meaningful name for production environments (e.g., production-us-east, staging-cluster)

Broker Services

broker:
  host: "0.0.0.0"
  ports:
    client: 6650      # Producer/consumer connections
    admin: 50051      # Admin API (gRPC)
    raft: 7650        # Raft inter-node gRPC transport
    prometheus: 9040  # Metrics exporter
  # Optional: advertised addresses for proxy/k8s mode
  # advertised_listeners:
  #   broker_url: "broker-0.danube-broker-headless:6650"
  #   connect_url: "danube-proxy.example.com:6650"

broker.host

Type: String (IP address or hostname)
Default: 0.0.0.0 (all interfaces)
Impact: Controls which network interfaces the broker binds to
When to change:
Use 0.0.0.0 for containers or multi-interface servers
Use specific IP for security/isolation
Use 127.0.0.1 for local-only testing

broker.ports.client

Type: Integer
Default: 6650
Impact: Port where producers and consumers connect
When to change: Resolve port conflicts or follow network policies

broker.ports.admin

Type: Integer
Default: 50051
Impact: gRPC port for admin operations (topic creation, subscriptions, etc.)
When to change: Avoid conflicts with other gRPC services

broker.ports.raft

Type: Integer
Default: 7650
Impact: Raft inter-node gRPC transport port for consensus and metadata replication
When to change: When running multiple brokers on the same host (use unique ports per broker)

broker.ports.prometheus

Type: Integer (optional)
Default: 9040
Impact: Prometheus metrics scraping endpoint
When to change: Set to null to disable metrics exporter

Advertised Listeners

broker.advertised_listeners.broker_url

Type: String (optional)
Impact: Address reachable inside the cluster (broker identity, inter-broker communication)
When to set: Kubernetes (headless service DNS) or proxy deployments where the bind address differs from the routable address
If omitted: Defaults to host:ports.client

broker.advertised_listeners.connect_url

Type: String (optional)
Impact: Address reachable by external clients via gRPC proxy
When to set: When clients connect through a load balancer or ingress
If omitted: Defaults to broker_url

Metadata Store (Embedded Raft)

Danube uses embedded Raft consensus for metadata replication — no external dependency (no ETCD, no ZooKeeper). The node_id is auto-generated on first boot and persisted in {data_dir}/node_id.

meta_store:
  data_dir: "./danube-data/raft"
  # seed_nodes:
  #   - "0.0.0.0:7650"
  #   - "0.0.0.0:7651"
  #   - "0.0.0.0:7652"

meta_store.data_dir

Type: String (path)
Default: ./danube-data/raft
Impact: Directory for Raft log, snapshots, and node identity
When to change: Point to durable storage; ensure each broker has its own directory

meta_store.seed_nodes

Type: Array of strings (optional)
Default: Empty / omitted (single-node cluster, auto-init)
Impact: List of Raft transport addresses (host:raft_port) for initial cluster formation
When to set:
Single-node: Omit entirely — the broker auto-initializes a single-node Raft cluster
Multi-node: List ALL initial peers (including this broker's own host:raft_port)
Example (3-node cluster):

seed_nodes:
  - "broker1:7650"
  - "broker2:7651"
  - "broker3:7652"

Bootstrap Configuration

bootstrap_namespaces:
  - "default"

auto_create_topics: true

# admin_tls: false

bootstrap_namespaces

Type: Array of strings
Default: ["default"]
Impact: Namespaces created automatically on broker startup
When to change: Pre-create namespaces for multi-tenant environments (e.g., ["default", "team-a", "team-b"])

auto_create_topics

Type: Boolean
Default: true
Impact: Whether producers can create topics automatically when publishing
When to change:
true: Development, rapid prototyping
false: Production with controlled topic creation via Admin API

admin_tls

Type: Boolean (optional)
Default: false
Impact: Enable TLS on the admin gRPC API. When true, reuses the same cert_file / key_file from auth.tls
When to enable: Remote cluster management over untrusted networks

Security Configuration

auth:
  mode: tls  # Options: none | tls | tlswithjwt
  tls:
    cert_file: "./cert/server-cert.pem"
    key_file: "./cert/server-key.pem"
    ca_file: "./cert/ca-cert.pem"
    verify_client: false
  jwt:
    secret_key: "your-secret-key"
    issuer: "danube-auth"
    expiration_time: 3600

Authentication Modes

Mode	Description	Use Case
`none`	No authentication	Local development, trusted networks
`tls`	Mutual TLS (mTLS)	Secure production, service-to-service
`tlswithjwt`	TLS + JWT tokens	Multi-tenant, user-level auth

auth.mode

Type: String enum
Default: none
Impact: Security layer for client connections
When to change: Always use tls or tlswithjwt in production

TLS Settings

auth.tls.cert_file

Type: File path
Impact: Server certificate (public key)
When to change: Use valid certificates from your PKI/CA

auth.tls.key_file

Type: File path
Impact: Server private key
When to change: Keep secure, rotate periodically

auth.tls.ca_file

Type: File path
Impact: Certificate authority for validating client certs
When to change: Match your organization's CA

auth.tls.verify_client

Type: Boolean
Default: false
Impact: Require client certificates (mutual TLS)
When to change:
true: Maximum security, requires client certs
false: Server-only TLS, simpler client setup

JWT Settings

auth.jwt.secret_key

Type: String
Impact: Shared secret for signing/validating JWT tokens
When to change: Use strong random key, rotate regularly, store in secrets manager

auth.jwt.issuer

Type: String
Default: danube-auth
Impact: JWT issuer claim for validation
When to change: Match your auth service's issuer

auth.jwt.expiration_time

Type: Integer (seconds)
Default: 3600 (1 hour)
Impact: Token validity duration
When to change:
Shorter: More secure, requires frequent renewal
Longer: Less overhead, wider security window

Load Manager Configuration

The Load Manager handles topic assignment and automated rebalancing. See Load Manager Architecture for concepts.

load_manager:
  assignment_strategy: "fair"
  load_report_interval_seconds: 30
  rebalancing:
    enabled: false
    aggressiveness: "balanced"
    check_interval_seconds: 300
    max_moves_per_hour: 10
    cooldown_seconds: 60
    min_brokers_for_rebalance: 2
    min_topic_age_seconds: 300
    blacklist_topics: []

Topic Assignment Strategy

Controls how NEW topics are assigned to brokers.

Strategy	Algorithm	CPU Overhead	Best For
`fair`	Topic count only	Lowest	Development, testing
`balanced`	Multi-factor (topic load + CPU + memory)	Medium	Production (recommended)
`weighted_load`	Adaptive bottleneck detection	Highest	Variable workloads

assignment_strategy

Type: String enum
Default: fair
Impact: How new topics are placed across brokers
When to change:
fair: Simple setups, predictable placement
balanced: General production (recommended)
weighted_load: Clusters with highly variable workloads

Formula for balanced strategy:

score = (weighted_topic_load × 0.3) + (CPU% × 0.35) + (Memory% × 0.35)

Load Report Interval

load_report_interval_seconds

Type: Integer (seconds)
Default: 30
Impact: How often brokers publish resource metrics to the metadata store
When to change:
5-10: Testing, rapid response to changes (higher metadata traffic)
30-60: Production, balanced overhead
>60: Low-traffic clusters, reduce metadata store load

Automated Rebalancing

Proactively moves topics between brokers to maintain cluster balance.

rebalancing.enabled

Type: Boolean
Default: false (disabled for safety)
Impact: Enables automatic topic movement
When to enable: After cluster is stable, you understand baseline behavior
Caution: Start with conservative aggressiveness

Aggressiveness Levels

Controls how aggressively the system optimizes cluster balance.

Level	CV Threshold	Check Interval	Max Moves/Hour	Cooldown	Use Case
`conservative`	>40%	600s (10m)	5	120s	Stable production, risk-averse
`balanced`	>30%	300s (5m)	10	60s	General production
`aggressive`	>20%	180s (3m)	20	30s	Dynamic workloads, testing

CV (Coefficient of Variation) measures cluster imbalance:

<20%: Excellent balance
20-30%: Good balance
30-40%: Moderate imbalance
>40%: Significant imbalance

rebalancing.aggressiveness

Type: String enum
Default: balanced
Impact: Sets default values for check interval, thresholds, and move limits
When to change:
conservative: Production, minimize disruption
balanced: Most production clusters
aggressive: Testing, rapid optimization

Rebalancing Knobs

rebalancing.check_interval_seconds

Type: Integer
Default: Varies by aggressiveness
Impact: How often to evaluate cluster balance
When to change: Override aggressiveness defaults for fine-tuning

rebalancing.max_moves_per_hour

Type: Integer
Default: Varies by aggressiveness
Impact: Rate limit for topic moves (prevents storms)
When to change: Increase for large rebalancing operations, decrease for stability

rebalancing.cooldown_seconds

Type: Integer
Default: Varies by aggressiveness
Impact: Minimum time between consecutive topic moves
When to change: Increase to slow down rebalancing, decrease for faster convergence

rebalancing.min_brokers_for_rebalance

Type: Integer
Default: 2
Impact: Rebalancing skipped if cluster has fewer brokers
When to change: Rarely (single-broker clusters don't need rebalancing)

rebalancing.min_topic_age_seconds

Type: Integer
Default: 300 (5 minutes)
Impact: Don't move topics younger than this
When to change: Increase to protect recently created/moved topics

Topic Blacklist

rebalancing.blacklist_topics

Type: Array of strings (patterns)
Default: [] (empty)
Impact: Topics matching these patterns will NEVER be rebalanced
Patterns:
Exact: /admin/critical-topic
Namespace wildcard: /system/*
When to use: Protect critical topics, pin topics to specific brokers

Example:

blacklist_topics:
  - "/system/*"              # All system topics
  - "/admin/critical-topic"  # Specific critical topic
  - "/production/high-priority/*"

WAL and Cloud Storage Configuration

Cloud-native persistence: local WAL (fast writes) + background cloud uploads (durability). See Persistence Architecture for details.

wal_cloud:
  wal: { ... }
  uploader: { ... }
  cloud: { ... }

Local Write-Ahead Log (WAL)

wal:
  dir: "./danube-data/wal"
  file_name: "wal.log"
  cache_capacity: 1024
  file_sync:
    interval_ms: 5000
    max_batch_bytes: 10485760
  rotation:
    max_bytes: 536870912
    # max_hours: 24
  retention:
    time_minutes: 2880
    size_mb: 20480
    check_interval_minutes: 5

wal.dir

Type: String (path) or null
Default: ./danube-data/wal
Impact: Root directory for WAL files (per-topic subdirectories created automatically)
When to change:
Point to fast SSD/NVMe for best performance
Set to null for in-memory mode (testing only, no durability)

wal.file_name

Type: String
Default: wal.log
Impact: Base filename (rotation creates wal.0.log, wal.1.log, etc.)
When to change: Rarely needed

wal.cache_capacity

Type: Integer (number of messages)
Default: 1024
Impact: In-memory replay cache for consumer reads
When to change:
Higher (4096+): Better consumer hit rates, more memory usage
Lower (512): Less memory, more disk reads for catch-up consumers

File Sync (Durability)

wal.file_sync.interval_ms

Type: Integer (milliseconds)
Default: 5000 (5 seconds)
Impact: How often buffered writes are fsynced to disk
Trade-offs:

Value	Checkpoint Freshness	Fsync Pressure	Data Loss Window
1000 (1s)	Very fresh	High	1 second
5000 (5s)	Fresh	Balanced	5 seconds
10000 (10s)	Stale	Low	10 seconds

When to change:

Low-latency: 1000 (1s)
Production: 5000 (5s) - recommended
High-throughput: 10000 (10s)

wal.file_sync.max_batch_bytes

Type: Integer (bytes)
Default: 10485760 (10 MiB)
Impact: Force flush when buffer reaches this size
When to change:
Increase for high-volume topics (better throughput)
Decrease for lower latency (more frequent fsyncs)

File Rotation

wal.rotation.max_bytes

Type: Integer (bytes)
Default: 536870912 (512 MiB)
Impact: Create new WAL file when current file exceeds this size
When to change:
512 MiB: Balanced (recommended)
1 GiB: Fewer files, larger cloud objects
256 MiB: More files, smaller cloud objects

wal.rotation.max_hours

Type: Integer (hours, optional)
Default: Disabled (commented out)
Impact: Rotate even if size threshold not reached
When to enable: Good for low-traffic topics to prevent infinitely old files
Recommended: 24 hours for production

WAL Retention (Local Cleanup)

wal.retention.time_minutes

Type: Integer (minutes)
Default: 2880 (48 hours)
Impact: Delete local WAL files older than this (after cloud upload)
When to change:
Production: 2880 (48h) for safety margin
Space-constrained: 1440 (24h) minimum
Must be larger than uploader interval

wal.retention.size_mb

Type: Integer (megabytes per topic)
Default: 20480 (20 GiB)
Impact: Delete oldest files when total exceeds this
When to change:
High-volume: 20480 (20 GiB) or higher
Space-constrained: 10240 (10 GiB) minimum

wal.retention.check_interval_minutes

Type: Integer (minutes)
Default: 5
Impact: How often retention policy runs
When to change: Rarely (5 minutes is responsive without overhead)

Cloud Uploader

uploader:
  enabled: true
  interval_seconds: 30
  root_prefix: "/danube-data"
  max_object_mb: 256

uploader.enabled

Type: Boolean
Default: true
Impact: Enable/disable background cloud uploads
When to disable: Local-only testing, ephemeral workloads
When to enable: Production (always)

uploader.interval_seconds

Type: Integer (seconds)
Default: 30
Impact: Upload cycle frequency

Value	RPO (Recovery Point)	Cloud Overhead	Use Case
30s	30 seconds	High	Testing, fast feedback
60s	1 minute	Medium-High	High-durability production
300s	5 minutes	Balanced	General production
600s	10 minutes	Low	Cost-sensitive

uploader.root_prefix

Type: String
Default: /danube-data
Impact: Metadata prefix for cloud object descriptors in the Raft metadata store
When to change: Only for multiple independent Danube clusters sharing a metadata namespace

uploader.max_object_mb

Type: Integer (megabytes, optional)
Default: 256
Impact: Maximum size per cloud object
When to change:
256 MB: Optimal for S3/GCS multipart uploads (recommended)
Larger: Fewer objects, higher per-object latency
Smaller: More objects, better parallelism

Cloud Storage Backend

cloud:
  backend: "fs"
  root: "./danube-data/cloud-storage"
  # Backend-specific options below

cloud.backend

Type: String enum
Options: fs | s3 | gcs | azblob | memory
Default: fs
Impact: Cloud storage provider

Backend	Use Case	Multi-Broker Support
`memory`	Testing only (no durability)	No
`fs`	Local development, shared NFS/EFS	Yes (with shared storage)
`s3`	Production (AWS, MinIO, etc.)	Yes
`gcs`	Production (Google Cloud)	Yes
`azblob`	Production (Azure)	Yes

cloud.root

Type: String (backend-specific format)
Impact: Storage location
Formats:
fs: ./path/to/directory
s3: s3://bucket-name/prefix
gcs: gcs://bucket-name/prefix
azblob: container-name/prefix
memory: namespace-prefix
Multi-broker requirement: Must be shared/accessible across all brokers

Backend-Specific Options

Amazon S3:

cloud:
  backend: "s3"
  root: "s3://my-bucket/danube-data"
  region: "us-east-1"
  endpoint: "https://s3.us-east-1.amazonaws.com"  # Optional
  access_key: "${AWS_ACCESS_KEY_ID}"              # Or use IAM roles
  secret_key: "${AWS_SECRET_ACCESS_KEY}"
  anonymous: false

Google Cloud Storage:

cloud:
  backend: "gcs"
  root: "gcs://my-bucket/danube-data"
  project: "my-gcp-project"
  credentials_path: "/path/to/service-account.json"
  # Or credentials_json: "{ ... }"  # Inline JSON string

Azure Blob Storage:

cloud:
  backend: "azblob"
  root: "my-container/danube-data"
  endpoint: "https://myaccount.blob.core.windows.net"
  account_name: "myaccount"
  account_key: "${AZURE_STORAGE_KEY}"

Local Filesystem:

cloud:
  backend: "fs"
  root: "./danube-data/cloud-storage"
  # For multi-broker: use shared storage like NFS
  # root: "/mnt/shared-nfs/danube-cloud"

Broker Policies

Default resource limits for topics. Can be overridden at namespace or topic level.

policies:
  max_producers_per_topic: 0
  max_subscriptions_per_topic: 0
  max_consumers_per_topic: 0
  max_consumers_per_subscription: 0
  max_publish_rate: 0
  max_subscription_dispatch_rate: 0
  max_message_size: 10485760  # 10 MB

Default value 0 means unlimited for all rate/count limits.

Connection Limits

max_producers_per_topic

Type: Integer
Default: 0 (unlimited)
Impact: Maximum concurrent producers per topic
When to set: Prevent resource exhaustion from producer storms

max_subscriptions_per_topic

Type: Integer
Default: 0 (unlimited)
Impact: Maximum subscriptions per topic
When to set: Control fan-out complexity

max_consumers_per_topic

Type: Integer
Default: 0 (unlimited)
Impact: Maximum concurrent consumers across all subscriptions
When to set: Limit total consumer connections

max_consumers_per_subscription

Type: Integer
Default: 0 (unlimited)
Impact: Maximum consumers sharing a single subscription
When to set: Control load-sharing fan-out

Rate Limits

max_publish_rate

Type: Integer (messages/second or bytes/second)
Default: 0 (unlimited)
Impact: Throttle producer publish rate
When to set: Prevent producer overwhelming broker

max_subscription_dispatch_rate

Type: Integer (messages/second)
Default: 0 (unlimited)
Impact: Throttle consumer dispatch rate
When to set: Protect slow consumers

Message Size

max_message_size

Type: Integer (bytes)
Default: 10485760 (10 MB)
Impact: Maximum single message size
When to change:
Increase: Large payloads (video, logs)
Decrease: Prevent memory issues from huge messages
Note: Very large messages impact performance