RabbitMQ connector FAQs

Beta

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.

This page contains answers to frequently asked questions about the managed RabbitMQ ingestion connector in Databricks Lakeflow Connect.

General managed connector FAQs

See Managed connector FAQs for FAQs that apply to all Lakeflow Connect managed connectors. The following are specific to the RabbitMQ connector.

Connector-specific FAQs

Why am I getting a connection timeout when connecting to RabbitMQ?

Common causes include:

  • Wrong host or port: The RabbitMQ broker hostname or port in the Unity Catalog connection is incorrect. The default RabbitMQ port is 5672 for AMQP and 5671 for AMQP over TLS.
  • Blocked network path: Traffic from the serverless compute plane to your RabbitMQ broker is blocked. Verify your VPC peering, security group, or firewall rules.
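
Both causes above can be narrowed down with a plain TCP reachability check. The following is a minimal sketch, not part of the connector: the function name can_reach_broker is hypothetical, and the result only reflects the network path from the machine where you run it, which may differ from the path the serverless compute plane uses.

```python
import socket


def can_reach_broker(host: str, port: int = 5672, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s.

    This checks network reachability only. It does not perform an AMQP
    handshake or validate credentials.
    """
    try:
        # create_connection resolves the hostname and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A False result for a hostname and port you believe are correct points to DNS, firewall, or routing rather than to RabbitMQ credentials.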

Can I read from multiple RabbitMQ queues in a single pipeline?

Yes. Define one table entry per queue in ingestion_definition.objects. Each queue is mapped 1:1 to a destination streaming table. A single pipeline uses one Unity Catalog connection (one RabbitMQ broker). To ingest from multiple brokers, create separate pipelines.
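
As an illustration of the one-entry-per-queue layout, the sketch below builds an ingestion_definition with one object per queue. Only ingestion_definition.objects and the 1:1 queue-to-table mapping come from this page. The field names inside each entry (source_table, destination_catalog, destination_schema, destination_table), the connection_name key, and all example values are assumptions modeled on other Lakeflow Connect connectors, so check them against the pipeline reference for the RabbitMQ connector before use.

```python
import json


def build_objects(queue_names, catalog, schema):
    """Build one table entry per queue (field names are assumptions)."""
    objects = []
    for queue in queue_names:
        objects.append(
            {
                "table": {
                    # Assumed field for the RabbitMQ queue name.
                    "source_table": queue,
                    "destination_catalog": catalog,
                    "destination_schema": schema,
                    # Replace characters that are awkward in table names.
                    "destination_table": queue.replace(".", "_").replace("-", "_"),
                }
            }
        )
    return objects


# Hypothetical connection, catalog, schema, and queue names.
ingestion_definition = {
    "connection_name": "my_rabbitmq_connection",
    "objects": build_objects(["orders", "payments.events"], "main", "rabbitmq_raw"),
}

print(json.dumps(ingestion_definition, indent=2))
```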

How can I monitor how far my stream lags behind the queue?

See View Streaming Metrics. Key signals for RabbitMQ:

  • Input rate and processing rate: How fast the connector consumes from the queue.
  • Queue depth on the broker side: Monitor this through your RabbitMQ management UI. Because RabbitMQ classic queues have no replayable offset, queue depth is the authoritative lag indicator.
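
If you prefer to read queue depth programmatically, the RabbitMQ management plugin also exposes it over HTTP. The sketch below assumes the management plugin is enabled and reachable (its default port is 15672) and that the user has permission to view the queue. The function names and example values are hypothetical. The messages, messages_ready, and messages_unacknowledged fields are part of the management API's queue object.

```python
import base64
import json
import urllib.parse
import urllib.request


def queue_url(base_url: str, vhost: str, queue: str) -> str:
    """Build the management API URL for one queue, URL-encoding vhost and name."""
    return "{}/api/queues/{}/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(vhost, safe=""),  # the default vhost "/" becomes "%2F"
        urllib.parse.quote(queue, safe=""),
    )


def summarize_depth(payload: dict) -> dict:
    """Extract the depth counters from a management API queue object."""
    return {
        "ready": payload.get("messages_ready", 0),
        "unacked": payload.get("messages_unacknowledged", 0),
        "total": payload.get("messages", 0),
    }


def fetch_queue_depth(base_url, vhost, queue, user, password, timeout_s=10.0):
    """Query the management API with HTTP basic auth and return depth counters."""
    request = urllib.request.Request(queue_url(base_url, vhost, queue))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return summarize_depth(json.load(response))
```

A total that keeps growing while the pipeline is running suggests that the connector is consuming more slowly than producers publish.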

Why is my stream returning no records even though I think data exists?

Common causes include:

  • Wrong queue name: Verify you're consuming from the correct queue.
  • Competing external consumers: RabbitMQ delivers each message to exactly one consumer. If another (non-Spark) consumer is reading from the same queue, the pipeline might receive fewer messages than expected.
  • Authentication or authorization issues: The connection might have succeeded but the user might lack permission to consume from the queue. Check your RabbitMQ permissions (read permission on the queue).
  • Messages already consumed and acknowledged: Unlike Kafka, RabbitMQ classic queues do not retain messages after acknowledgement. If another consumer acknowledged the data before the pipeline started, it cannot be replayed.

What happens to messages if the pipeline crashes before acknowledging them?

RabbitMQ provides at-least-once delivery when consumers acknowledge messages explicitly. If the pipeline crashes or a consumer disconnects before sending an acknowledgement (ACK), the broker requeues the unacknowledged messages and redelivers them on reconnect. The _redelivered metadata column is true for these messages.
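
The redelivery behavior can be pictured with a toy in-memory model. This is a simplified sketch for intuition only. It is not RabbitMQ's implementation or the connector's code, and the ToyQueue class and its methods are hypothetical.

```python
from collections import deque


class ToyQueue:
    """Toy model of a queue with explicit acknowledgements."""

    def __init__(self, bodies):
        self._ready = deque({"body": b, "redelivered": False} for b in bodies)
        self._unacked = {}  # delivery tag -> message awaiting ACK
        self._next_tag = 1

    def deliver(self):
        """Hand the next ready message to the consumer and track it as unacked."""
        if not self._ready:
            return None
        message = self._ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = message
        return tag, dict(message)

    def ack(self, tag):
        """Acknowledge a delivery. The message is gone for good."""
        del self._unacked[tag]

    def consumer_disconnected(self):
        """Requeue unacked messages at the head, flagged as redelivered."""
        for tag in sorted(self._unacked, reverse=True):
            message = self._unacked.pop(tag)
            message["redelivered"] = True
            self._ready.appendleft(message)


queue = ToyQueue(["a", "b", "c"])
tag_a, _ = queue.deliver()
queue.ack(tag_a)                 # "a" is processed and acknowledged
queue.deliver()                  # "b" is delivered, then the consumer crashes
queue.consumer_disconnected()
_, message = queue.deliver()
print(message)                   # {'body': 'b', 'redelivered': True}
```

In this model, the message that was delivered but never acknowledged comes back with the redelivered flag set, which is what the _redelivered column reports.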

Can the connector guarantee exactly-once delivery?

No. The managed RabbitMQ connector provides at-least-once semantics. RabbitMQ classic queues do not provide a broker-assigned unique message ID, so exact deduplication at the connector layer is not possible. If you need exactly-once processing, deduplicate downstream using an application-specific key (for example, a message_id set by your producer, or a business key inside the message body).
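
Downstream deduplication on a producer-set key can be sketched as follows. This is a minimal in-memory illustration that assumes every record carries a message_id field set by your producer. The function name and the sample records are hypothetical. On a Spark DataFrame, the equivalent idea is dropDuplicates on the same key.

```python
def deduplicate(records, key="message_id"):
    """Keep the first record seen for each key value, preserving order."""
    seen = set()
    unique = []
    for record in records:
        record_key = record[key]
        if record_key in seen:
            # A redelivered copy of a message that was already processed.
            continue
        seen.add(record_key)
        unique.append(record)
    return unique


# Hypothetical ingested rows: message m-2 arrived twice after a redelivery.
rows = [
    {"message_id": "m-1", "body": "order created", "_redelivered": False},
    {"message_id": "m-2", "body": "order paid", "_redelivered": False},
    {"message_id": "m-2", "body": "order paid", "_redelivered": True},
]

deduplicated = deduplicate(rows)
print([row["message_id"] for row in deduplicated])  # ['m-1', 'm-2']
```

Keeping the first occurrence is a design choice. Because a redelivered copy has the same body as the original, either copy can be kept as long as the key is unique per logical message.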

Does the connector support RabbitMQ Streams?

No. The managed connector supports RabbitMQ classic queues only. RabbitMQ Streams use a different, offset-based model.