OpenTelemetry table reference for Zerobus Ingest

This feature is in Beta.

This page provides reference information for the OpenTelemetry (OTLP) table schemas and data mapping used by Zerobus Ingest OTLP.

Table schema

When OTLP data arrives, Zerobus Ingest converts each record from the nested OTLP resource/scope/record hierarchy into a flat, denormalized row. Resource attributes and instrumentation scope information are embedded directly in each row, making the data immediately queryable without joins.

All attribute fields (attributes, resource.attributes, instrumentation_scope.attributes, body for logs, metadata for metrics) are stored as VARIANT columns. VARIANT is a semi-structured type in Delta Lake that stores JSON data while preserving the original types.

Each record is augmented with Databricks-specific fields:

| Field | Description | Source |
| --- | --- | --- |
| record_id | A system-generated ID for unique identification and time-ordered sorting. | Generated based on time |
| time | Timestamp in microseconds from the Unix epoch. | Timestamp (in microseconds) derived from start_time_unix_nano (spans) or time_unix_nano (logs, metrics) |
| date | Date partition column, for efficient time-range filtering. | Derived from time |
| service_name | Top-level column for efficient filtering by service name, as defined in the OTel semantic convention. | Extracted from resource.attributes["service.name"] |
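
To make the relationship between these fields and the incoming OTLP data concrete, here is a minimal Python sketch. It is an illustration only, not the Zerobus Ingest implementation: the helper name derive_fields is hypothetical, truncating (rather than rounding) nanoseconds to microseconds is an assumption, and so is deriving date in UTC. record_id is omitted because its generation scheme is not described here.

```python
from datetime import datetime, timezone


def derive_fields(time_unix_nano: int, resource_attributes: dict) -> dict:
    """Hypothetical helper showing how the augmented columns relate to OTLP input."""
    # Nanoseconds to microseconds. Truncation is an assumption, not documented behavior.
    time_us = time_unix_nano // 1_000
    # Date derived from the timestamp. Using UTC here is an assumption.
    date = datetime.fromtimestamp(time_us // 1_000_000, tz=timezone.utc).date()
    return {
        "time": time_us,
        "date": date.isoformat(),
        # Promoted to a top-level column from the resource attributes.
        "service_name": resource_attributes.get("service.name"),
    }


row = derive_fields(1_700_000_000_123_456_789, {"service.name": "checkout"})
print(row)
```

For spans, the input would be start_time_unix_nano; for logs and metrics, time_unix_nano.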

Schema mapping

Zerobus Ingest maps OTLP data to Delta table columns as described below.

Denormalization

In OTLP, telemetry data is nested as follows:

ResourceSpans (or ResourceLogs, ResourceMetrics)
└── Resource (attributes, schema_url)
    └── ScopeSpans (or ScopeLogs, ScopeMetrics)
        └── InstrumentationScope (name, version, attributes)
            └── Span (or LogRecord, Metric)

Zerobus Ingest flattens this hierarchy so that each row contains the full context:

  • resource: A struct containing the resource attributes (as VARIANT) and dropped_attributes_count.
  • resource_schema_url: The schema URL from the enclosing ResourceSpans, ResourceLogs, or ResourceMetrics.
  • instrumentation_scope: A struct containing the scope name, version, attributes (as VARIANT), and dropped_attributes_count.
  • span_schema_url / log_schema_url / metric_schema_url: The schema URL from the enclosing ScopeSpans, ScopeLogs, or ScopeMetrics.
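
The flattening described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the actual ingestion code: the function name flatten_resource_spans is hypothetical, the input is a plain dictionary with attributes already reduced to key/value maps (real OTLP payloads use lists of KeyValue messages), and the span's name field stands in for the full set of span columns.

```python
def flatten_resource_spans(resource_spans: dict) -> list[dict]:
    """Hypothetical sketch: emit one flat row per span with its enclosing context."""
    resource = resource_spans.get("resource", {})
    rows = []
    for scope_spans in resource_spans.get("scope_spans", []):
        scope = scope_spans.get("scope", {})
        for span in scope_spans.get("spans", []):
            rows.append({
                # Stand-in for the span's own columns.
                "name": span.get("name"),
                # Resource context is copied into every row.
                "resource": {
                    "attributes": resource.get("attributes", {}),
                    "dropped_attributes_count": resource.get("dropped_attributes_count", 0),
                },
                "resource_schema_url": resource_spans.get("schema_url", ""),
                # Scope context is copied into every row as well.
                "instrumentation_scope": {
                    "name": scope.get("name"),
                    "version": scope.get("version"),
                    "attributes": scope.get("attributes", {}),
                    "dropped_attributes_count": scope.get("dropped_attributes_count", 0),
                },
                "span_schema_url": scope_spans.get("schema_url", ""),
            })
    return rows


# Example payload with illustrative values.
payload = {
    "resource": {"attributes": {"service.name": "checkout"}},
    "schema_url": "https://opentelemetry.io/schemas/1.21.0",
    "scope_spans": [{
        "scope": {"name": "my.lib", "version": "1.0.0"},
        "spans": [{"name": "GET /cart"}, {"name": "POST /pay"}],
    }],
}
rows = flatten_resource_spans(payload)
print(len(rows))
```

Two spans under the same resource and scope produce two rows, each carrying its own copy of the resource and scope context, which is why no joins are needed at query time.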

ID encoding

trace_id, span_id, and parent_span_id are stored as lowercase hex-encoded strings:

  • trace_id: 32-character hex string (16 bytes)
  • span_id and parent_span_id: 16-character hex string (8 bytes) each
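
As a quick illustration of this encoding, the following Python sketch converts raw ID bytes to lowercase hex strings. The helper name encode_id and the sample ID values are made up for the example; it shows the encoding itself, not how Zerobus Ingest implements it.

```python
def encode_id(raw: bytes) -> str:
    """Hypothetical helper: lowercase hex encoding of a raw OTLP ID."""
    # bytes.hex() always produces lowercase hex digits, two per byte.
    return raw.hex()


# Sample IDs (illustrative values): 16 bytes for a trace ID, 8 bytes for a span ID.
trace_id = encode_id(bytes.fromhex("5B8EFFF798038103D269B633813FC60C"))
span_id = encode_id(bytes.fromhex("EEE19B7EC3C1B174"))

print(trace_id, len(trace_id))
print(span_id, len(span_id))
```

When filtering on these columns, compare against lowercase strings; an uppercase literal would not match.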

Enum encoding

Enum values (kind, status.code, aggregation_temporality, severity_number) are stored as their string names as defined in the OTLP specification. For example: SPAN_KIND_SERVER, STATUS_CODE_OK, AGGREGATION_TEMPORALITY_DELTA.
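
The mapping from OTLP's numeric enum values to the stored string names can be sketched as below. The enum classes are local copies of the SpanKind and StatusCode values from the OTLP protobuf definitions, written out for the example; the helper enum_name is hypothetical, and how Zerobus Ingest handles values outside the defined range is not covered here.

```python
from enum import IntEnum


class SpanKind(IntEnum):
    """Local copy of the OTLP SpanKind values, for illustration."""
    SPAN_KIND_UNSPECIFIED = 0
    SPAN_KIND_INTERNAL = 1
    SPAN_KIND_SERVER = 2
    SPAN_KIND_CLIENT = 3
    SPAN_KIND_PRODUCER = 4
    SPAN_KIND_CONSUMER = 5


class StatusCode(IntEnum):
    """Local copy of the OTLP StatusCode values, for illustration."""
    STATUS_CODE_UNSET = 0
    STATUS_CODE_OK = 1
    STATUS_CODE_ERROR = 2


def enum_name(enum_cls, value: int) -> str:
    """Hypothetical helper: numeric wire value to the string name that is stored."""
    return enum_cls(value).name


kind = enum_name(SpanKind, 2)
status = enum_name(StatusCode, 1)
print(kind, status)
```

Because the columns hold names rather than numbers, filters compare against strings such as SPAN_KIND_SERVER instead of the integer 2.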

Next steps