HubSpot connector limitations

Beta

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.

Learn about known limitations when using the managed HubSpot connector in Lakeflow Connect.

General limitations

  • When you run a scheduled pipeline, alerts don't trigger immediately. Instead, they trigger when the next update runs.
  • When a source table is deleted, the destination table is not automatically deleted; you must delete the destination table manually. This differs from Lakeflow Spark Declarative Pipelines behavior.
  • During source maintenance periods, Databricks might not be able to access your data.
  • If a source table name conflicts with an existing destination table name, the pipeline update fails.
  • Multi-destination pipeline support is API-only.
  • You can optionally rename a table that you ingest. If you rename a table in your pipeline, it becomes an API-only pipeline, and you can no longer edit the pipeline in the UI.
  • Column-level selection and deselection are API-only.
  • If you select a column after a pipeline has already started, the connector does not automatically backfill data for the new column. To ingest historical data, manually run a full refresh on the table (see the sketch after this list).
  • Databricks can't ingest two or more tables with the same name in the same pipeline, even if they come from different source schemas.
  • The connector assumes that cursor columns in the source system are monotonically increasing.
  • With SCD type 1 enabled, deletes don't produce an explicit delete event in the change data feed. For auditable deletions, use SCD type 2 if the connector supports it. For details, see Example: SCD type 1 and SCD type 2 processing with CDF source data.
  • The connector ingests raw data without transformations. Use Lakeflow Spark Declarative Pipelines downstream for transformations.
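
If you need to backfill a table, for example after selecting a new column, you can trigger a table-scoped full refresh through the Pipelines API. The following is a minimal sketch using the Databricks SDK for Python; the pipeline ID is a placeholder, and marketing_campaign_asset stands in for whichever table needs the backfill.

```python
# Minimal sketch, assuming the databricks-sdk package is installed and
# authentication is configured via environment variables or a config profile.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Fully refresh only the named table; other tables in the pipeline
# keep their incremental state.
w.pipelines.start_update(
    pipeline_id="<pipeline-id>",
    full_refresh_selection=["marketing_campaign_asset"],
)
```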

Supported interfaces

You cannot create HubSpot ingestion pipelines in the Databricks UI. Create and manage them with the Databricks APIs instead, as in the sketch below.
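
The following sketch calls the pipeline creation endpoint directly. The connection name, catalog, schema, and column names are placeholders, and the payload shape follows the managed connector pipeline spec used by other Lakeflow Connect connectors, so verify field names against the current API reference before relying on them.

```python
# Minimal sketch, assuming DATABRICKS_HOST and DATABRICKS_TOKEN are set.
import os
import requests

host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]

payload = {
    "name": "hubspot_ingest",
    "ingestion_definition": {
        "connection_name": "hubspot_connection",  # existing UC connection
        "objects": [
            {
                "table": {
                    "source_schema": "hubspot",
                    "source_table": "contacts",
                    "destination_catalog": "main",
                    "destination_schema": "lakeflow",
                    # Renaming the destination table makes this an
                    # API-only pipeline; it can no longer be edited in the UI.
                    "destination_table": "hubspot_contacts",
                    "table_configuration": {
                        # SCD type 2 keeps history, including deletions.
                        "scd_type": "SCD_TYPE_2",
                        # Column-level selection is also API-only.
                        "include_columns": ["id", "email", "lifecyclestage"],
                    },
                }
            }
        ],
    },
}

resp = requests.post(
    f"{host}/api/2.0/pipelines",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json()["pipeline_id"])
```

The table_configuration block is also where the API-only options mentioned in the limitations above (renamed destination table, column selection, and SCD type) are set.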

API rate limits

HubSpot enforces API rate limits, including a limit on the number of requests allowed per 10-second window. For performance recommendations, see Pipeline runs slowly.

Long sync times for complex tables

Some tables can take a long time to ingest because of high API call requirements. For example, the marketing_campaign_asset table requires 24 separate API calls per campaign, so 100 campaigns require roughly 2,400 API calls for this single table, compared to roughly 5 API calls for a simpler table. Expect long sync times for these tables: a table with 5,000 campaigns might take roughly four to six hours. For the full list of applicable tables, see Tables that support batch updates only. For performance recommendations, see Pipeline runs slowly.
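
A back-of-envelope lower bound on sync time follows from these call counts. In the sketch below, the 24-calls-per-campaign figure comes from this page, while the requests-per-10-seconds budget is an illustrative assumption; substitute the limit for your HubSpot plan.

```python
calls_per_campaign = 24
campaigns = 5_000
requests_per_10s = 100  # hypothetical rate-limit budget

total_calls = calls_per_campaign * campaigns       # 120,000 calls
min_seconds = total_calls / requests_per_10s * 10  # 12,000 seconds
print(f"{total_calls:,} calls, at least {min_seconds / 3600:.1f} hours")
```

This lower bound (about 3.3 hours under the assumed budget) ignores retries, pagination overhead, and other tables sharing the same rate-limit budget, which is consistent with the four-to-six-hour estimate above.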

Nested fields represented as strings

Some fields in the HubSpot schema are nested within complex structures, and the inner-level fields can include custom attributes. To ensure compatibility and consistency, such fields are represented as strings.

For example, the forms table has a displayOptions field. Each form can have a different display configuration, so the field is stored as a string to accommodate the varying structures.
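
If you need the nested structure downstream, you can parse the string with a declared schema. The following is a minimal PySpark sketch; the destination table name and the displayOptions schema are hypothetical, so inspect a few rows (or use schema_of_json) to derive the real structure.

```python
# Minimal sketch, assuming a Databricks notebook where `spark` is defined.
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, BooleanType

# Hypothetical schema -- derive the real one from your data.
display_options_schema = StructType([
    StructField("theme", StringType()),
    StructField("renderRawHtml", BooleanType()),
])

forms = spark.table("main.lakeflow.forms")  # placeholder destination table
parsed = forms.withColumn(
    "display_options",
    F.from_json(F.col("displayOptions"), display_options_schema),
)
parsed.select("display_options.theme").show()
```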

Limited incremental support

Some tables don't support incremental updates because the HubSpot API doesn't provide a way to filter records based on a cursor. These tables are refreshed on each pipeline update.

For a list of supported tables and their update patterns, see HubSpot connector reference.