
Create a Jira ingestion pipeline

Beta

The Jira connector is in Beta.

This page describes how to create a Jira ingestion pipeline using Databricks Lakeflow Connect. You can ingest Jira data using a notebook, Databricks Asset Bundles, or the Databricks CLI.

Before you begin

To create an ingestion pipeline, you must meet the following requirements:

  • Your workspace must be enabled for Unity Catalog.

  • Serverless compute must be enabled for your workspace. See Serverless compute requirements.

  • If you plan to create a new connection: You must have CREATE CONNECTION privileges on the metastore.

    If the connector supports UI-based pipeline authoring, an admin can create the connection and the pipeline at the same time by completing the steps on this page. However, if the users who create pipelines use API-based pipeline authoring or are non-admin users, an admin must first create the connection in Catalog Explorer. See Connect to managed ingestion sources.

  • If you plan to use an existing connection: You must have USE CONNECTION privileges or ALL PRIVILEGES on the connection object.

  • You must have USE CATALOG privileges on the target catalog.

  • You must have USE SCHEMA and CREATE TABLE privileges on an existing schema or CREATE SCHEMA privileges on the target catalog. For one way to grant these privileges, see the example after this list.
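
The following is a minimal sketch of how an admin might grant these privileges from a notebook. All object names are placeholders: the principal, the connection name (my_jira_connection), and the target catalog and schema (main.ingest) are assumptions to replace with your own values.

Python
# All names below are placeholders. Replace the principal, connection,
# catalog, and schema with your own values before running.
principal = "`someone@example.com`"

# Needed only if the user will create new connections.
spark.sql(f"GRANT CREATE CONNECTION ON METASTORE TO {principal}")

# Needed to use an existing connection.
spark.sql(f"GRANT USE CONNECTION ON CONNECTION my_jira_connection TO {principal}")

# Needed on the ingestion target.
spark.sql(f"GRANT USE CATALOG ON CATALOG main TO {principal}")
spark.sql(f"GRANT USE SCHEMA, CREATE TABLE ON SCHEMA main.ingest TO {principal}")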

To configure Jira for ingestion, see Configure Jira for ingestion.

Create an ingestion pipeline

Permissions required: USE CONNECTION or ALL PRIVILEGES on a connection.

You can ingest Jira data using a notebook, Databricks Asset Bundles, or the Databricks CLI. Each table you specify is ingested into a streaming table or a snapshot table, depending on the source. See Jira connector reference for the full list of objects that are available to ingest.

  1. Copy and paste the following code into a notebook cell and run the code. Do not modify any of this code.

    Python
    # DO NOT MODIFY
    # This sets up the API utils for creating managed ingestion pipelines in Databricks.

    import requests
    import json

    notebook_context = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
    api_token = notebook_context.apiToken().get()
    workspace_url = notebook_context.apiUrl().get()
    api_url = f"{workspace_url}/api/2.0/pipelines"

    headers = {
        'Authorization': 'Bearer {}'.format(api_token),
        'Content-Type': 'application/json'
    }

    def check_response(response):
        if response.status_code == 200:
            print("Response from API:\n{}".format(json.dumps(response.json(), indent=2, sort_keys=False)))
        else:
            print(f"Failed to retrieve data: error_code={response.status_code}, error_message={response.json().get('message', response.text)}")

    def create_pipeline(pipeline_definition: str):
        response = requests.post(url=api_url, headers=headers, data=pipeline_definition)
        check_response(response)

    def edit_pipeline(id: str, pipeline_definition: str):
        response = requests.put(url=f"{api_url}/{id}", headers=headers, data=pipeline_definition)
        check_response(response)

    def delete_pipeline(id: str):
        response = requests.delete(url=f"{api_url}/{id}", headers=headers)
        check_response(response)

    def list_pipeline(filter: str):
        body = "" if len(filter) == 0 else f"""{{"filter": "{filter}"}}"""
        response = requests.get(url=api_url, headers=headers, data=body)
        check_response(response)

    def get_pipeline(id: str):
        response = requests.get(url=f"{api_url}/{id}", headers=headers)
        check_response(response)

    def start_pipeline(id: str, full_refresh: bool = False):
        body = f"""
        {{
            "full_refresh": {str(full_refresh).lower()},
            "validate_only": false,
            "cause": "API_CALL"
        }}
        """
        response = requests.post(url=f"{api_url}/{id}/updates", headers=headers, data=body)
        check_response(response)

    def stop_pipeline(id: str):
        print("cannot stop pipeline")
  2. Choose one of the following pipeline specification templates and modify it to fit your ingestion needs. Then, run the cell to create the ingestion pipeline. You can view the pipeline in the Jobs & Pipelines section of your workspace.

    You can optionally filter the data by Jira spaces or projects using the include_jira_spaces option, as the following examples show. Make sure to use exact project keys, not project names or IDs.

    (Recommended) The following is an example of ingesting a single source table. See Jira connector reference for a full list of source tables you can ingest.

    Python
    # Example of ingesting a single table
    pipeline_spec = """
    {
    "name": "<YOUR_PIPELINE_NAME>",
    "ingestion_definition": {
    "connection_name": "<YOUR_CONNECTION_NAME>",
    "objects": [
    {
    "table": {
    "source_schema": "default",
    "source_table": "issues",
    "destination_catalog": "<YOUR_CATALOG>",
    "destination_schema": "<YOUR_SCHEMA>",
    "destination_table": "jira_issues",
    "jira_options": {
    "include_jira_spaces": ["key1", "key2"]
    }
    },
    "scd_type": "SCD_TYPE_1"
    }
    ]
    },
    "channel": "PREVIEW"
    }
    """

    create_pipeline(pipeline_spec)

    (Recommended) The following is an example of ingesting multiple source tables. See Jira connector reference for a full list of source tables you can ingest.

    Python
    # Example of ingesting multiple tables
    pipeline_spec = """
    {
    "name": "<YOUR_PIPELINE_NAME>",
    "ingestion_definition": {
    "connection_name": "<YOUR_CONNECTION_NAME>",
    "objects": [
    {
    "table": {
    "source_schema": "default",
    "source_table": "issues",
    "destination_catalog": "<YOUR_CATALOG>",
    "destination_schema": "<YOUR_SCHEMA>",
    "destination_table": "jira_issues",
    "jira_options": {
    "include_jira_spaces": ["key1", "key2"]
    }
    }
    },
    {
    "table": {
    "source_schema": "default",
    "source_table": "projects",
    "destination_catalog": "<YOUR_CATALOG>",
    "destination_schema": "<YOUR_SCHEMA>",
    "destination_table": "jira_projects",
    "jira_options": {
    "include_jira_spaces": ["key1", "key2"]
    }
    }
    }
    ]
    },
    "channel": "PREVIEW"
    }
    """

    create_pipeline(pipeline_spec)

    The following is an example of ingesting all available Jira source tables in one pipeline. Ensure that your OAuth application includes all scopes required by the full table set and that the authenticating user has the necessary Jira permissions. The pipeline fails if any required scope or permission is missing.

    Python
    # Example of ingesting all source tables

    pipeline_spec = """
    {
    "name": "<YOUR_PIPELINE_NAME>",
    "ingestion_definition": {
    "connection_name": "<YOUR_CONNECTION_NAME>",
    "objects": [
    {
    "schema": {
    "source_schema": "default",
    "destination_catalog": "<YOUR_CATALOG>",
    "destination_schema": "<YOUR_SCHEMA>",
    "jira_options": {
    "include_jira_spaces": ["key1", "key2"]
    }
    },
    "scd_type": "SCD_TYPE_1"
    }
    ]
    },
    "channel": "PREVIEW"
    }
    """

    create_pipeline(pipeline_spec)
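
    After create_pipeline succeeds, the API response printed by check_response includes a pipeline_id. You can pass that ID to the helpers from step 1 to trigger and monitor a run. The following is a minimal sketch; the ID is a placeholder that you replace with the value from the create response:

    Python
    # Placeholder: copy the pipeline_id from the create_pipeline response.
    pipeline_id = "<YOUR_PIPELINE_ID>"

    # Trigger an update. Pass full_refresh=True to reprocess all data.
    start_pipeline(pipeline_id)

    # Retrieve the pipeline definition and latest state.
    get_pipeline(pipeline_id)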

Additional resources