TIBCO Spotfire Analyst

This article describes how to use TIBCO Spotfire Analyst with Databricks.

Step 1: Get Databricks connection information

  1. Get a personal access token.
  2. Get the server hostname, port, and HTTP path.
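
These values come from your workspace (the token from your user settings, and the hostname, port, and HTTP path from the cluster's JDBC/ODBC connection details). For reference, a minimal sketch of the typical shape of each value; all of these are placeholders to substitute with your own:

```python
# Typical shape of the Databricks connection values (placeholders only;
# replace each with the values from your own workspace and cluster).
SERVER_HOSTNAME = "adb-1234567890123456.7.azuredatabricks.net"  # no https:// prefix
PORT = 443  # clusters accept connections on the HTTPS port
HTTP_PATH = "sql/protocolv1/o/1234567890123456/0123-456789-abcdefg0"  # cluster HTTP path
ACCESS_TOKEN = "dapi..."  # personal access token; treat it like a password
```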

Step 2: Configure Databricks cluster connection in TIBCO Spotfire

To connect TIBCO Spotfire Analyst to your Databricks cluster, you use the native Apache Spark SQL connector of TIBCO Spotfire.

  1. In TIBCO Spotfire Analyst, open the Files and data flyout and click Connect to.

  2. Select Apache Spark SQL and click New connection.

    1. Enter the server hostname and port of your cluster, separated by a colon.
    2. Select the authentication method Username and Password.
    3. Enter token as the Username and your Databricks personal access token as the Password.
    4. Select the correct SSL options for your cluster.
    5. On the Advanced tab, select the Thrift transport mode HTTP, and enter your cluster’s HTTP path.
    6. Click Connect.
    7. Use the drop-down menu to select the Database, and then click OK.
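
If you want to sanity-check the same hostname, HTTP path, and token-based login outside Spotfire first, the following minimal sketch does so, assuming the separate databricks-sql-connector Python package (not part of Spotfire; all values are placeholders):

```python
# Minimal connectivity check with the databricks-sql-connector package
# (pip install databricks-sql-connector). All values are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="sql/protocolv1/o/1234567890123456/0123-456789-abcdefg0",
    access_token="dapi...",  # the same token you enter as the Password in Spotfire
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")  # trivial query; succeeds if the cluster is reachable
        print(cursor.fetchall())
```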

Step 3: Select the Databricks data to analyze

You select the data to analyze in the Views in connection pane.

  1. Browse the tables available in the cluster.
  2. Add the tables you want as views, which will be the data tables you analyze in Spotfire.
  3. For each view, you can decide which columns to include. If you want to create a very specific and flexible data selection, this dialog gives you access to a range of powerful tools, such as:
    • Custom queries. With custom queries, you select the data to analyze by typing a custom SQL query (see the sketch after this list).
    • Prompting. Leave the data selection to the user of your analysis file. You configure prompts based on columns of your choice. The end user who opens the analysis can then limit the data to relevant values only, for example, a certain span of time or a specific geographic region.
  4. Click OK.
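
As a sketch of the kind of statement you might type in the custom query dialog, and of how you could prototype it before pasting it into Spotfire, the following uses the same databricks-sql-connector package as above; the table and column names are hypothetical, so adjust them to your own schema:

```python
# Prototype a custom query before typing it into Spotfire's custom query
# dialog. The table and column names below are hypothetical examples.
from databricks import sql

CUSTOM_QUERY = """
SELECT o_orderkey, o_custkey, o_orderdate, o_totalprice
FROM orders
WHERE o_orderdate >= DATE '2023-01-01'
"""

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    http_path="sql/protocolv1/o/1234567890123456/0123-456789-abcdefg0",  # placeholder
    access_token="dapi...",  # placeholder
) as connection, connection.cursor() as cursor:
    cursor.execute(CUSTOM_QUERY)
    for row in cursor.fetchmany(5):  # preview the first few rows
        print(row)
```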

Step 4: Push down queries to Databricks or import data

When you have selected the data that you want to analyze, the final step is to choose how you want to retrieve the data from your Databricks cluster. A summary of the data tables you are adding to your analysis is displayed, and you can click each table to change the data loading method.


The default option for Databricks is External. This means the data table is kept in-database in Databricks, and Spotfire pushes queries to the database for the relevant slices of data, based on your actions in the analysis.
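
To make this concrete, below is a rough illustration of the kind of aggregated, filtered SQL an External table can trigger when you filter in the analysis. The exact SQL Spotfire generates will differ, and the names are hypothetical:

```python
# Illustration only: roughly the kind of query an External (in-database)
# table pushes to Databricks when you filter in the analysis. This is not
# the exact SQL Spotfire emits; table and column names are hypothetical.
PUSHED_QUERY = """
SELECT o_orderstatus, SUM(o_totalprice) AS total_price
FROM orders
WHERE o_orderdate BETWEEN DATE '2023-01-01' AND DATE '2023-12-31'
GROUP BY o_orderstatus
"""
```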

You can also select Imported. Spotfire then extracts the entire data table up front, which enables local in-memory analysis. When you import data tables, you can also use the analytical functions in the embedded in-memory data engine of Spotfire.

The third option is On-demand (corresponding to a dynamic WHERE clause), which means that slices of data are extracted based on user actions in the analysis. You define the criteria, which can be actions such as marking or filtering data, or changing document properties. On-demand data loading can also be combined with External data tables.
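
Conceptually, on-demand loading behaves like re-running a parameterized query whenever the controlling user action changes. A minimal sketch of that idea, with hypothetical names and without claiming to show Spotfire's internal mechanism:

```python
# Conceptual sketch of on-demand loading: the WHERE clause is bound to the
# user's current selection (marking, filter, or document property) and the
# query is re-run whenever that selection changes. Names are hypothetical.
def on_demand_query(selected_region: str) -> tuple[str, dict]:
    """Return a parameterized query and its bindings for the current selection."""
    query = (
        "SELECT o_orderkey, o_orderdate, o_totalprice "
        "FROM orders "
        "WHERE region = %(region)s"  # the dynamic WHERE clause
    )
    return query, {"region": selected_region}

# Marking a different region re-executes the query with new bindings,
# so only that slice of data is extracted from Databricks.
query, params = on_demand_query("EMEA")
```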