Create and run Scala and Java JARs on serverless compute

Beta

Serverless Scala and Java jobs are in Beta. You can use JAR tasks to deploy your JAR. If the preview is not already enabled in your workspace, see Manage Databricks previews.

A Java archive (JAR) packages Java or Scala code into a single file. This article shows you how to create a JAR with Spark code and deploy it as a Lakeflow Job on serverless compute.

Tip

For automated deployment and continuous integration workflows, use Databricks Asset Bundles to create a project from a template with pre-configured build and deployment settings. See Build a Scala JAR using Databricks Asset Bundles and Bundle that uploads a JAR file to Unity Catalog. This article instead describes the manual approach, for deploying by hand or for learning how JARs work with serverless compute.

Requirements

Your local development environment must have the following:

sbt 1.11.7 or higher (for Scala JARs)
Maven 3.9.0 or higher (for Java JARs)
JDK, Scala, and Databricks Connect versions that match your serverless environment (this example uses JDK 17, Scala 2.13.16, and Databricks Connect 17.0.1)
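
As a quick local sanity check for the JDK requirement, the following minimal sketch prints the feature version of the JVM it runs on and warns if it differs from the version used in this article's example. The class name `JdkVersionCheck` and the constant `EXPECTED_FEATURE_VERSION` are hypothetical names introduced for illustration; adjust the expected value to whatever your serverless environment requires.

```java
public class JdkVersionCheck {
    // Assumption: 17 is the JDK version used by this article's example.
    static final int EXPECTED_FEATURE_VERSION = 17;

    /** Returns true when the actual feature version equals the expected one. */
    static boolean matches(int actual, int expected) {
        return actual == expected;
    }

    public static void main(String[] args) {
        // Runtime.version().feature() returns the major version, for example 17.
        int actual = Runtime.version().feature();
        System.out.println("Running on JDK feature version " + actual);
        if (!matches(actual, EXPECTED_FEATURE_VERSION)) {
            System.out.println("Warning: this example expects JDK " + EXPECTED_FEATURE_VERSION);
        }
    }
}
```

You can run it as a single-file program with `java JdkVersionCheck.java`.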

Step 1. Build a JAR

Follow the instructions for your language: Scala (built with sbt) first, then Java (built with Maven).

Scala

Run the following command to create a new Scala project:
Bash
```
sbt new scala/scala-seed.g8
```
When prompted, enter a project name, for example, my-spark-app.

Replace the contents of your build.sbt file with the following:

```
scalaVersion := "2.13.16"
libraryDependencies += "com.databricks" %% "databricks-connect" % "17.0.1"
// other dependencies go here...

// Forking is required for the JVM options below to take effect. Without it,
// the code runs inside the sbt process and inherits that process's options.
fork := true
javaOptions += "--add-opens=java.base/java.nio=ALL-UNNAMED"
```

Edit or create a project/assembly.sbt file, and add this line:
```
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.3.1")
```

Create your main class in src/main/scala/com/examples/SparkJar.scala:

Scala
```
package com.examples

import org.apache.spark.sql.SparkSession

object SparkJar {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().getOrCreate()

    // Prints the arguments to the class, which
    // are job parameters when run as a job:
    println(args.mkString(", "))

    // Shows using spark:
    println(spark.version)
    println(spark.range(10).limit(3).collect().mkString(" "))
  }
}
```

To build your JAR file, run the following command:
Bash
```
sbt assembly
```

Java

Run the following commands to create a new Maven project structure:

Bash
```
# Create all directories at once
mkdir -p my-spark-app/src/main/java/com/examples

cd my-spark-app
```

Create a pom.xml file in the project root with the following contents:

XML
```
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
         http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.examples</groupId>
  <artifactId>my-spark-app</artifactId>
  <version>1.0-SNAPSHOT</version>

  <properties>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
    <scala.binary.version>2.13</scala.binary.version>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <!-- Databricks Connect for Spark -->
    <dependency>
      <groupId>com.databricks</groupId>
      <artifactId>databricks-connect_${scala.binary.version}</artifactId>
      <version>17.0.1</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <!-- Maven Shade Plugin - Creates fat JAR with all dependencies -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.6.1</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <mainClass>com.examples.SparkJar</mainClass>
                </transformer>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```

Create your main class in src/main/java/com/examples/SparkJar.java:

Java
```
package com.examples;

import org.apache.spark.sql.SparkSession;
import java.util.stream.Collectors;

public class SparkJar {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder().getOrCreate();

    // Prints the arguments to the class, which
    // are job parameters when run as a job:
    System.out.println(String.join(", ", args));

    // Shows using spark:
    System.out.println(spark.version());
    System.out.println(
      spark.range(10).limit(3).collectAsList().stream()
        .map(Object::toString)
        .collect(Collectors.joining(" "))
    );
  }
}
```

To build your JAR file, run the following command:
Bash
```
mvn clean package
```
The compiled JAR is located in the target/ directory as my-spark-app-1.0-SNAPSHOT.jar.
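
Before uploading, it can help to confirm that the JAR manifest names the main class configured in the shade plugin's mainClass element. The following is a minimal sketch that reads the Main-Class attribute using the standard `java.util.jar` API. The class name `JarMainClassCheck` and the helper `mainClassOf` are hypothetical names introduced for illustration, and the default path assumes the Maven output location described above.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class JarMainClassCheck {
    /** Returns the Main-Class manifest attribute of the JAR, or null if it is absent. */
    static String mainClassOf(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null; // The JAR has no META-INF/MANIFEST.MF entry.
            }
            return manifest.getMainAttributes().getValue("Main-Class");
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: the Maven build output from this article's example.
        String path = args.length > 0 ? args[0] : "target/my-spark-app-1.0-SNAPSHOT.jar";
        System.out.println("Main-Class: " + mainClassOf(path));
    }
}
```

Run it from the project root with `java JarMainClassCheck.java`, optionally passing a different JAR path as the first argument. For the Maven example, the value should be com.examples.SparkJar if the shade configuration was applied.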

Step 2. Create a job to run the JAR

1. In your workspace, click Jobs & Pipelines in the sidebar.
2. Click Create, then Job.

   The Tasks tab displays with the empty task pane.

   Note: If the Lakeflow Jobs UI is ON, click the JAR tile to configure the first task. If the JAR tile is not available, click Add another task type and search for JAR.
3. Optionally, replace the name of the job, which defaults to New Job <date-time>, with your job name.
4. In Task name, enter a name for the task, for example JAR_example.
5. If necessary, select JAR from the Type drop-down menu.
6. For Main class, enter the fully qualified name (package and class) of the main class in your JAR. If you followed the example above, enter com.examples.SparkJar.
7. For Compute, select Serverless.
8. Configure the serverless environment:
   1. Choose an environment, then click Edit to configure it.
   2. Select 4-scala-preview for the Environment version.
   3. Add your JAR file by dragging and dropping it into the file selector, or browse to select it from a Unity Catalog volume or workspace location.
9. For Parameters, for this example, enter ["Hello", "World!"].
10. Click Create task.
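
The job parameters reach your main class as the args array of main. The following minimal sketch isolates the argument-printing line from the example main class so you can run it locally without Spark. The class name `ParamsDemo` and the helper `format` are hypothetical names introduced for illustration, and the fallback array stands in for the Parameters value entered above.

```java
public class ParamsDemo {
    /** Formats job parameters the same way the example main class prints them. */
    static String format(String[] args) {
        return String.join(", ", args);
    }

    public static void main(String[] args) {
        // Stand-in for the Parameters field value ["Hello", "World!"]
        // when no command-line arguments are supplied.
        String[] params = args.length > 0 ? args : new String[] {"Hello", "World!"};
        System.out.println(format(params)); // prints "Hello, World!" for the stand-in values
    }
}
```

Each element of the parameter list becomes one element of args, so the example's first output line joins them with a comma and a space.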

Step 3. Run the job and view the job run details

Click Run now to run the workflow. To view details for the run, click View run in the Triggered run pop-up, or click the link in the Start time column for the run in the job runs view.

When the run completes, the output displays in the Output panel, including the arguments passed to the task.

Next steps

To learn more about JAR tasks, see JAR task for jobs.
To learn more about creating a compatible JAR, see Create a Databricks compatible JAR.
To learn more about creating and running jobs, see Lakeflow Jobs.
