%md
# **Run C/C++ code on Databricks**
This notebook shows how to compile C/C++ code and run it on a Spark cluster in Databricks.
%md
### Setup: Write/Copy C/C++ code to DBFS.
Write or copy your code to DBFS so that it can later be copied onto the Spark driver and compiled there.
For this simple example, the program could have been written directly to the local disk of the Spark driver, but copying to DBFS first makes more sense if you have a large number of C/C++ files.
# This is a very simple test program
dbutils.fs.put("dbfs:/tmp/simple.c",
"""
#include <stdio.h>

int main (int argc, char *argv[]) {
  char str[100];
  while (1) {
    if (!fgets(str, 100, stdin)) {
      return 0;
    }
    printf("Hello, %s", str);
  }
}
""", True)
Wrote 182 bytes.
Out[1]: True
# Verify the program was written correctly.
print(dbutils.fs.head("dbfs:/tmp/simple.c"))
#include <stdio.h>

int main (int argc, char *argv[]) {
  char str[100];
  while (1) {
    if (!fgets(str, 100, stdin)) {
      return 0;
    }
    printf("Hello, %s", str);
  }
}
%md
### Step 1: Compile the C/C++ code for the Spark machines.
# Copy the file to the local disk of the Spark driver, so it can be compiled.
dbutils.fs.cp("dbfs:/tmp/simple.c", "file:/tmp/simple.c")
Out[3]: True
# Delete any previously compiled binary, if one exists.
# dbutils.fs.rm returns False when there was nothing to delete.
dbutils.fs.rm("file:/tmp/simple")
Out[4]: False
# Compile the C/C++ code to a binary.
import os
os.system("/usr/bin/gcc -o /tmp/simple /tmp/simple.c")
Out[5]: 0
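Note that os.system returns only the exit status (the 0 above means the compile succeeded), so a failed build is easy to miss. If you want compiler diagnostics to show up in the notebook, a minimal sketch using subprocess for the same gcc invocation:

import subprocess

# Invoke gcc and capture stderr so compiler errors are visible.
proc = subprocess.Popen(
    ["/usr/bin/gcc", "-o", "/tmp/simple", "/tmp/simple.c"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()
if proc.returncode != 0:
    # Surface the compiler diagnostics in the notebook output.
    print(err)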
# Check for the binary.
display(dbutils.fs.ls("file:/tmp/simple"))
| path             | name   | size |
|------------------|--------|------|
| file:/tmp/simple | simple | 8720 |
# Copy the binary to DBFS, so it will be accessible to all Spark worker nodes.
dbutils.fs.cp("file:/tmp/simple", "dbfs:/tmp/simple")
Out[7]: True
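On Databricks clusters, DBFS is also exposed through a local FUSE mount at /dbfs, which the distribution sketch below relies on. A quick sanity check from the driver (a sketch; the path simply mirrors the copy above):

import os

# dbfs:/tmp/simple should be visible at /dbfs/tmp/simple via the FUSE mount.
print(os.path.exists("/dbfs/tmp/simple"))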
%md
### Step 2: Write the binary to all the Spark worker nodes.
Alternatively, you could use init scripts to do this as well, but you'll have to call the DBFS library directly. One way to do it from the notebook is sketched below.
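A minimal sketch of the distribution step, assuming the /dbfs FUSE mount shown above; numWorkerTasks is a hypothetical value that should be chosen large enough that every worker node runs at least one copy task:

import subprocess

def copy_binary_to_local(_):
    # Runs on a worker node: copy the compiled binary from the DBFS
    # FUSE mount to local disk and make it executable.
    subprocess.call(["cp", "/dbfs/tmp/simple", "/tmp/simple"])
    subprocess.call(["chmod", "755", "/tmp/simple"])
    return 0

# numWorkerTasks is an assumption: use enough tasks that every
# worker in the cluster executes the copy at least once.
numWorkerTasks = 32
sc.parallelize(range(numWorkerTasks), numWorkerTasks).map(copy_binary_to_local).count()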