Neo4j is a native graph database that leverages data relationships as first-class entities. You can connect a cluster in Databricks to a Neo4j cluster using the neo4j-spark-connector, which offers Spark APIs for RDD, DataFrame, GraphX and GraphFrames. The neo4j-spark-connector uses the binary Bolt protocol to transfer data to and from the Neo4j server.
Neo4j can be deployed on various cloud providers.
Make sure the Neo4j password has been changed from the default (you are prompted to change it the first time you access Neo4j), and modify conf/neo4j.conf so that the server accepts remote connections. For more information, see Configuring Neo4j Connectors.
# conf/neo4j.conf

# Bolt connector
dbms.connector.bolt.enabled=true
#dbms.connector.bolt.tls_level=OPTIONAL
dbms.connector.bolt.listen_address=0.0.0.0:7687

# HTTP Connector. There must be exactly one HTTP connector.
dbms.connector.http.enabled=true
#dbms.connector.http.listen_address=0.0.0.0:7474

# HTTPS Connector. There can be zero or one HTTPS connectors.
dbms.connector.https.enabled=true
#dbms.connector.https.listen_address=0.0.0.0:7473
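After restarting Neo4j with the Bolt listen address shown above, it can help to confirm that port 7687 is reachable from the machine that will run Spark (for example, from a notebook cell). The following is a minimal sketch using only the Python standard library. The function name `bolt_port_reachable` and the host name in the usage comment are illustrative assumptions, not part of Neo4j or the connector. It only checks that a TCP connection can be opened; it does not perform a Bolt handshake or verify credentials.

```python
import socket


def bolt_port_reachable(host: str, port: int = 7687, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This is a plain TCP check: it does not speak the Bolt protocol
    and does not test the Neo4j user name or password.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example usage (hypothetical host name; substitute your Neo4j server's address):
#   bolt_port_reachable("neo4j.example.internal")
```

If the check returns False, likely causes include a listen address still bound to localhost, a firewall or security group blocking port 7687, or a Neo4j server that has not been restarted since the configuration change.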
If your Neo4j cluster is running in AWS and you’d like to use private IPs, see the VPC Peering guide.