Connect to John Snow Labs

John Snow Labs provides production-grade, scalable, and trainable versions of the latest research in natural language processing (NLP) through the following products:

  • Spark NLP: state-of-the-art NLP for Python, Java, or Scala.

  • Spark NLP for Healthcare: state-of-the-art clinical and biomedical NLP.

  • Spark OCR: a scalable, private, and highly accurate OCR and de-identification library.

You can integrate your Databricks clusters with John Snow Labs.

Note

John Snow Labs does not integrate with Databricks SQL warehouses (formerly Databricks SQL endpoints).

Connect to John Snow Labs using Partner Connect

To support the most popular NLP and OCR tasks, the Partner Connect steps do the following:

  • Create a new cluster in your Databricks workspace.

  • Automatically install John Snow Labs NLP and OCR libraries on the new cluster.

  • Create and deploy a 30-day trial license for John Snow Labs NLP and OCR libraries.

  • Copy 20+ ready-to-use Python notebooks to the new cluster.
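After Partner Connect finishes, you might want to confirm from a notebook attached to the new cluster that the libraries are importable. The following is a minimal sketch, not an official John Snow Labs check. The import names `sparknlp`, `sparknlp_jsl`, and `sparkocr` are assumptions about how the three products are packaged for Python, and `check_installed` is a hypothetical helper name.

```python
import importlib.util

# Assumed Python import names for each product; adjust if your
# installation exposes the libraries under different module names.
JSL_MODULES = {
    "Spark NLP": "sparknlp",
    "Spark NLP for Healthcare": "sparknlp_jsl",
    "Spark OCR": "sparkocr",
}


def check_installed(modules):
    """Return {label: True/False} depending on whether each module can be found."""
    return {
        label: importlib.util.find_spec(module_name) is not None
        for label, module_name in modules.items()
    }


# Report the status of each library without importing it.
for product, found in check_installed(JSL_MODULES).items():
    print(f"{product}: {'installed' if found else 'not found'}")
```

Because `find_spec` only locates a module and does not import it, this check does not start a Spark session or validate the trial license.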

Differences between standard connections and John Snow Labs

To connect to John Snow Labs using Partner Connect, you follow the steps in Connect to a machine learning partner using Partner Connect. The John Snow Labs connection is different from standard machine learning connections in the following ways:

  • To complete the Partner Connect steps, you need a valid credit card. Your credit card is subject to pay-as-you-go charges that begin after the trial ends.

  • After you follow the on-screen instructions to start your John Snow Labs NLP trial, check your email inbox for a message from John Snow Labs that explains how to get started, and then follow the instructions in that message. The message can take up to half an hour to arrive.

Steps to connect

To connect your Databricks workspace to John Snow Labs using Partner Connect, see Connect to a machine learning partner using Partner Connect.

Connect to John Snow Labs manually

Follow these instructions to automatically install the John Snow Labs NLP and OCR libraries and notebooks on your cluster, and to activate your trial of John Snow Labs if you do not already have a John Snow Labs account.

Requirements

Before you integrate with John Snow Labs, you must have the following:

  • A cluster in your Databricks workspace.

  • A Databricks personal access token.

Procedure

To integrate with John Snow Labs, complete these steps:

  1. Make sure you meet the requirements for John Snow Labs.

  2. Go to the John Snow Labs NLP on Databricks webpage.

  3. Click Install in my Databricks account.

  4. In the Please tell us about yourself dialog, enter your first name, last name, and company email address.

  5. For Databricks instance url, enter your Databricks workspace URL, for example https://dbc-a1b2345c-cloud.databricks.com/?o=1234567890123456.

  6. For Databricks access token, enter your Databricks personal access token value from the requirements.

  7. Click Test connection.

  8. After the connection succeeds, for Choose a cluster to install on, select the cluster from the requirements.

  9. Click Get Trial License.

  10. Check your email inbox for a message from John Snow Labs that contains a request to validate your email address.

  11. In the message, click Validate my email.

  12. After several minutes, check your email inbox again for another message from John Snow Labs that explains how to get started. In some cases, this message can take up to half an hour to arrive.

  13. Follow the instructions in the message.

    Note

    To manually install the John Snow Labs libraries and notebooks on your cluster, see the installation instructions on the John Snow Labs website.

  14. To upgrade your trial of John Snow Labs, sign in to your John Snow Labs account at https://my.johnsnowlabs.com/login.

  15. Continue with Next steps.
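If Test connection fails, it can help to check the workspace URL and personal access token yourself before retrying. The following is a minimal sketch that calls the Databricks Clusters API (`GET /api/2.0/clusters/list`) with the token as a bearer credential. It is not what the John Snow Labs page does internally, and the function names are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request
from urllib.parse import urlsplit


def workspace_base_url(workspace_url):
    """Reduce a workspace URL to scheme and host, dropping any path or ?o=... query."""
    parts = urlsplit(workspace_url.strip())
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"expected an https workspace URL, got {workspace_url!r}")
    return f"https://{parts.netloc}"


def build_clusters_request(workspace_url, token):
    """Build (but do not send) an authenticated GET request for the Clusters API."""
    url = workspace_base_url(workspace_url) + "/api/2.0/clusters/list"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def list_clusters(workspace_url, token):
    """Send the request; an HTTP error here usually points to a bad URL or token."""
    request = build_clusters_request(workspace_url, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Keep only the fields needed to pick a cluster in the install dialog.
    return [
        {"id": c["cluster_id"], "name": c.get("cluster_name"), "state": c.get("state")}
        for c in payload.get("clusters", [])
    ]
```

Calling `list_clusters` with your workspace URL and token from a machine that can reach the workspace returns one entry per cluster. If the cluster from the requirements appears in that list, the same URL and token should be usable in steps 5 and 6.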