LZO compressed file

Due to licensing restrictions, the LZO compression codec is not available by default on Databricks clusters. To read an LZO compressed file, you must use an init script to install the codec on your cluster at launch time.

Notebook example: Init LZO compressed files

The following notebook:

  • Builds the LZO codec.

  • Creates an init script that:

    • Installs the LZO compression libraries and the lzop command, and copies the LZO codec to the proper class path.

    • Configures Spark to use the LZO compression codec.
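
The steps above can be sketched in Python as a function that assembles the init script text. This is a minimal illustration, not the notebook's actual script: the JAR location, the class path directory, the Spark configuration file path and its syntax, and the Ubuntu package names are all assumptions to check against your own runtime. The codec class names come from the open source hadoop-lzo project.

```python
# Hypothetical locations, chosen only for illustration.
CODEC_JAR = "/dbfs/FileStore/jars/hadoop-lzo.jar"        # where the built codec JAR is assumed to live
CLASSPATH_DIR = "/databricks/jars"                        # assumed class path directory on the cluster
SPARK_CONF = "/databricks/driver/conf/00-custom-lzo.conf" # assumed Spark configuration file

LZO_CODECS = "com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec"


def build_init_script(codec_jar=CODEC_JAR, classpath_dir=CLASSPATH_DIR, spark_conf=SPARK_CONF):
    """Return the text of a bash init script that installs and configures LZO."""
    lines = [
        "#!/bin/bash",
        "set -e",
        "# Install the LZO libraries and the lzop command (Ubuntu package names assumed).",
        "apt-get update",
        "apt-get install -y liblzo2-dev lzop",
        "# Copy the previously built codec JAR onto the class path.",
        f"cp {codec_jar} {classpath_dir}/",
        "# Configure Spark to use the LZO codec (file location and syntax assumed).",
        f"cat > {spark_conf} <<'EOF'",
        "[driver] {",
        f'  "spark.hadoop.io.compression.codecs" = "{LZO_CODECS}"',
        '  "spark.hadoop.io.compression.codec.lzo.class" = "com.hadoop.compression.lzo.LzoCodec"',
        "}",
        "EOF",
    ]
    return "\n".join(lines) + "\n"
```

In a Databricks notebook, the returned text could be saved with `dbutils.fs.put(path, build_init_script(), True)`, where `path` is wherever your workspace stores init scripts, and the script then attached to the cluster configuration.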

Init LZO compressed files notebook

Notebook example: Read LZO compressed files

The following notebook reads LZO compressed files using the codec installed by the init script:

Read LZO compressed files notebook
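
As a rough sketch of what reading looks like once the codec is installed, the helper below loads an LZO compressed text file through PySpark's `SparkContext.newAPIHadoopFile`. It assumes the hadoop-lzo JAR (which provides `com.hadoop.mapreduce.LzoTextInputFormat`) is on the class path. The function name `read_lzo_text` is hypothetical, and the notebook itself may take a different approach.

```python
def read_lzo_text(sc, path):
    """Return an RDD of text lines from an LZO compressed file.

    sc   -- an active SparkContext (predefined as `sc` in Databricks notebooks)
    path -- location of the .lzo file
    """
    records = sc.newAPIHadoopFile(
        path,
        "com.hadoop.mapreduce.LzoTextInputFormat",  # input format from hadoop-lzo
        "org.apache.hadoop.io.LongWritable",        # key: byte offset of the line
        "org.apache.hadoop.io.Text",                # value: the line itself
    )
    # Drop the byte offsets and keep only the line contents.
    return records.map(lambda kv: kv[1])
```

For example, `read_lzo_text(sc, "/mnt/data/sample.lzo").take(5)` would return the first five lines, where the path is a placeholder for your own file.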
