Reading LZO Compressed Files

The LZO compression codec is not available by default for licensing reasons. To use it, you must install it with an init script that runs when the cluster launches. Read about Init Scripts before continuing.

The init script that you create will:

  • Install lzo.
  • Copy hadoop-lzo.jar to the proper class path.
  • Configure Spark to use the LZO compression codec.
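
As a rough illustration of the three steps above, the following Python sketch assembles the text of such an init script. It is not the official script. The function name build_lzo_init_script and all three path arguments are hypothetical placeholders, the apt package names assume a Debian/Ubuntu-based image, and the Spark properties assume the codec class names shipped in the hadoop-lzo project. Check each of these against your own cluster before using it.

```python
def build_lzo_init_script(jar_source, jar_dest_dir, spark_conf_path):
    """Return the text of a bash init script covering the three steps.

    All three arguments are hypothetical placeholders:
      jar_source      -- where hadoop-lzo.jar currently lives
      jar_dest_dir    -- a directory that is on the cluster's class path
      spark_conf_path -- the Spark defaults file read at startup
    """
    # Codec list handed to Hadoop through Spark's "spark.hadoop." prefix.
    # The com.hadoop.compression.lzo.* names assume the hadoop-lzo project.
    codecs = ",".join([
        "org.apache.hadoop.io.compress.DefaultCodec",
        "org.apache.hadoop.io.compress.GzipCodec",
        "com.hadoop.compression.lzo.LzoCodec",
        "com.hadoop.compression.lzo.LzopCodec",
    ])
    lzo_class = "com.hadoop.compression.lzo.LzoCodec"

    lines = [
        "#!/bin/bash",
        "set -e",
        "",
        "# Step 1: install lzo (assumes a Debian/Ubuntu-based image).",
        "apt-get update",
        "apt-get install -y liblzo2-dev lzop",
        "",
        "# Step 2: copy hadoop-lzo.jar onto the class path.",
        f"cp {jar_source} {jar_dest_dir}/",
        "",
        "# Step 3: configure Spark to use the LZO compression codec.",
        f"echo 'spark.hadoop.io.compression.codecs {codecs}' >> {spark_conf_path}",
        f"echo 'spark.hadoop.io.compression.codec.lzo.class {lzo_class}' >> {spark_conf_path}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example paths only; substitute the locations used by your cluster.
    print(build_lzo_init_script(
        "/tmp/hadoop-lzo.jar",
        "/opt/jars",
        "/opt/spark/conf/spark-defaults.conf",
    ))
```

Generating the script as a string keeps the steps easy to review before the text is written to wherever your cluster expects init scripts to live.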