Model inference using TensorFlow and TensorRT

The notebook in this article demonstrates the Databricks-recommended deep learning inference workflow with TensorFlow and TensorRT. This example shows how to optimize a trained ResNet-50 model with TensorRT for model inference.

NVIDIA TensorRT is a high-performance inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT is installed in the GPU-enabled version of Databricks Runtime 7.0 (Unsupported) and above.
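As a rough illustration of the optimization step, the following minimal sketch converts a TensorFlow SavedModel with the TF-TRT converter that ships with TensorFlow 2.x. It is not taken from the notebook: the function name `optimize_with_tftrt`, the `SUPPORTED_PRECISIONS` tuple, and the directory arguments are hypothetical, and the sketch assumes a GPU-enabled cluster where TensorFlow was built with TensorRT support. INT8 is left out because it requires a calibration step that is not shown here.

```python
# Hypothetical helper, not part of the Databricks notebook.
SUPPORTED_PRECISIONS = ("FP32", "FP16")  # INT8 omitted: it needs calibration data


def optimize_with_tftrt(input_dir: str, output_dir: str, precision: str = "FP16") -> str:
    """Convert the SavedModel in input_dir with TF-TRT and save it to output_dir."""
    if precision not in SUPPORTED_PRECISIONS:
        raise ValueError(
            f"precision must be one of {SUPPORTED_PRECISIONS}, got {precision!r}"
        )

    # Imported lazily so that argument validation works without TensorFlow installed.
    # Assumption: TensorFlow 2.x with TensorRT support on a GPU machine.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    # Start from the default conversion parameters and override the precision.
    params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode=precision)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=input_dir,
        conversion_params=params,
    )
    converter.convert()         # replace supported subgraphs with TensorRT ops
    converter.save(output_dir)  # write the optimized SavedModel
    return output_dir
```

A typical call might look like `optimize_with_tftrt("/tmp/resnet50_saved_model", "/tmp/resnet50_trt")`, where both paths are placeholders. The optimized model can then be loaded with `tf.saved_model.load` like any other SavedModel. Newer TensorFlow releases may also accept `precision_mode` directly as a constructor argument, so check the converter's signature for the version installed in your runtime.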

Databricks recommends that you use the G4 instance type series, which is optimized for deploying machine learning models in production.

Model inference TensorFlow-TensorRT notebook

