
Document Parsing

Preview

This feature is in Public Preview and is HIPAA compliant.

Document Parsing uses state-of-the-art research techniques to extract and visualize structured data from a wide range of document types, including but not limited to PDFs, images, Word documents (DOC/DOCX), and PowerPoint files (PPT/PPTX). It's designed to handle complex layouts such as tables, charts, and mixed text-image content.

Document Parsing is built on the ai_parse_document function and includes a UI that allows you to parse documents and immediately inspect their structure through formatted text or structured JSON outputs.

Requirements

Parse documents

Use Document Parsing to parse your documents and visualize their structure.

  1. Go to Agents in the left navigation pane of your workspace.
  2. Click Create Agent > Document Parsing.
  3. Select your source document. You can choose to upload a file or select one from an existing Unity Catalog catalog. Supported formats include: PDF, images, DOC/DOCX, and PPT/PPTX.
  4. Click Parse document.

Parsing your document can take a few minutes. When complete, Document Parsing shows the source document on the left and the parsed document on the right. You can choose to view the parsed document as Formatted text or Raw JSON.

Document parsing UI showing source and parsed document side by side

Process and query results

To view the ai_parse_document query and run it on more documents, click Use Agent and choose to run the query from either the SQL Editor or a notebook. You can edit the query to point to the volume or table where your documents are stored.
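As a rough illustration of what editing the query for a different location might look like, the sketch below builds a SQL statement that applies ai_parse_document to every file under a path. It is not the query the UI generates. The helper name build_parse_query and the volume path are hypothetical, and reading files with read_files and the binaryFile format is an assumption about how the documents are loaded.

```python
def build_parse_query(source_path: str) -> str:
    """Return a SQL statement that parses the files under source_path.

    Hypothetical helper for illustration; the query generated by the
    Document Parsing UI may look different.
    """
    # Reject quotes rather than guess at SQL escaping rules.
    if "'" in source_path:
        raise ValueError("source_path must not contain single quotes")
    return (
        "SELECT path, ai_parse_document(content) AS parsed\n"
        # Assumption: documents are read as binary files from a volume path.
        f"FROM read_files('{source_path}', format => 'binaryFile')"
    )


# Hypothetical volume path; replace it with the location of your documents.
query = build_parse_query("/Volumes/main/default/docs/")
print(query)
```

In a notebook, the resulting string could be passed to spark.sql(query). In the SQL Editor, the printed statement can be pasted and run directly.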

Document Parsing provides a UI for the SQL function ai_parse_document. See the ai_parse_document reference page for more advanced examples and details.

Limitations

See ai_parse_document limitations.