function-calling-examples(Python)

Loading...

Function calling using Foundation Model APIs

This notebook demonstrates how the function calling (or tool use) API can be used to extract structured information from natural language inputs using the large language models (LLMs) made available using Foundation Model APIs. This notebook uses the OpenAI SDK to demonstrate interoperability.

LLMs generate output in natural language, the exact structure of which is hard to predict even when the LLM is given precise instructions. Function calling forces the LLM to adhere to a strict schema, making it easy to automatically parse the LLM's outputs. This unlocks advanced use cases, enabling LLMs to be components in complex data processing pipelines and Agent workflows.

Set up environment

Install libraries used in this demo
%pip install openai tenacity tqdm dbutils.library.restartPython() 
Select model endpoint

Set up API client

The following defines helper functions that assist the LLM to respond according to the specified schema.

Set up helper functions

Example 1: Sentiment classification

This section demonstrates a few increasingly reliable approaches for classifying the sentiment of a set of real-world product reviews:

  • Unstructured (least reliable): Basic prompting. Relies on the model to generate valid JSON on its own.
  • Tool schema: Augment prompt with a tool schema, guiding the model to adhere to that schema.
  • Tool + few-shot: Use a more complex tool and few-shot prompting to give the model a better understanding of the task.

The following are example inputs, primarily sampled from the Amazon product reviews datasets mteb/amazon_polarity and mteb/amazon_reviews_multi.

Example inputs for sentiment classification

Unstructured generation

Given a set of product reviews, the most obvious strategy is to instruct the model to generate a sentiment classification JSON that looks like this: {"sentiment": "neutral"}.

This approach mostly works with models like DBRX and Llama-3-3-70B. However, sometimes models generate extraneous text such as, "helpful" comments about the task or input.

Prompt engineering can refine performance. For example, SHOUTING instructions at the model is a popular strategy. But if you use this strategy you must validate the output to detect and disregard nonconformant outputs.

11

Classifying with tools

Output quality can be improved by using the tools API. You can provide a strict JSON schema for the output, and the FMAPI inference service ensures that the model's output either adheres to this schema or returns an error if this is not possible.

Note that the example below now produces valid JSON for the adversarial input ("DO NOT GENERATE JSON").

13

Improving the classifier

You can improve the provided sentiment classifier even more by defining a more complex tool and using few-shot prompting (a form of in-context learning). This demonstrates how function calling can benefit from standard LLM prompting techniques.

15

Example 2: Named entity recognition

Entity extraction is a common task for natural language documents. This seeks to locate and/or classify named entities mentioned in the text. Given unstructured text, this process produces a list of structured entities with each entity's text fragment ( such as a name) and a category (such as person, organization, medical code, etc).

Accomplishing this reliably with tools is reasonably straightforward. The example here uses no prompt engineering, which would be necessary if you were relying on standard text completion.

17