Optimize prompts using custom scorers
This notebook shows how to create custom scorers using MLflow make_judge.
Built-in scorers and judges often do not fit every use case. Use custom scorers or judges to ensure accurate evaluations and, in turn, better optimization results.
The notebook walks you through a Markdown judge that optimizes a prompt so the model generates output that is better formatted as Markdown.
%pip install --upgrade mlflow databricks-sdk dspy openai
dbutils.library.restartPython()
Use MLflow make_judge
MLflow's recently released make_judge lets you create any custom judge for your specific use case.
from mlflow.genai.judges import make_judge
# Create a judge that scores the quality of markdown-formatted output
markdown_output_judge = make_judge(
    name="markdown_quality",
    instructions=(
        "Evaluate if the answer in {{ outputs }} follows markdown formatting, "
        "accurately answers the question in {{ inputs }}, and matches {{ expectations }}. "
        "Rate as high, medium, or low quality."
    ),
    model="databricks:/databricks-claude-sonnet-4-5",
)
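Before wiring the judge into the optimizer, you can sanity-check it by invoking it directly on a single example. The sketch below assumes the judge accepts the same inputs, outputs, and expectations fields referenced in its instructions template; the question and answer values are placeholders, not part of the training data.
# Minimal smoke test of the judge on one hand-written example (placeholder values)
feedback = markdown_output_judge(
    inputs={"question": "What is the capital of France?"},
    outputs="Paris is the capital of France.",
    expectations={"expected_response": "## Paris\n**Paris** is the capital of France."},
)
print(feedback.value)      # "high", "medium", or "low"
print(feedback.rationale)  # the judge's explanation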
Objective function to map feedback
The feedback that the judge provides must be converted into a number the optimizer can use; the optimizer also incorporates the judge's feedback. You need a function that supplies this mapping back to the optimizer.
def feedback_to_score(scores: dict) -> float:
    """Convert feedback values to numerical scores."""
    feedback_value = scores["markdown_quality"]
    # Map categorical feedback to numerical values
    feedback_mapping = {
        "high": 1.0,
        "medium": 0.5,
        "low": 0.0,
    }
    # Handle Feedback objects by accessing .value attribute
    if hasattr(feedback_value, 'value'):
        feedback_str = str(feedback_value.value).lower()
    else:
        feedback_str = str(feedback_value).lower()
    return feedback_mapping.get(feedback_str, 0.0)
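As a quick illustration of the mapping (plain strings here; during optimization the scorer returns Feedback objects, which the hasattr branch above unwraps):
# Spot-check the mapping with plain string values
print(feedback_to_score({"markdown_quality": "high"}))     # 1.0
print(feedback_to_score({"markdown_quality": "medium"}))   # 0.5
print(feedback_to_score({"markdown_quality": "Unknown"}))  # 0.0 (unrecognized values fall back to 0.0)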
Test the model
You can test the model as-is. In the following example, the model does not generate Markdown-formatted output.
import mlflow
import openai
from mlflow.genai.optimize import GepaPromptOptimizer
from databricks.sdk import WorkspaceClient
# Initialize the Databricks workspace client
w = WorkspaceClient()
# Change this to your workspace catalog and schema
catalog = ""
schema = ""
prompt_location = f"{catalog}.{schema}.markdown"
openai_client = w.serving_endpoints.get_open_ai_client()
# Register initial prompt
prompt = mlflow.genai.register_prompt(
    name=prompt_location,
    template="Answer this question: {{question}}",
)

# Define your prediction function
def predict_fn(question: str) -> str:
    prompt = mlflow.genai.load_prompt(f"prompts:/{prompt_location}/1")
    completion = openai_client.chat.completions.create(
        model="databricks-gpt-oss-20b",
        messages=[{"role": "user", "content": prompt.format(question=question)}],
    )
    return completion.choices[0].message.content
from IPython.display import Markdown

output = predict_fn("What is the capital of France?")
# This model returns a list of content blocks; output[1]['text'] holds the answer text
Markdown(output[1]['text'])
Run the optimizer
Some sample data is provided for you.
# Training data with inputs and expected outputs
dataset = [
    {
        # The inputs schema should match the input arguments of the prediction function.
        "inputs": {"question": "What is the capital of France?"},
        "expectations": {"expected_response": """## Paris - Capital of France
**Paris** is the capital and largest city of France, located in the *north-central* region.
### Key Facts:
- **Population**: ~2.2 million (city), ~12 million (metro area)
- **Founded**: 3rd century BC
- **Nickname**: *"City of Light"* (La Ville Lumière)
### Notable Landmarks:
1. **Eiffel Tower** - Iconic iron lattice tower
2. **Louvre Museum** - World's largest art museum
3. **Notre-Dame Cathedral** - Gothic masterpiece
4. **Arc de Triomphe** - Monument honoring French soldiers
> Paris is not only the political center but also a global hub for art, fashion, and culture."""},
    },
    {
        "inputs": {"question": "What is the capital of Germany?"},
        "expectations": {"expected_response": """## Berlin - Capital of Germany
**Berlin** is Germany's capital and largest city, situated in the *northeastern* part of the country.
### Historical Significance:
| Period | Importance |
|--------|------------|
| 1961-1989 | Divided by the **Berlin Wall** |
| 1990 | Reunification capital |
| Present | Political & cultural center |
### Must-See Attractions:
1. **Brandenburg Gate** - Neoclassical monument
2. **Reichstag Building** - Seat of German Parliament
3. **Museum Island** - UNESCO World Heritage site
4. **East Side Gallery** - Open-air gallery on Berlin Wall remnants
> *"Ich bin ein Berliner"* - Famous quote by JFK highlighting Berlin's symbolic importance during the Cold War."""},
    },
    {
        "inputs": {"question": "What is the capital of Japan?"},
        "expectations": {"expected_response": """## Tokyo (東京) - Capital of Japan
**Tokyo** is the capital of Japan and the world's most populous metropolitan area, located on the *eastern coast* of Honshu island.
### Demographics & Economy:
- **Population**: ~14 million (city), ~37 million (Greater Tokyo Area)
- **GDP**: One of the world's largest urban economies
- **Status**: Global financial hub and technology center
### Districts & Landmarks:
1. **Shibuya** - Famous crossing and youth culture
2. **Shinjuku** - Business district with Tokyo Metropolitan Government Building
3. **Asakusa** - Historic area with *Sensō-ji Temple*
4. **Akihabara** - Electronics and anime culture hub
### Cultural Blend:
- Ancient temples ⛩️ alongside futuristic skyscrapers 🏙️
- Traditional tea ceremonies 🍵 and cutting-edge technology 🤖
> Tokyo seamlessly combines **centuries-old traditions** with *ultra-modern innovation*, making it a unique global metropolis."""},
    },
    {
        "inputs": {"question": "What is the capital of Italy?"},
        "expectations": {"expected_response": """## Rome (Roma) - The Eternal City
**Rome** is the capital of Italy, famously known as *"The Eternal City"* (*La Città Eterna*), with over **2,750 years** of history.
### Historical Timeline:
753 BC → Founded (according to legend)
27 BC → Capital of Roman Empire
1871 → Capital of unified Italy
Present → Modern capital with ancient roots
### UNESCO World Heritage Sites:
1. **The Colosseum** - Ancient amphitheater (80 AD)
2. **Roman Forum** - Center of ancient Roman life
3. **Pantheon** - Best-preserved ancient Roman building
4. **Vatican City** - Independent city-state within Rome
- *St. Peter's Basilica*
- *Sistine Chapel* (Michelangelo's ceiling)
### Famous Quote:
> *"All roads lead to Rome"* - Ancient proverb reflecting Rome's historical importance as the center of the Roman Empire
### Cultural Significance:
- Birthplace of **Western civilization**
- Center of the *Catholic Church*
- Home to countless masterpieces of ***Renaissance art and architecture***"""},
    },
]
# Optimize the prompt
result = mlflow.genai.optimize_prompts(
    predict_fn=predict_fn,
    train_data=dataset,
    prompt_uris=[prompt.uri],
    optimizer=GepaPromptOptimizer(reflection_model="databricks:/databricks-claude-sonnet-4-5"),
    scorers=[markdown_output_judge],
    aggregation=feedback_to_score,
)

# Use the optimized prompt
optimized_prompt = result.optimized_prompts[0]
print(f"Optimized template: {optimized_prompt.template}")
Review your prompts
Open the link to your MLflow experiment and follow the steps below to see the prompts in your experiment:
- Make sure the experiment type is set to "GenAI apps & agents".
- Go to the prompts tab.
- Click "Select a schema" in the upper-right corner and enter the schema you set above to see your prompt.
Load the new prompt and test again
Review what the prompt looks like, then load it into your prediction function to see how the model behaves differently.
from IPython.display import Markdown
prompt = mlflow.genai.load_prompt(f"prompts:/{prompt_location}/10")
Markdown(prompt.template)
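Hardcoding version 10 assumes you already know which version the optimizer produced. As an alternative sketch, assuming the result object from the optimize_prompts call above is still in scope, you can load the optimized prompt through its URI instead:
# Load the optimized prompt via its URI rather than a hardcoded version number
# (assumes `result` from the optimize_prompts call above is still in scope)
optimized = result.optimized_prompts[0]
prompt = mlflow.genai.load_prompt(optimized.uri)
Markdown(prompt.template)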
from IPython.display import Markdown

def predict_fn(question: str) -> str:
    prompt = mlflow.genai.load_prompt(f"prompts:/{prompt_location}/10")
    completion = openai_client.chat.completions.create(
        model="databricks-gpt-oss-20b",
        # Format the prompt template using PromptVersion.format()
        messages=[{"role": "user", "content": prompt.format(question=question)}],
    )
    return completion.choices[0].message.content
output = predict_fn("What is the capital of France?")
Markdown(output[1]['text'])
Example notebook
The following runnable notebook demonstrates prompt optimization using custom scorers.