🚀 Phi-4 Model with AIProjectClient 🚀
Phi-4 is a next-generation open model that aims to provide near GPT-4 capabilities at a fraction of the cost, making it ideal for many enterprise or personal use cases. It's especially great for chain-of-thought reasoning and RAG (Retrieval Augmented Generation) scenarios.
In this notebook, you'll see how to:
- Initialize an `AIProjectClient` for your Azure AI Foundry environment.
- Chat with the Phi-4 model using `azure-ai-inference`.
- Walk through a Health & Fitness example, featuring disclaimers and wellness Q&A.
- See how Phi-4 offers strong reasoning as a cheaper alternative to GPT-4. 🏋️
Disclaimer: This is not medical advice. Please consult professionals.
Why Phi-4?
Phi-4 is a 14B-parameter model trained on curated data for high reasoning performance.
- Cost-Effective: Get GPT-4-level performance for many tasks without the GPT-4 price.
- Reasoning & RAG: Perfect for chain-of-thought reasoning steps and retrieval augmented generation workflows.
- Generous Context Window: 16K tokens, enabling more context or longer user conversations.
1. Setup
Below, we'll import the necessary libraries:
- azure-ai-projects: For the `AIProjectClient`.
- azure-ai-inference: For calling your model, specifically chat completions.
- azure-identity: For `DefaultAzureCredential`.

Ensure you have a `.env` file with:

PROJECT_CONNECTION_STRING=<your-conn-string>
SERVERLESS_MODEL_NAME=phi-4

Replace `<your-conn-string>` with your actual Azure AI Foundry project connection string.
import os

from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.inference.models import SystemMessage, UserMessage, AssistantMessage

load_dotenv()

conn_string = os.getenv("PROJECT_CONNECTION_STRING")
phi4_deployment = os.getenv("SERVERLESS_MODEL_NAME", "phi-4")

try:
    project_client = AIProjectClient.from_connection_string(
        credential=DefaultAzureCredential(),
        conn_str=conn_string,
    )
    print("✅ AIProjectClient created successfully!")
except Exception as e:
    print("❌ Error creating AIProjectClient:", e)
2. Chat with Phi-4 💬
We'll demonstrate a simple conversation using Phi-4 in a health & fitness context. We'll define a system prompt that clarifies the role of the assistant. Then we'll ask some user queries.
Notice that Phi-4 is well-suited for chain-of-thought reasoning. We'll let it illustrate its reasoning steps for fun.
def chat_with_phi4(user_question, chain_of_thought=False):
    """Send a chat request to the Phi-4 model with optional chain-of-thought."""
    # The system message sets the assistant's role and includes a disclaimer:
    system_prompt = (
        "You are a Phi-4 AI assistant, focusing on health and fitness.\n"
        "Remind users that you are not a medical professional, but can provide general info.\n"
    )

    # Optionally instruct the model to show chain-of-thought. (Use carefully in production.)
    if chain_of_thought:
        system_prompt += "Please show your step-by-step reasoning in your answer.\n"

    # Create messages for system + user.
    system_msg = SystemMessage(content=system_prompt)
    user_msg = UserMessage(content=user_question)

    with project_client.inference.get_chat_completions_client() as chat_client:
        response = chat_client.complete(
            model=phi4_deployment,
            messages=[system_msg, user_msg],
            temperature=0.8,  # a bit creative
            top_p=0.9,
            max_tokens=400,
        )

    return response.choices[0].message.content

# Example usage:
question = "I'm training for a 5K. Any tips on a weekly workout schedule?"
answer = chat_with_phi4(question, chain_of_thought=True)
print("🗣️ User:", question)
print("🤖 Phi-4:", answer)
3. RAG-like Example (Stub)
Phi-4 also excels in retrieval augmented generation scenarios, where you provide external context and let the model reason over it. Below is a stub example showing how you'd pass retrieved text as context.
In a real scenario, you'd embed & search for relevant passages, then feed them into the system/user messages; see the retrieval sketch after the stub below.
def chat_with_phi4_rag(user_question, retrieved_doc):
    """Simulate a RAG flow by appending retrieved context to the system prompt."""
    system_prompt = (
        "You are Phi-4, a helpful fitness AI.\n"
        "We have some context from the user's knowledge base:\n"
        f"{retrieved_doc}\n"
        "Please use this context to help your answer. If the context doesn't help, say so.\n"
    )
    system_msg = SystemMessage(content=system_prompt)
    user_msg = UserMessage(content=user_question)

    with project_client.inference.get_chat_completions_client() as chat_client:
        response = chat_client.complete(
            model=phi4_deployment,
            messages=[system_msg, user_msg],
            temperature=0.3,
            max_tokens=300,
        )

    return response.choices[0].message.content

# A dummy doc snippet standing in for real retrieval results:
doc_snippet = (
    "Recommended to run 3 times per week and mix with cross-training.\n"
    "Include rest days or active recovery days for muscle repair."
)

user_q = "How often should I run weekly to prepare for a 5K?"
rag_answer = chat_with_phi4_rag(user_q, doc_snippet)
print("🗣️ User:", user_q)
print("🤖 Phi-4 (RAG):", rag_answer)
4. Wrap-Up & Best Practices
- Chain-of-Thought: Great for debugging or certain QA tasks, but be mindful about revealing chain-of-thought to end users.
- RAG: Use `azure-ai-inference` with retrieval results to ground your answers.
- OpenTelemetry: Optionally integrate `opentelemetry-sdk` and `azure-core-tracing-opentelemetry` for full observability (see the tracing sketch below).
- Evaluate: Use `azure-ai-evaluation` to measure your model's performance.
- Cost & Performance: Phi-4 aims to provide near GPT-4 performance at lower cost, but evaluate it on your own domain before committing.
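Here's a minimal tracing sketch, assuming `opentelemetry-sdk` and `azure-core-tracing-opentelemetry` are installed. It prints spans to the console; in production you'd swap in an exporter such as Azure Monitor's:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter
from azure.core.settings import settings

# Route Azure SDK spans through OpenTelemetry.
settings.tracing_implementation = "opentelemetry"

# Export spans to the console for quick inspection.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

# Subsequent calls through project_client.inference will now emit spans.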
🎉 Congratulations!
You've seen how to:
- Use Phi-4 with `AIProjectClient` and `azure-ai-inference`.
- Create a chat flow with chain-of-thought.
- Stub out a RAG scenario.

Happy hacking! 🏋️