Predict Module
The Predict module is the foundational building block of udspy. It takes a signature and generates outputs from inputs using an LLM.
Overview
Predict is the simplest and most essential module in udspy. It:
- Maps signature inputs to outputs via an LLM
- Supports native tool calling for function execution
- Handles streaming for real-time output generation
- Manages conversation history and message formatting
Basic Usage
from udspy import Predict, Signature, InputField, OutputField
class QA(Signature):
"""Answer questions concisely."""
question: str = InputField()
answer: str = OutputField()
predictor = Predict(QA)
result = predictor(question="What is Python?")
print(result.answer)
String Signatures
For quick prototyping, use string signatures:
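# Equivalent to the QA class above; the "inputs -> outputs" shorthand also
# appears in Common Patterns below
qa = Predict("question -> answer")
result = qa(question="What is Python?")
print(result.answer)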
See Signatures for more details.
Configuration
Model Selection
# Global default
import udspy
from udspy import LM
lm = LM(model="gpt-4o-mini", api_key="sk-...")
udspy.settings.configure(lm=lm)
# Per-module override
predictor = Predict(QA, model="gpt-4o")
Temperature and Sampling
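A minimal sketch, assuming the LM constructor forwards sampling options such as temperature to the underlying OpenAI client (these keyword names are assumptions; check the LM API reference for the supported options):
# Assumed: sampling kwargs are passed through to the OpenAI API
lm = LM(model="gpt-4o-mini", api_key="sk-...", temperature=0.2)
udspy.settings.configure(lm=lm)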
Custom Adapter
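The adapter controls how a signature is turned into messages and how the response is parsed back into output fields (see Under the Hood below). A rough, hypothetical sketch, assuming an Adapter base class and an adapter keyword on Predict (both names are assumptions; see the Signatures docs for the real interface):
# Hypothetical: class and keyword names are assumptions, not documented API
from udspy import Adapter

class MyAdapter(Adapter):
    ...  # customize prompt formatting and response parsing

predictor = Predict(QA, adapter=MyAdapter())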
Tool Calling
Predict supports native OpenAI tool calling:
from udspy import tool
from pydantic import Field
@tool(name="search", description="Search for information")
def search(query: str = Field(...)) -> str:
return f"Results for: {query}"
predictor = Predict(QA, tools=[search])
result = predictor(question="What is the weather in Tokyo?")
# Access tool calls
for call in result.tool_calls:
print(f"Called {call['name']} with {call['arguments']}")
Auto-execution vs Manual
By default, tools are NOT auto-executed. You control execution:
# Auto-execute tools
result = await predictor.aexecute(
question="Search for Python",
auto_execute_tools=True
)
# Manual execution
result = await predictor.aexecute(
question="Search for Python",
auto_execute_tools=False
)
# Tools are returned in result.tool_calls but not executed
See Tool Calling for more details.
Streaming
Stream outputs in real-time:
from udspy import OutputStreamChunk, Prediction

async for event in predictor.aexecute(
stream=True,
question="Explain quantum computing"
):
if isinstance(event, OutputStreamChunk):
print(event.delta, end="", flush=True)
elif isinstance(event, Prediction):
print(f"\n\nFinal: {event.answer}")
Stream Events
The streaming API yields:
- OutputStreamChunk(field: str, delta: str) - Incremental text for a single output field
- Prediction(**outputs) - Final result with all fields
from udspy import OutputStreamChunk, Prediction
async for event in predictor.aexecute(stream=True, **inputs):
match event:
case OutputStreamChunk(field=field, delta=delta):
print(f"[{field}] {delta}", end="")
case Prediction() as pred:
print(f"\n\nComplete: {pred}")
Execution Methods
Async Execution
# With streaming
async for event in predictor.aexecute(stream=True, question="..."):
...
# Without streaming
result = await predictor.aforward(question="...")
Synchronous Execution
Internally, the synchronous call runs async code using asyncio.run() or the current event loop.
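Calling the predictor directly, as in Basic Usage, is the synchronous form:
# Blocking call; runs the same async execution path under the hood
result = predictor(question="What is Python?")
print(result.answer)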
History Management
Predict automatically manages conversation history:
predictor = Predict(QA)
# First call
result1 = predictor(question="What is Python?")
# Second call - includes history
result2 = predictor(question="What are its key features?")
# Access history
for msg in predictor.history:
print(f"{msg.role}: {msg.content}")
# Clear history
predictor.history.clear()
See History API for more details.
Under the Hood
Message Flow
- Signature → Messages: Adapter converts signature to system prompt
- Inputs → User Message: Input fields become user message
- LLM Call: OpenAI API generates response
- Response → Outputs: Parse response into output fields
- History Update: Add messages to history
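For the QA signature above, the request roughly amounts to the following (illustrative only; the exact system prompt wording is produced by the adapter and will differ):
# Illustrative sketch of the constructed chat messages
messages = [
    {"role": "system", "content": "Answer questions concisely. Respond with an answer field."},
    {"role": "user", "content": "question: What is Python?"},
]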
Field Parsing
Outputs are parsed from the LLM response:
- JSON Mode: If response is valid JSON, parse it
- Field Markers: Look for field boundaries like answer: ...
- Fallback: Extract the raw text content
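For example, a marker-style completion for the QA signature might look like this (illustrative; the exact marker format is adapter-defined):
answer: Python is a general-purpose programming language known for its readability.
which is parsed into a Prediction with a single answer field.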
The Prediction object makes all output fields accessible both as attributes and as dictionary keys:
result = predictor(question="...")
print(result.answer) # Access via attribute
print(result["answer"]) # Access via dict key
Error Handling and Retry
Automatic Retry on Parse Errors
Predict automatically retries when LLM responses fail to parse correctly. This handles common issues like:
- Missing or malformed field markers
- Invalid JSON in structured outputs
- Format inconsistencies
Retry behavior:
- Max attempts: 3 (1 initial + 2 retries)
- Backoff strategy: Exponential backoff (0.1s, 0.2s, up to 3s)
- Only retries: AdapterParseError (not network errors or other exceptions)
- Applies to: Both streaming and non-streaming execution
import tenacity

# This will automatically retry if parsing fails
try:
result = predictor(question="...")
except tenacity.RetryError as e:
# All 3 attempts failed
print(f"Failed after retries: {e}")
Why automatic retry?
- LLM format errors are usually transient and succeed on retry
- Reduces boilerplate error-handling code
- Improves reliability without user intervention
See ADR-007: Automatic Retry on Parse Errors for full details.
Error Types
AdapterParseError: LLM response doesn't match the expected format
- Automatically retried up to 3 times
- Check that the signature's output fields match what the LLM is generating

ValidationError: Output doesn't match the Pydantic field types
- Not retried (indicates a signature mismatch)
- Adjust the signature's types or field descriptions

OpenAI API errors: Network issues, rate limits, etc.
- Not retried automatically (use your own retry logic for these)
- Consider exponential backoff for rate limits
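A sketch of handling these separately, assuming a persistent parse failure surfaces as tenacity.RetryError (as above) and a type mismatch propagates as Pydantic's ValidationError:
import tenacity
from pydantic import ValidationError

try:
    result = predictor(question="...")
except tenacity.RetryError:
    # AdapterParseError persisted across all retry attempts:
    # check that the signature's output fields match what the model emits
    raise
except ValidationError:
    # Parsed, but values don't match the declared field types:
    # adjust the signature's types or field descriptions instead of retrying
    raise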
Design Rationale
Why Not Auto-execute Tools by Default?
Tools are not auto-executed by default because:
- Safety: User should control when external functions run
- Flexibility: Sometimes you want to inspect/modify tool calls
- Explicit is better: Makes behavior clear and predictable
Why Native Tool Calling?
Unlike DSPy, which uses custom adapters for tool calling, udspy uses OpenAI's native function calling:
- Simplicity: Less code to maintain
- Performance: Optimized by OpenAI
- Reliability: Well-tested and production-ready
- Features: Access to latest tool calling improvements
See ADR-001: Native Tool Calling for full rationale.
Common Patterns
Simple Q&A
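# One-off question answering with a string signature
qa = Predict("question -> answer")
result = qa(question="What is the capital of France?")
print(result.answer)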
Multi-field Output
analyzer = Predict("text -> summary, sentiment, keywords")
result = analyzer(text="I love this product!")
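
# Each declared output field is available as an attribute
print(result.summary)
print(result.sentiment)
print(result.keywords)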
With Context
contextual = Predict("context, question -> answer")
result = contextual(
context="Python is a programming language",
question="What is it used for?"
)
With Tools
@tool(name="calculator")
def calc(expression: str = Field(...)) -> str:
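    # Note: eval() is used for brevity only; never eval untrusted input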
return str(eval(expression))
math_solver = Predict("problem -> solution", tools=[calc])
result = await math_solver.aexecute(
problem="What is 157 * 234?",
auto_execute_tools=True
)
See Also
- Base Module - Module foundation
- ChainOfThought - Add reasoning
- ReAct - Agent with tool usage
- Signatures - Define inputs/outputs
- Tool Calling - Function calling API
- Streaming - Real-time output