2. Install the SDK

pip install haliosai

Decorator Pattern

Use the @guarded_chat_completion decorator for automatic guardrail evaluation:
from haliosai import guarded_chat_completion
from openai import AsyncOpenAI

@guarded_chat_completion(agent_id="your-agent-id")
async def call_llm(messages):
    """Decorator automatically handles guardrail evaluation"""
    client = AsyncOpenAI()
    return await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        max_tokens=100
    )
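
The decorated function is awaited like any other coroutine; guardrail evaluation happens transparently. A minimal usage sketch, assuming guardrails pass and the raw OpenAI response is returned:
import asyncio

async def main():
    response = await call_llm([{"role": "user", "content": "Hello!"}])
    print(response.choices[0].message.content)

asyncio.run(main())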

Processing Modes

Parallel Processing (Default) - Guardrails run concurrently with the LLM call:
@guarded_chat_completion(agent_id="your-agent-id")
async def parallel_call(messages):
    # Guardrails are evaluated at the same time as the LLM call:
    # faster, but less conservative
    ...
Sequential Processing - Guardrails complete before the LLM call:
@guarded_chat_completion(agent_id="your-agent-id", concurrent_guardrail_processing=False)
async def sequential_call(messages):
    # Guardrails finish before the LLM is called:
    # safer, but potentially slower
    ...
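
For concreteness, here is the same underlying call wrapped in both modes. This is a sketch; the model name and token limit are carried over from the example above:
from haliosai import guarded_chat_completion
from openai import AsyncOpenAI

client = AsyncOpenAI()

@guarded_chat_completion(agent_id="your-agent-id")
async def fast_call(messages):
    # Parallel (default): guardrail evaluation overlaps the LLM request
    return await client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, max_tokens=100
    )

@guarded_chat_completion(agent_id="your-agent-id",
                         concurrent_guardrail_processing=False)
async def safe_call(messages):
    # Sequential: the LLM is only called after request guardrails pass
    return await client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, max_tokens=100
    )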

Context Manager Pattern

Use HaliosGuard for manual control over guardrail evaluation:
from haliosai import HaliosGuard
from openai import AsyncOpenAI

async def process_with_guardrails():
    async with HaliosGuard(agent_id="your-agent-id") as guard:
        client = AsyncOpenAI()
        messages = [{"role": "user", "content": "Hello!"}]
        
        # Evaluate request before LLM call
        req_result = await guard.evaluate(messages, "request")
        if req_result.get("guardrails_triggered", 0) > 0:
            return "Request blocked by guardrails"
        
        # Proceed with LLM call
        response = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            max_tokens=200
        )
        
        assistant_message = response.choices[0].message.content
        
        # Evaluate response after LLM call
        resp_messages = messages + [{"role": "assistant", "content": assistant_message}]
        resp_result = await guard.evaluate(resp_messages, "response")
        if resp_result.get("guardrails_triggered", 0) > 0:
            return "Response blocked by guardrails"
        
        return assistant_message
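
Running the example end to end (the entry point is the only addition):
import asyncio

if __name__ == "__main__":
    print(asyncio.run(process_with_guardrails()))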

Key Features

  • Flexible Processing: Parallel (faster) or sequential (safer) modes
  • Async Support: Full asyncio compatibility with OpenAI client
  • Minimal Changes: Decorator requires almost no code modification
  • Manual Control: Context manager gives full evaluation control
  • Error Handling: Comprehensive error management and logging (see the sketch below)
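
The SDK's specific exception types are not shown in this guide, so the error-handling sketch below catches a generic Exception, reusing call_llm from the decorator example; substitute the concrete exceptions your SDK version raises:
import asyncio
import logging

logger = logging.getLogger(__name__)

async def main():
    try:
        response = await call_llm([{"role": "user", "content": "Hello!"}])
        print(response.choices[0].message.content)
    except Exception:  # replace with the SDK's guardrail exceptions
        logger.exception("Guarded LLM call failed")

asyncio.run(main())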

Trace Context and Span Management

The SDK supports distributed tracing with automatic span creation and propagation. You can pass explicit trace context for observability or let the SDK create it automatically.

Automatic Trace Creation (Default)

If no trace context is provided, the SDK automatically creates traces and spans:
from haliosai import guarded_chat_completion

@guarded_chat_completion(agent_id="your-agent-id")
async def call_llm(messages):
    # SDK automatically creates trace_id and span_id
    # Each call gets its own trace with nested spans for request/response
    return await client.chat.completions.create(...)

Explicit Trace Context

Pass TraceContext to control trace hierarchy and span relationships:
from haliosai import guarded_chat_completion, TraceContext

# Create a trace context for a conversation session
trace_context = TraceContext.create(conversation_id="user-session-123")

@guarded_chat_completion(agent_id="your-agent-id")
async def call_llm(messages, trace_context=None):
    # The decorator picks up the trace_context keyword to maintain trace
    # continuity; the OpenAI client itself does not accept this argument
    return await client.chat.completions.create(...)
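
Reusing one TraceContext across turns keeps a multi-turn exchange under a single trace. A sketch, assuming the decorator records each call as a span within the shared trace:
async def conversation():
    trace_context = TraceContext.create(conversation_id="user-session-123")

    # Both turns share one trace; each call becomes a span within it
    first = await call_llm([{"role": "user", "content": "Hi"}],
                           trace_context=trace_context)
    follow_up = await call_llm([{"role": "user", "content": "Tell me more"}],
                               trace_context=trace_context)
    return first, follow_up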

Advanced Span Control with Context Manager

Use HaliosGuard for fine-grained span management:
from haliosai import HaliosGuard, TraceContext
from openai import AsyncOpenAI

async def llm_call(messages):
    client = AsyncOpenAI()
    return await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )

async def process_conversation():
    # Create a trace for the entire conversation
    trace_context = TraceContext.create(conversation_id="conv-123")
    messages = [{"role": "user", "content": "Hello!"}]

    async with HaliosGuard(agent_id="your-agent-id", trace_context=trace_context) as guard:
        # Start custom spans for different operations
        with guard.start_span("user_input_processing"):
            # Process or normalize the user input
            pass

        with guard.start_span("llm_call"):
            # Run the guarded LLM call inside its own span
            response = await guard.guarded_call_parallel(messages, llm_call)

        return response

Trace Context Parameters

When creating TraceContext, you can specify:
  • trace_id: Custom 32-character hex string for the trace (auto-generated if not provided)
  • conversation_id: Human-readable identifier for grouping related spans
  • span_id: Custom span identifier (auto-generated if not provided)
  • parent_span_id: Parent span ID for nesting this span under an existing one (optional; omit for root spans)
from haliosai import TraceContext

# Fully custom trace context
trace = TraceContext.create(
    trace_id="a1b2c3d4e5f678901234567890abcdef",  # 32-char hex
    conversation_id="user-123-session-456",
    span_id="span-001",  # Custom span ID
    parent_span_id="parent-span-999"  # For nested spans
)
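
The resulting context plugs into either pattern, for example the context manager (a sketch):
from haliosai import HaliosGuard

async def run_with_custom_trace():
    async with HaliosGuard(agent_id="your-agent-id", trace_context=trace) as guard:
        ...  # evaluate requests and responses under the custom trace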

When to Use Explicit Trace Context

  • Multi-turn conversations: Maintain trace continuity across turns
  • Distributed systems: Propagate traces across service boundaries (see the propagation sketch after this list)
  • Custom observability: Control span naming and hierarchy
  • Debugging: Correlate logs and metrics with specific operations
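
For the distributed case, one approach (a sketch, not an official wire format) is to forward the explicit IDs from the example above and rebuild a TraceContext on the receiving service:
from haliosai import TraceContext

# Service A: forward the IDs it created, e.g. as HTTP headers
headers = {
    "x-trace-id": "a1b2c3d4e5f678901234567890abcdef",
    "x-parent-span-id": "span-001",
}

# Service B: rebuild the context so its spans nest under Service A's span
incoming = TraceContext.create(
    trace_id=headers["x-trace-id"],
    parent_span_id=headers["x-parent-span-id"],
)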

Examples

See the SDK examples for different implementation patterns.