Temporal AI Agent - Testing Guide

This directory contains comprehensive tests for the Temporal AI Agent project. The tests cover workflows, activities, and integration scenarios using Temporal's testing framework.

Test Structure

tests/
├── README.md                      # This file - testing documentation
├── conftest.py                    # Test configuration and fixtures
├── test_agent_goal_workflow.py    # Workflow tests
├── test_tool_activities.py        # Activity tests
└── workflowtests/                 # Legacy workflow tests
    └── agent_goal_workflow_test.py

Test Types

1. Workflow Tests (test_agent_goal_workflow.py)

Tests the main AgentGoalWorkflow class, covering:

  • Workflow Initialization: Basic workflow startup and state management
  • Signal Handling: Testing user_prompt, confirm, end_chat signals
  • Query Methods: Testing all workflow query endpoints
  • State Management: Conversation history, goal changes, tool data
  • Validation Flow: Prompt validation and error handling
  • Tool Execution Flow: Confirmation and tool execution cycles

2. Activity Tests (test_tool_activities.py)

Tests the ToolActivities class and dynamic_tool_activity function:

  • LLM Integration: Testing agent_toolPlanner with mocked LLM responses
  • Validation Logic: Testing agent_validatePrompt with various scenarios
  • Environment Configuration: Testing get_wf_env_vars with different env setups
  • JSON Processing: Testing response parsing and sanitization
  • Dynamic Tool Execution: Testing the dynamic activity dispatcher
  • Integration: End-to-end activity execution in Temporal workers
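The JSON sanitization step can be illustrated with a minimal sketch; the helper name and the fence-stripping rule here are assumptions for illustration, not the project's actual implementation:

```python
import json
import re

FENCE = "`" * 3  # build the backtick fence programmatically to keep this example readable

def sanitize_json_response(raw: str) -> dict:
    """Hypothetical helper: strip markdown code fences an LLM may wrap around JSON."""
    cleaned = re.sub(r"^`{3}(?:json)?\s*|\s*`{3}$", "", raw.strip())
    return json.loads(cleaned)

# A fenced LLM reply parses to a plain dict
reply = FENCE + "json\n" + '{"next": "question"}' + "\n" + FENCE
sanitize_json_response(reply)  # {"next": "question"}
```

Unfenced responses pass through the regex untouched, so the same helper handles both cases.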

3. Configuration Tests (conftest.py)

Provides shared test fixtures and configuration:

  • Temporal Environment: Local and time-skipping test environments
  • Sample Data: Pre-configured agent goals, conversation history, inputs
  • Test Client: Configured Temporal client for testing

Running Tests

Prerequisites

Ensure you have the required dependencies installed:

uv sync

Basic Test Execution

Run all tests:

uv run pytest

Run specific test files:

# Workflow tests only
uv run pytest tests/test_agent_goal_workflow.py

# Activity tests only
uv run pytest tests/test_tool_activities.py

# Legacy tests
uv run pytest tests/workflowtests/

Run with verbose output:

uv run pytest -v

Test Environment Options

The tests support different Temporal environments via the --workflow-environment flag:

Local Environment (Default)

Uses a local Temporal test server:

uv run pytest --workflow-environment=local

Time-Skipping Environment

Uses Temporal's time-skipping test environment for faster execution:

uv run pytest --workflow-environment=time-skipping

External Server

Connect to an existing Temporal server:

uv run pytest --workflow-environment=localhost:7233
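A custom flag like this is typically registered by a hook in conftest.py. The sketch below shows the usual pytest pattern; the default value is an assumption, not necessarily the project's exact code:

```python
# Sketch of how conftest.py can register the --workflow-environment flag.
# Standard pytest hook; the default value here is an assumption.
def pytest_addoption(parser):
    parser.addoption(
        "--workflow-environment",
        default="local",
        help="'local', 'time-skipping', or the host:port of an existing Temporal server",
    )
```

A session-scoped fixture can then read the option via `request.config.getoption("--workflow-environment")` and start the matching test environment.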

Setup Script for AI Agent Environments (e.g., OpenAI Codex)

Sandboxed AI coding environments can bootstrap the project and pre-download the Temporal test server with the following script:

export SHELL=/bin/bash
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"
uv sync
cd frontend
npm install
cd ..

# Pre-download the temporal test server binary
uv run python -c "
import asyncio
import sys
from temporalio.testing import WorkflowEnvironment

async def predownload():
    try:
        print('Starting test server download...')
        env = await WorkflowEnvironment.start_time_skipping()
        print('Test server downloaded and started successfully')
        await env.shutdown()
        print('Test server shut down successfully')
    except Exception as e:
        print(f'Error during download: {e}')
        sys.exit(1)

asyncio.run(predownload())
"

Filtering Tests

Run tests by pattern:

# Run only validation tests
uv run pytest -k "validation"

# Run only workflow tests
uv run pytest -k "workflow"

# Run only activity tests
uv run pytest -k "activity"

Run tests by marker (if you add custom markers):

# Run only integration tests
uv run pytest -m integration

# Skip slow tests
uv run pytest -m "not slow"

Test Configuration

Test Discovery

The vibe/ directory is excluded from test collection to avoid conflicts with sample tests. This is configured in pyproject.toml:

[tool.pytest.ini_options]
norecursedirs = ["vibe"]

Environment Variables

Tests respect the following environment variables:

  • LLM_MODEL: Model to use for LLM testing (defaults to "openai/gpt-4")
  • LLM_KEY: API key for LLM service
  • LLM_BASE_URL: Custom base URL for LLM service
  • SHOW_CONFIRM: Whether to show confirmation dialogs
  • AGENT_GOAL: Default agent goal setting

Mocking Strategy

The tests use extensive mocking to avoid external dependencies:

  • LLM Calls: Mocked using unittest.mock to avoid actual API calls
  • Tool Handlers: Mocked to test workflow logic without tool execution
  • Environment Variables: Patched for consistent test environments
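A minimal sketch of both techniques using only the standard library; the response shape mirrors what LLM client libraries typically return, and all names here are illustrative:

```python
import os
from unittest.mock import MagicMock, patch

# Build a stand-in for the LLM client: response.choices[0].message.content
mock_completion = MagicMock()
mock_completion.return_value.choices[0].message.content = '{"next": "question"}'

response = mock_completion(model="openai/gpt-4")
content = response.choices[0].message.content  # '{"next": "question"}'

# Patch environment variables only for the duration of the block;
# the original environment is restored on exit.
with patch.dict(os.environ, {"LLM_MODEL": "openai/gpt-4"}):
    inside = os.environ["LLM_MODEL"]
```

Because `MagicMock` returns the same child object for repeated attribute and index access, the value set before the call is exactly what the code under test reads back.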

Writing New Tests

Test Naming Convention

  • Test files: test_<module_name>.py
  • Test classes: Test<ClassName>
  • Test methods: test_<functionality>_<scenario>

Example:

class TestAgentGoalWorkflow:
    async def test_user_prompt_signal_valid_input(self, client, sample_combined_input):
        # Test implementation
        pass

Using Fixtures

Leverage the provided fixtures for consistent test data:

async def test_my_workflow(self, client, sample_agent_goal, sample_conversation_history):
    # client: Temporal test client
    # sample_agent_goal: Pre-configured AgentGoal
    # sample_conversation_history: Sample conversation data
    pass

Mocking External Dependencies

Always mock external services:

from unittest.mock import patch

@patch('activities.tool_activities.completion')
async def test_llm_integration(self, mock_completion):
    mock_completion.return_value.choices[0].message.content = '{"test": "response"}'
    # Test implementation

Testing Workflow Signals and Queries

async def test_workflow_signal(self, client, sample_combined_input):
    # Start workflow
    handle = await client.start_workflow(
        AgentGoalWorkflow.run,
        sample_combined_input,
        id=str(uuid.uuid4()),
        task_queue=task_queue_name,
    )
    
    # Send signal
    await handle.signal(AgentGoalWorkflow.user_prompt, "test message")
    
    # Query state
    conversation = await handle.query(AgentGoalWorkflow.get_conversation_history)
    
    # End workflow
    await handle.signal(AgentGoalWorkflow.end_chat)
    result = await handle.result()

Test Data and Fixtures

Sample Agent Goal

The sample_agent_goal fixture provides a basic agent goal with:

  • Goal ID: "test_goal"
  • One test tool with a required string argument
  • Suitable for most workflow testing scenarios

Sample Conversation History

The sample_conversation_history fixture provides:

  • Basic user and agent message exchange
  • Proper message format for testing

Sample Combined Input

The sample_combined_input fixture provides:

  • Complete workflow input with agent goal and tool params
  • Conversation summary and prompt queue
  • Ready for workflow execution
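The shape of the sample goal can be sketched roughly like this; the dataclass definitions and the tool name are stand-ins, since the real types live in the project's model code:

```python
from dataclasses import dataclass, field

@dataclass
class ToolArgument:  # stand-in for the project's argument type
    name: str
    type: str
    required: bool = True

@dataclass
class AgentGoal:  # stand-in for the project's goal type
    id: str
    tools: list = field(default_factory=list)

# Mirrors what the sample_agent_goal fixture provides: goal id "test_goal"
# plus one tool with a required string argument (tool name is hypothetical).
sample_goal = AgentGoal(
    id="test_goal",
    tools=[{"name": "test_tool", "args": [ToolArgument("query", "string")]}],
)
```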

Debugging Tests

Verbose Logging

Enable detailed logging:

uv run pytest --log-cli-level=DEBUG -s

Temporal Web UI

When using local environment, access Temporal Web UI at http://localhost:8233 to inspect workflow executions during tests.

Test Isolation

Each test uses unique task queue names to prevent interference:

task_queue_name = str(uuid.uuid4())

Continuous Integration

GitHub Actions Example

name: Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
      - run: uv sync
      - run: uv run pytest --workflow-environment=time-skipping

Test Coverage

Generate coverage reports:

uv add --group dev pytest-cov
uv run pytest --cov=workflows --cov=activities --cov-report=html

Best Practices

  1. Mock External Dependencies: Always mock LLM calls, file I/O, and network requests
  2. Use Time-Skipping: For CI/CD, prefer time-skipping environment for speed
  3. Unique Identifiers: Use UUIDs for workflow IDs and task queues
  4. Clean Shutdown: Always end workflows properly in tests
  5. Descriptive Names: Use clear, descriptive test names
  6. Test Edge Cases: Include error scenarios and validation failures
  7. Keep Tests Fast: Use mocks to avoid slow external calls
  8. Isolate Tests: Ensure tests don't depend on each other

Troubleshooting

Common Issues

  1. Workflow Timeout: Increase timeouts or use time-skipping environment
  2. Mock Not Working: Check patch decorators and import paths
  3. Test Hanging: Ensure workflows are properly ended with signals
  4. Environment Issues: Check environment variable settings

Getting Help

  • Check Temporal Python SDK documentation
  • Review existing test patterns in the codebase
  • Use uv run pytest --collect-only to verify test discovery
  • Run with -v flag for detailed output

Legacy Tests

The workflowtests/ directory contains legacy tests. New tests should be added to the main tests/ directory following the patterns established in this guide.