# Temporal AI Agent - Testing Guide

This directory contains comprehensive tests for the Temporal AI Agent project. The tests cover workflows, activities, and integration scenarios using Temporal's testing framework.

## Test Structure

```
tests/
├── README.md                      # This file - testing documentation
├── conftest.py                    # Test configuration and fixtures
├── test_agent_goal_workflow.py   # Workflow tests
├── test_tool_activities.py       # Activity tests
└── workflowtests/                 # Legacy workflow tests
    └── agent_goal_workflow_test.py
```

## Test Types

### 1. Workflow Tests (`test_agent_goal_workflow.py`)

Tests the main `AgentGoalWorkflow` class, covering:

- **Workflow Initialization**: Basic workflow startup and state management
- **Signal Handling**: Testing the `user_prompt`, `confirm`, and `end_chat` signals
- **Query Methods**: Testing all workflow query endpoints
- **State Management**: Conversation history, goal changes, tool data
- **Validation Flow**: Prompt validation and error handling
- **Tool Execution Flow**: Confirmation and tool execution cycles

### 2. Activity Tests (`test_tool_activities.py`)

Tests the `ToolActivities` class and the `dynamic_tool_activity` function:

- **LLM Integration**: Testing `agent_toolPlanner` with mocked LLM responses
- **Validation Logic**: Testing `agent_validatePrompt` with various scenarios
- **Environment Configuration**: Testing `get_wf_env_vars` with different env setups
- **JSON Processing**: Testing response parsing and sanitization
- **Dynamic Tool Execution**: Testing the dynamic activity dispatcher
- **Integration**: End-to-end activity execution in Temporal workers

### 3. Configuration Tests (`conftest.py`)

Provides shared test fixtures and configuration:

- **Temporal Environment**: Local and time-skipping test environments
- **Sample Data**: Pre-configured agent goals, conversation history, and inputs
- **Test Client**: A configured Temporal client for testing

## Running Tests

### Prerequisites

Ensure you have the required dependencies installed:

```bash
uv sync
```

### Basic Test Execution

Run all tests:

```bash
uv run pytest
```

Run specific test files:

```bash
# Workflow tests only
uv run pytest tests/test_agent_goal_workflow.py

# Activity tests only
uv run pytest tests/test_tool_activities.py

# Legacy tests
uv run pytest tests/workflowtests/
```

Run with verbose output:

```bash
uv run pytest -v
```

### Test Environment Options

The tests support different Temporal environments via the `--workflow-environment` flag:

#### Local Environment (Default)

Uses a local Temporal test server:

```bash
uv run pytest --workflow-environment=local
```

#### Time-Skipping Environment

Uses Temporal's time-skipping test environment for faster execution:

```bash
uv run pytest --workflow-environment=time-skipping
```

#### External Server

Connect to an existing Temporal server:

```bash
uv run pytest --workflow-environment=localhost:7233
```
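These modes are wired up in `conftest.py`. As rough orientation, here is a minimal sketch of how the flag might map onto a `WorkflowEnvironment`; the actual fixture names and scopes in this repo may differ, so treat `conftest.py` as the source of truth:

```python
# Hedged sketch: how --workflow-environment could be wired in conftest.py.
# Fixture names and scopes are assumptions, not this repo's exact code.
import pytest_asyncio
from temporalio.client import Client
from temporalio.testing import WorkflowEnvironment


def pytest_addoption(parser):
    parser.addoption("--workflow-environment", default="local")


@pytest_asyncio.fixture(scope="session")
async def env(request) -> WorkflowEnvironment:
    mode = request.config.getoption("--workflow-environment")
    if mode == "local":
        env = await WorkflowEnvironment.start_local()
    elif mode == "time-skipping":
        env = await WorkflowEnvironment.start_time_skipping()
    else:
        # Any other value is treated as the host:port of an existing server
        env = WorkflowEnvironment.from_client(await Client.connect(mode))
    yield env
    await env.shutdown()


@pytest_asyncio.fixture
def client(env: WorkflowEnvironment) -> Client:
    return env.client
```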
#### Setup Script for AI Agent environments such as OpenAI Codex

```bash
export SHELL=/bin/bash
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"
ls
uv sync
cd frontend
npm install
cd ..

# Pre-download the Temporal test server binary
uv run python -c "
import asyncio
import sys
from temporalio.testing import WorkflowEnvironment

async def predownload():
    try:
        print('Starting test server download...')
        env = await WorkflowEnvironment.start_time_skipping()
        print('Test server downloaded and started successfully')
        await env.shutdown()
        print('Test server shut down successfully')
    except Exception as e:
        print(f'Error during download: {e}')
        sys.exit(1)

asyncio.run(predownload())
"
```

### Filtering Tests

Run tests by pattern:

```bash
# Run only validation tests
uv run pytest -k "validation"

# Run only workflow tests
uv run pytest -k "workflow"

# Run only activity tests
uv run pytest -k "activity"
```

Run tests by marker (if you add custom markers):

```bash
# Run only integration tests
uv run pytest -m integration

# Skip slow tests
uv run pytest -m "not slow"
```

## Test Configuration

### Test Discovery

The `vibe/` directory is excluded from test collection to avoid conflicts with sample tests. This is configured in `pyproject.toml`:

```toml
[tool.pytest.ini_options]
norecursedirs = ["vibe"]
```

### Environment Variables

Tests respect the following environment variables:

- `LLM_MODEL`: Model to use for LLM testing (defaults to "openai/gpt-4")
- `LLM_KEY`: API key for the LLM service
- `LLM_BASE_URL`: Custom base URL for the LLM service
- `SHOW_CONFIRM`: Whether to show confirmation dialogs
- `AGENT_GOAL`: Default agent goal setting

### Mocking Strategy

The tests use extensive mocking to avoid external dependencies:

- **LLM Calls**: Mocked using `unittest.mock` to avoid actual API calls
- **Tool Handlers**: Mocked to test workflow logic without tool execution
- **Environment Variables**: Patched for consistent test environments

## Writing New Tests

### Test Naming Convention

- Test files: `test_<component>.py`
- Test classes: `Test<ComponentName>`
- Test methods: `test_<behavior>_<scenario>`

Example:

```python
class TestAgentGoalWorkflow:
    async def test_user_prompt_signal_valid_input(self, client, sample_combined_input):
        # Test implementation
        pass
```

### Using Fixtures

Leverage the provided fixtures for consistent test data:

```python
async def test_my_workflow(self, client, sample_agent_goal, sample_conversation_history):
    # client: Temporal test client
    # sample_agent_goal: Pre-configured AgentGoal
    # sample_conversation_history: Sample conversation data
    pass
```

### Mocking External Dependencies

Always mock external services:

```python
from unittest.mock import patch

@patch('activities.tool_activities.completion')
async def test_llm_integration(self, mock_completion):
    mock_completion.return_value.choices[0].message.content = '{"test": "response"}'
    # Test implementation
```

### Testing Workflow Signals and Queries

```python
import uuid

async def test_workflow_signal(self, client, sample_combined_input):
    # Start workflow
    handle = await client.start_workflow(
        AgentGoalWorkflow.run,
        sample_combined_input,
        id=str(uuid.uuid4()),
        task_queue=task_queue_name,
    )

    # Send signal
    await handle.signal(AgentGoalWorkflow.user_prompt, "test message")

    # Query state
    conversation = await handle.query(AgentGoalWorkflow.get_conversation_history)

    # End workflow
    await handle.signal(AgentGoalWorkflow.end_chat)
    result = await handle.result()
```

## Test Data and Fixtures

### Sample Agent Goal

The `sample_agent_goal` fixture provides a basic agent goal with:

- Goal ID: "test_goal"
- One test tool with a required string argument
- Suitable for most workflow testing scenarios
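For orientation, a minimal sketch of what such a fixture could look like. The model classes and fields below (`AgentGoal`, `ToolDefinition`, `ToolArgument` and their import path) are assumptions about this repo's models; check `conftest.py` for the real definition:

```python
# Hedged sketch only: model classes, fields, and import path are assumed.
import pytest
from models.tool_definitions import AgentGoal, ToolArgument, ToolDefinition


@pytest.fixture
def sample_agent_goal() -> AgentGoal:
    return AgentGoal(
        id="test_goal",
        tools=[
            ToolDefinition(
                name="test_tool",
                description="A single tool used only in tests",
                arguments=[
                    ToolArgument(
                        name="input",
                        type="string",
                        description="A required string argument",
                    )
                ],
            )
        ],
    )
```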
### Sample Conversation History

The `sample_conversation_history` fixture provides:

- A basic user and agent message exchange
- The proper message format for testing

### Sample Combined Input

The `sample_combined_input` fixture provides:

- Complete workflow input with the agent goal and tool params
- A conversation summary and prompt queue
- Ready for workflow execution

## Debugging Tests

### Verbose Logging

Enable detailed logging:

```bash
uv run pytest --log-cli-level=DEBUG -s
```

### Temporal Web UI

When using the local environment, access the Temporal Web UI at http://localhost:8233 to inspect workflow executions during tests.

### Test Isolation

Each test uses a unique task queue name to prevent interference:

```python
task_queue_name = str(uuid.uuid4())
```

## Continuous Integration

### GitHub Actions Example

```yaml
name: Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
      - run: uv sync
      - run: uv run pytest --workflow-environment=time-skipping
```

### Test Coverage

Generate coverage reports:

```bash
uv add --group dev pytest-cov
uv run pytest --cov=workflows --cov=activities --cov-report=html
```

## Best Practices

1. **Mock External Dependencies**: Always mock LLM calls, file I/O, and network requests
2. **Use Time-Skipping**: For CI/CD, prefer the time-skipping environment for speed
3. **Unique Identifiers**: Use UUIDs for workflow IDs and task queues
4. **Clean Shutdown**: Always end workflows properly in tests
5. **Descriptive Names**: Use clear, descriptive test names
6. **Test Edge Cases**: Include error scenarios and validation failures
7. **Keep Tests Fast**: Use mocks to avoid slow external calls
8. **Isolate Tests**: Ensure tests don't depend on each other

## Troubleshooting

### Common Issues

1. **Workflow Timeout**: Increase timeouts or use the time-skipping environment
2. **Mock Not Working**: Check patch decorators and import paths
3. **Test Hanging**: Ensure workflows are properly ended with signals
4. **Environment Issues**: Check environment variable settings

### Getting Help

- Check the Temporal Python SDK documentation
- Review existing test patterns in the codebase
- Use `uv run pytest --collect-only` to verify test discovery
- Run with the `-v` flag for detailed output

## Legacy Tests

The `workflowtests/` directory contains legacy tests. New tests should be added to the main `tests/` directory following the patterns established in this guide; a hedged skeleton that ties those patterns together is sketched below.
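As a closing illustration, here is a minimal sketch of a new test combining the practices above: shared fixtures, a worker on a unique task queue, and a signal-driven clean shutdown. The workflow import path and worker wiring are assumptions and may differ from this repo's actual layout:

```python
# Hedged end-to-end skeleton; import path and worker wiring are assumptions.
import uuid

import pytest
from temporalio.worker import Worker

from workflows.agent_goal_workflow import AgentGoalWorkflow  # assumed path


@pytest.mark.asyncio
async def test_end_chat_returns_result(client, sample_combined_input):
    task_queue_name = str(uuid.uuid4())  # unique queue keeps tests isolated
    async with Worker(
        client,
        task_queue=task_queue_name,
        workflows=[AgentGoalWorkflow],
        activities=[],  # register mocked or real activities as needed
    ):
        handle = await client.start_workflow(
            AgentGoalWorkflow.run,
            sample_combined_input,
            id=str(uuid.uuid4()),
            task_queue=task_queue_name,
        )
        # End the chat so the workflow (and the test) can complete cleanly
        await handle.signal(AgentGoalWorkflow.end_chat)
        result = await handle.result()
        assert result is not None
```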