Temporal tests (#40)

* temporal tests * codex setup env script to readme
2026-03-16 14:38:08 +01:00 · 2025-05-29 12:56:58 -07:00
parent f7ef2b1c7e
commit e35181b5ad
9 changed files with 1832 additions and 7 deletions
--- a/tests/README.md
+++ b/tests/README.md
@@ -0,0 +1,350 @@
+# Temporal AI Agent - Testing Guide
+
+This directory contains comprehensive tests for the Temporal AI Agent project. The tests cover workflows, activities, and integration scenarios using Temporal's testing framework.
+
+## Test Structure
+
+```
+tests/
+├── README.md                      # This file - testing documentation
+├── conftest.py                    # Test configuration and fixtures
+├── test_agent_goal_workflow.py    # Workflow tests
+├── test_tool_activities.py        # Activity tests
+└── workflowtests/                 # Legacy workflow tests
+    └── agent_goal_workflow_test.py
+```
+
+## Test Types
+
+### 1. Workflow Tests (`test_agent_goal_workflow.py`)
+
+Tests the main `AgentGoalWorkflow` class covering:
+
+- **Workflow Initialization**: Basic workflow startup and state management
+- **Signal Handling**: Testing user_prompt, confirm, end_chat signals
+- **Query Methods**: Testing all workflow query endpoints
+- **State Management**: Conversation history, goal changes, tool data
+- **Validation Flow**: Prompt validation and error handling
+- **Tool Execution Flow**: Confirmation and tool execution cycles
+
+### 2. Activity Tests (`test_tool_activities.py`)
+
+Tests the `ToolActivities` class and `dynamic_tool_activity` function:
+
+- **LLM Integration**: Testing agent_toolPlanner with mocked LLM responses
+- **Validation Logic**: Testing agent_validatePrompt with various scenarios
+- **Environment Configuration**: Testing get_wf_env_vars with different env setups
+- **JSON Processing**: Testing response parsing and sanitization
+- **Dynamic Tool Execution**: Testing the dynamic activity dispatcher
+- **Integration**: End-to-end activity execution in Temporal workers
+
+### 3. Configuration Tests (`conftest.py`)
+
+Provides shared test fixtures and configuration:
+
+- **Temporal Environment**: Local and time-skipping test environments
+- **Sample Data**: Pre-configured agent goals, conversation history, inputs
+- **Test Client**: Configured Temporal client for testing
+
+## Running Tests
+
+### Prerequisites
+
+Ensure you have the required dependencies installed:
+
+```bash
+poetry install --with dev
+```
+
+### Basic Test Execution
+
+Run all tests:
+```bash
+poetry run pytest
+```
+
+Run specific test files:
+```bash
+# Workflow tests only
+poetry run pytest tests/test_agent_goal_workflow.py
+
+# Activity tests only
+poetry run pytest tests/test_tool_activities.py
+
+# Legacy tests
+poetry run pytest tests/workflowtests/
+```
+
+Run with verbose output:
+```bash
+poetry run pytest -v
+```
+
+### Test Environment Options
+
+The tests support different Temporal environments via the `--workflow-environment` flag:
+
+#### Local Environment (Default)
+Uses a local Temporal test server:
+```bash
+poetry run pytest --workflow-environment=local
+```
+
+#### Time-Skipping Environment
+Uses Temporal's time-skipping test environment for faster execution:
+```bash
+poetry run pytest --workflow-environment=time-skipping
+```
+
+#### External Server
+Connect to an existing Temporal server:
+```bash
+poetry run pytest --workflow-environment=localhost:7233
+```
+
+#### Setup Script for AI Agent environments such as OpenAI Codex
+```bash
+export SHELL=/bin/bash
+curl -sSL https://install.python-poetry.org | python3 -
+export PATH="$HOME/.local/bin:$PATH"
+ls
+poetry install --with dev
+cd frontend
+npm install
+cd ..
+
+# Pre-download the temporal test server binary
+poetry run python3 -c "
+import asyncio
+import sys
+from temporalio.testing import WorkflowEnvironment
+
+async def predownload():
+    try:
+        print('Starting test server download...')
+        env = await WorkflowEnvironment.start_time_skipping()
+        print('Test server downloaded and started successfully')
+        await env.shutdown()
+        print('Test server shut down successfully')
+    except Exception as e:
+        print(f'Error during download: {e}')
+        sys.exit(1)
+
+asyncio.run(predownload())
+"
+```
+
+### Filtering Tests
+
+Run tests by pattern:
+```bash
+# Run only validation tests
+poetry run pytest -k "validation"
+
+# Run only workflow tests
+poetry run pytest -k "workflow"
+
+# Run only activity tests
+poetry run pytest -k "activity"
+```
+
+Run tests by marker (if you add custom markers):
+```bash
+# Run only integration tests
+poetry run pytest -m integration
+
+# Skip slow tests
+poetry run pytest -m "not slow"
+```
+
+## Test Configuration
+
+### Test Discovery
+
+The `vibe/` directory is excluded from test collection to avoid conflicts with sample tests. This is configured in `pyproject.toml`:
+
+```toml
+[tool.pytest.ini_options]
+norecursedirs = ["vibe"]
+```
+
+### Environment Variables
+
+Tests respect the following environment variables:
+
+- `LLM_MODEL`: Model to use for LLM testing (defaults to "openai/gpt-4")
+- `LLM_KEY`: API key for LLM service
+- `LLM_BASE_URL`: Custom base URL for LLM service
+- `SHOW_CONFIRM`: Whether to show confirmation dialogs
+- `AGENT_GOAL`: Default agent goal setting
+
+### Mocking Strategy
+
+The tests use extensive mocking to avoid external dependencies:
+
+- **LLM Calls**: Mocked using `unittest.mock` to avoid actual API calls
+- **Tool Handlers**: Mocked to test workflow logic without tool execution
+- **Environment Variables**: Patched for consistent test environments
+
+## Writing New Tests
+
+### Test Naming Convention
+
+- Test files: `test_<module_name>.py`
+- Test classes: `Test<ClassName>`
+- Test methods: `test_<functionality>_<scenario>`
+
+Example:
+```python
+class TestAgentGoalWorkflow:
+    async def test_user_prompt_signal_valid_input(self, client, sample_combined_input):
+        # Test implementation
+        pass
+```
+
+### Using Fixtures
+
+Leverage the provided fixtures for consistent test data:
+
+```python
+async def test_my_workflow(self, client, sample_agent_goal, sample_conversation_history):
+    # client: Temporal test client
+    # sample_agent_goal: Pre-configured AgentGoal
+    # sample_conversation_history: Sample conversation data
+    pass
+```
+
+### Mocking External Dependencies
+
+Always mock external services:
+
+```python
+@patch('activities.tool_activities.completion')
+async def test_llm_integration(self, mock_completion):
+    mock_completion.return_value.choices[0].message.content = '{"test": "response"}'
+    # Test implementation
+```
+
+### Testing Workflow Signals and Queries
+
+```python
+async def test_workflow_signal(self, client, sample_combined_input):
+    # Start workflow
+    handle = await client.start_workflow(
+        AgentGoalWorkflow.run,
+        sample_combined_input,
+        id=str(uuid.uuid4()),
+        task_queue=task_queue_name,
+    )
+    
+    # Send signal
+    await handle.signal(AgentGoalWorkflow.user_prompt, "test message")
+    
+    # Query state
+    conversation = await handle.query(AgentGoalWorkflow.get_conversation_history)
+    
+    # End workflow
+    await handle.signal(AgentGoalWorkflow.end_chat)
+    result = await handle.result()
+```
+
+## Test Data and Fixtures
+
+### Sample Agent Goal
+
+The `sample_agent_goal` fixture provides a basic agent goal with:
+- Goal ID: "test_goal"
+- One test tool with a required string argument
+- Suitable for most workflow testing scenarios
+
+### Sample Conversation History
+
+The `sample_conversation_history` fixture provides:
+- Basic user and agent message exchange
+- Proper message format for testing
+
+### Sample Combined Input
+
+The `sample_combined_input` fixture provides:
+- Complete workflow input with agent goal and tool params
+- Conversation summary and prompt queue
+- Ready for workflow execution
+
+## Debugging Tests
+
+### Verbose Logging
+
+Enable detailed logging:
+```bash
+poetry run pytest --log-cli-level=DEBUG -s
+```
+
+### Temporal Web UI
+
+When using local environment, access Temporal Web UI at http://localhost:8233 to inspect workflow executions during tests.
+
+### Test Isolation
+
+Each test uses unique task queue names to prevent interference:
+```python
+task_queue_name = str(uuid.uuid4())
+```
+
+## Continuous Integration
+
+### GitHub Actions Example
+
+```yaml
+name: Test
+on: [push, pull_request]
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - uses: actions/setup-python@v4
+        with:
+          python-version: '3.10'
+      - run: pip install poetry
+      - run: poetry install --with dev
+      - run: poetry run pytest --workflow-environment=time-skipping
+```
+
+### Test Coverage
+
+Generate coverage reports:
+```bash
+poetry add --group dev pytest-cov
+poetry run pytest --cov=workflows --cov=activities --cov-report=html
+```
+
+## Best Practices
+
+1. **Mock External Dependencies**: Always mock LLM calls, file I/O, and network requests
+2. **Use Time-Skipping**: For CI/CD, prefer time-skipping environment for speed
+3. **Unique Identifiers**: Use UUIDs for workflow IDs and task queues
+4. **Clean Shutdown**: Always end workflows properly in tests
+5. **Descriptive Names**: Use clear, descriptive test names
+6. **Test Edge Cases**: Include error scenarios and validation failures
+7. **Keep Tests Fast**: Use mocks to avoid slow external calls
+8. **Isolate Tests**: Ensure tests don't depend on each other
+
+## Troubleshooting
+
+### Common Issues
+
+1. **Workflow Timeout**: Increase timeouts or use time-skipping environment
+2. **Mock Not Working**: Check patch decorators and import paths
+3. **Test Hanging**: Ensure workflows are properly ended with signals
+4. **Environment Issues**: Check environment variable settings
+
+### Getting Help
+
+- Check Temporal Python SDK documentation
+- Review existing test patterns in the codebase
+- Use `poetry run pytest --collect-only` to verify test discovery
+- Run with `-v` flag for detailed output
+
+## Legacy Tests
+
+The `workflowtests/` directory contains legacy tests. New tests should be added to the main `tests/` directory following the patterns established in this guide.