merged old agent goal in with keynote goal

2026-03-16 22:48:09 +01:00 · 2025-02-20 15:30:54 -08:00
parent ed069d9521
commit 08672d79e3
10 changed files with 500 additions and 103 deletions
--- a/README.md
+++ b/README.md
@@ -1,6 +1,8 @@
 # Temporal AI Agent

-This demo shows a multi-turn conversation with an AI agent running inside a Temporal workflow. The purpose of the agent is to collect information towards a goal, running tools along the way. There's a simple DSL input for collecting information (currently set up to use mock functions to search for public events, search for flights around those events, then create a test Stripe invoice for the trip). The AI will respond with clarifications and ask for any missing information to that goal. You can configure it to use [ChatGPT 4o](https://openai.com/index/hello-gpt-4o/), [Anthropic Claude](https://www.anthropic.com/claude), [Google Gemini](https://gemini.google.com), [Deepseek-V3](https://www.deepseek.com/) or a local LLM of your choice using [Ollama](https://ollama.com).
+This demo shows a multi-turn conversation with an AI agent running inside a Temporal workflow. The purpose of the agent is to collect information towards a goal, running tools along the way. There's a simple DSL input for collecting information (currently set up to use mock functions to search for public events, search for flights around those events, then create a test Stripe invoice for the trip).
+
+The AI will respond with clarifications and ask for any missing information to that goal. You can configure it to use [ChatGPT 4o](https://openai.com/index/hello-gpt-4o/), [Anthropic Claude](https://www.anthropic.com/claude), [Google Gemini](https://gemini.google.com), [Deepseek-V3](https://www.deepseek.com/) or a local LLM of your choice using [Ollama](https://ollama.com).

 [Watch the demo (5 minute YouTube video)](https://www.youtube.com/watch?v=GEXllEH2XiQ)

@@ -14,6 +16,43 @@ This application uses `.env` files for configuration. Copy the [.env.example](.e
 cp .env.example .env
 ```

+### Agent Goal Configuration
+
+The agent can be configured to pursue different goals using the `AGENT_GOAL` environment variable in your `.env` file.
+
+#### Goal: Find an event in APAC, book flights to it and invoice the user for the cost
+- `AGENT_GOAL=goal_event_flight_invoice` (default) - Helps users find events, book flights to them, and generate an invoice for the trip
+    - This is the scenario in the video above
+
+#### Goal: Find a Premier League match, book train tickets to it and invoice the user for the cost
+- `AGENT_GOAL=goal_match_train_invoice` - Focuses on Premier League match attendance with train booking and invoice generation
+    - This is a new goal that is part of an upcoming conference talk
+
+If not specified, the agent defaults to `goal_event_flight_invoice`. Each goal comes with its own set of tools and conversation flows designed for specific use cases. You can examine `tools/goal_registry.py` to see the detailed configuration of each goal.
+
+See the next section for tool configuration for each goal.
+
+### Tool Configuration
+
+#### Agent Goal: goal_event_flight_invoice (default)
+* The agent uses a mock function to search for events. This requires no configuration.
+* By default the agent uses a mock function to search for flights.
+    * If you want to use the real flights API, go to `tools/search_flights.py` and replace the `search_flights` function with `search_flights_real_api` that exists in the same file.
+    * It's free to sign up at [RapidAPI](https://rapidapi.com/apiheya/api/sky-scrapper)
+    * This API might be slow to respond, so you may want to increase the start-to-close timeout, `TOOL_ACTIVITY_START_TO_CLOSE_TIMEOUT`, in `workflows/workflow_helpers.py`
+* Requires a Stripe key for the `create_invoice` tool. Set this in the `STRIPE_API_KEY` environment variable in `.env`
+    * It's free to sign up and get a key at [Stripe](https://stripe.com/)
+    * If you're lazy, go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
+
+#### Agent Goal: goal_match_train_invoice
+
+* Finding a match requires a key from [Football Data](https://www.football-data.org). Sign up for a free account, then see the 'My Account' page to get your API token. Set `FOOTBALL_DATA_API_KEY` to this value.
+* We use a mock function to search for trains. Start the train API server to use the real API: `python thirdparty/train_api.py`
+    * The train activity is 'enterprise', so it's written in C# and requires a .NET runtime. See the [.NET backend](#net-enterprise-backend-) section for details on running it.
+* Requires a Stripe key for the `create_invoice` tool. Set this in the `STRIPE_API_KEY` environment variable in `.env`
+    * It's free to sign up and get a key at [Stripe](https://stripe.com/)
+    * If you're lazy, go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
+
 ### LLM Provider Configuration

 The agent can use OpenAI's GPT-4o, Google Gemini, Anthropic Claude, or a local LLM via Ollama. Set the `LLM_PROVIDER` environment variable in your `.env` file to choose the desired provider:
@@ -35,7 +74,9 @@ To use Google Gemini:
 1. Obtain a Google API key and set it in the `GOOGLE_API_KEY` environment variable in `.env`.
 2. Set `LLM_PROVIDER=google` in your `.env` file.

-### Option 3: Anthropic Claude
+### Option 3: Anthropic Claude (recommended)
+
+I find that Claude 3.5 Sonnet performs better than the other hosted LLMs for this use case.

 To use Anthropic:

@@ -61,15 +102,6 @@ To use a local LLM with Ollama:

 Note: I found the other (hosted) LLMs to be MUCH more reliable for this use case. However, you can switch to Ollama if desired, and choose a suitably large model if your computer has the resources.

-## Agent Tools
-* Requires a Rapidapi key for sky-scrapper (how we find flights). Set this in the `RAPIDAPI_KEY` environment variable in .env
-    * It's free to sign up and get a key at [RapidAPI](https://rapidapi.com/apiheya/api/sky-scrapper)
-    * If you're lazy go to `tools/search_flights.py` and replace the `get_flights` function with the mock `search_flights_example` that exists in the same file.
-* Requires a Stripe key for the `create_invoice` tool. Set this in the `STRIPE_API_KEY` environment variable in .env
-    * It's free to sign up and get a key at [Stripe](https://stripe.com/)
-    * If you're lazy go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
-* Requires a key from [Football Data](https://www.football-data.org). Sign up for a free account, then see the 'My Account' page to get your API token. Set `FOOTBALL_DATA_API_KEY` to this value.
-
 ## Configuring Temporal Connection

 By default, this application will connect to a local Temporal server (`localhost:7233`) in the default namespace, using the `agent-task-queue` task queue. You can override these settings in your `.env` file.
@@ -115,24 +147,6 @@ poetry run uvicorn api.main:app --reload
 ```
 Access the API at `/docs` to see the available endpoints.

-### Python Search Trains API
-Required to search and book trains!
-```bash
-poetry run python thirdparty/train_api.py
-
-# example url
-# http://localhost:8080/api/search?from=london&to=liverpool&outbound_time=2025-04-18T09:00:00&inbound_time=2025-04-20T09:00:00
-```
-
-### .NET (enterprise) Backend ;)
-We have activities written in C# to call the train APIs.
-```bash
-cd enterprise
-dotnet build # ensure you brew install dotnet@8 first!
-dotnet run
-```
-If you're running your train API above on a different host/port then change the API URL in `Program.cs`.
-
 ### React UI
 Start the frontend:
 ```bash
@@ -142,29 +156,36 @@ npx vite
 ```
 Access the UI at `http://localhost:5173`

+### Python Search Trains API
+> Agent Goal: goal_match_train_invoice only
+
+Required to search and book trains!
+```bash
+poetry run python thirdparty/train_api.py
+
+# example url
+# http://localhost:8080/api/search?from=london&to=liverpool&outbound_time=2025-04-18T09:00:00&inbound_time=2025-04-20T09:00:00
+```
+
+### .NET (enterprise) Backend ;)
+> Agent Goal: goal_match_train_invoice only
+
+We have activities written in C# to call the train APIs.
+```bash
+cd enterprise
+dotnet build # ensure you brew install dotnet@8 first!
+dotnet run
+```
+The backend expects the train API above to be running (`python thirdparty/train_api.py`). If that API is on a different host/port, change the API URL in `Program.cs`.
+
 ## Customizing the Agent
 - `tool_registry.py` contains the mapping of tool names to tool definitions (so the AI understands how to use them)
 - `goal_registry.py` contains descriptions of goals and the tools used to achieve them
 - The tools themselves are defined in their own files in `/tools`
 - Note the mapping in `tools/__init__.py` to each tool
- See main.py where some tool-specific logic is defined (todo, move this to the tool definition)

 ## TODO
- I should prove this out with other tool definitions outside of the event/flight search case (take advantage of my nice DSL).
- Currently hardcoded to the Temporal dev server at localhost:7233. Need to support options incl Temporal Cloud.
 - In a prod setting, I would need to ensure that payload data is stored separately (e.g. in S3 or a noSQL db - the claim-check pattern), or otherwise 'garbage collected'. Without these techniques, long conversations will fill up the workflow's conversation history, and start to breach Temporal event history payload limits.
 - Continue-as-new shouldn't be a big consideration for this use case (as it would take many conversational turns to trigger). Regardless, I should ensure that it's able to carry the agent state over to the new workflow execution.
 - Perhaps the UI should show when the LLM response is being retried (i.e. activity retry attempt because the LLM provided bad output)
- Tests would be nice!
-
-# TODO for this branch
-## Agent
- We'll have to figure out which matches are where. No use going to Manchester for a match that isn't there.
- The use of `###` in prompts I want excluded from the conversation history is a bit of a hack.
-
-## UI
- Possibly need a 'worker down' type of message? I think I already have one when queries fail
-
-## Validator function
- Probably keep data types, but move the activity and workflow code for the demo
- Probably don't need the validator function if its the result from a tool call or confirmation step
+- Tests would be nice!
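
For illustration, the `AGENT_GOAL` selection documented in the README changes above could be resolved along these lines. This is a minimal sketch, not the repository's code: `resolve_agent_goal` and `KNOWN_GOALS` are hypothetical names, rejecting unknown values is an assumed behavior, and the real lookup in `tools/goal_registry.py` may work differently. Only the two goal names and the default come from the README.

```python
import os

# Goal names as listed in the README; the tuple itself is a hypothetical helper.
KNOWN_GOALS = ("goal_event_flight_invoice", "goal_match_train_invoice")
DEFAULT_GOAL = "goal_event_flight_invoice"


def resolve_agent_goal(env=None):
    """Return the goal named by AGENT_GOAL, or the default when unset or blank."""
    env = os.environ if env is None else env
    name = (env.get("AGENT_GOAL") or "").strip() or DEFAULT_GOAL
    if name not in KNOWN_GOALS:
        # Assumed behavior: fail fast rather than silently run the wrong goal.
        raise ValueError(
            f"Unknown AGENT_GOAL {name!r}; expected one of {KNOWN_GOALS}"
        )
    return name


if __name__ == "__main__":
    print(resolve_agent_goal())
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise without touching the real environment.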