diff --git a/README.md b/README.md
index fd070f2..236f468 100644
--- a/README.md
+++ b/README.md
@@ -2,191 +2,29 @@
 This demo shows a multi-turn conversation with an AI agent running inside a Temporal workflow. The purpose of the agent is to collect information towards a goal, running tools along the way. There's a simple DSL input for collecting information (currently set up to use mock functions to search for public events, search for flights around those events, then create a test Stripe invoice for the trip).
-The AI will respond with clarifications and ask for any missing information to that goal. You can configure it to use [ChatGPT 4o](https://openai.com/index/hello-gpt-4o/), [Anthropic Claude](https://www.anthropic.com/claude), [Google Gemini](https://gemini.google.com), [Deepseek-V3](https://www.deepseek.com/) or a local LLM of your choice using [Ollama](https://ollama.com).
+The AI will respond with clarifications and ask for any missing information to that goal. You can configure it to use [ChatGPT 4o](https://openai.com/index/hello-gpt-4o/), [Anthropic Claude](https://www.anthropic.com/claude), [Google Gemini](https://gemini.google.com), [Deepseek-V3](https://www.deepseek.com/), [Grok](https://docs.x.ai/docs/overview), or a local LLM of your choice using [Ollama](https://ollama.com).
-[Watch the demo (5 minute YouTube video)](https://www.youtube.com/watch?v=GEXllEH2XiQ)
+It's really helpful to [watch the demo (5-minute YouTube video)](https://www.youtube.com/watch?v=GEXllEH2XiQ) to understand how the interaction works.
-[![Watch the demo](./agent-youtube-screenshot.jpeg)](https://www.youtube.com/watch?v=GEXllEH2XiQ)
+[![Watch the demo](./assets/agent-youtube-screenshot.jpeg)](https://www.youtube.com/watch?v=GEXllEH2XiQ)
-## Configuration
+## Setup and Configuration
+See [the Setup guide](./setup.md).
-This application uses `.env` files for configuration.
Copy the [.env.example](.env.example) file to `.env` and update the values:
+## Interaction
+TODO
-```bash
-cp .env.example .env
-```
+## Architecture
+See [the architecture guide](./architecture.md).
-### Agent Goal Configuration
-
-The agent can be configured to pursue different goals using the `AGENT_GOAL` environment variable in your `.env` file.
-
-#### Goal: Find an event in Australia / New Zealand, book flights to it and invoice the user for the cost
-- `AGENT_GOAL=goal_event_flight_invoice` (default) - Helps users find events, book flights, and arrange train travel with invoice generation
-  - This is the scenario in the video above
-
-#### Goal: Find a Premier League match, book train tickets to it and invoice the user for the cost
-- `AGENT_GOAL=goal_match_train_invoice` - Focuses on Premier League match attendance with train booking and invoice generation
-  - This is a new goal that is part of an upcoming conference talk
-
-If not specified, the agent defaults to `goal_event_flight_invoice`. Each goal comes with its own set of tools and conversation flows designed for specific use cases. You can examine `tools/goal_registry.py` to see the detailed configuration of each goal.
-
-See the next section for tool configuration for each goal.
-
-### Tool Configuration
-
-#### Agent Goal: goal_event_flight_invoice (default)
-* The agent uses a mock function to search for events. This has zero configuration.
-* By default the agent uses a mock function to search for flights.
-  * If you want to use the real flights API, go to `tools/search_flights.py` and replace the `search_flights` function with `search_flights_real_api` that exists in the same file.
-  * It's free to sign up at [RapidAPI](https://rapidapi.com/apiheya/api/sky-scrapper)
-  * This api might be slow to respond, so you may want to increase the start to close timeout, `TOOL_ACTIVITY_START_TO_CLOSE_TIMEOUT` in `workflows/workflow_helpers.py`
-* Requires a Stripe key for the `create_invoice` tool.
Set this in the `STRIPE_API_KEY` environment variable in .env
-  * It's free to sign up and get a key at [Stripe](https://stripe.com/)
-  * If you're lazy go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
-
-#### Agent Goal: goal_match_train_invoice
-
-* Finding a match requires a key from [Football Data](https://www.football-data.org). Sign up for a free account, then see the 'My Account' page to get your API token. Set `FOOTBALL_DATA_API_KEY` to this value.
-  * If you're lazy go to `tools/search_fixtures.py` and replace the `search_fixtures` function with the mock `search_fixtures_example` that exists in the same file.
-* We use a mock function to search for trains. Start the train API server to use the real API: `python thirdparty/train_api.py`
-* * The train activity is 'enterprise' so it's written in C# and requires a .NET runtime. See the [.NET backend](#net-(enterprise)-backend) section for details on running it.
-* Requires a Stripe key for the `create_invoice` tool. Set this in the `STRIPE_API_KEY` environment variable in .env
-  * It's free to sign up and get a key at [Stripe](https://stripe.com/)
-  * If you're lazy go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
-
-### LLM Provider Configuration
-
-The agent can use OpenAI's GPT-4o, Google Gemini, Anthropic Claude, or a local LLM via Ollama. Set the `LLM_PROVIDER` environment variable in your `.env` file to choose the desired provider:
-
-- `LLM_PROVIDER=openai` for OpenAI's GPT-4o
-- `LLM_PROVIDER=google` for Google Gemini
-- `LLM_PROVIDER=anthropic` for Anthropic Claude
-- `LLM_PROVIDER=deepseek` for DeepSeek-V3
-- `LLM_PROVIDER=ollama` for running LLMs via [Ollama](https://ollama.ai) (not recommended for this use case)
-
-### Option 1: OpenAI
-
-If using OpenAI, ensure you have an OpenAI key for the GPT-4o model.
Set this in the `OPENAI_API_KEY` environment variable in `.env`.
-
-### Option 2: Google Gemini
-
-To use Google Gemini:
-
-1. Obtain a Google API key and set it in the `GOOGLE_API_KEY` environment variable in `.env`.
-2. Set `LLM_PROVIDER=google` in your `.env` file.
-
-### Option 3: Anthropic Claude (recommended)
-
-I find that Claude Sonnet 3.5 performs better than the other hosted LLMs for this use case.
-
-To use Anthropic:
-
-1. Obtain an Anthropic API key and set it in the `ANTHROPIC_API_KEY` environment variable in `.env`.
-2. Set `LLM_PROVIDER=anthropic` in your `.env` file.
-
-### Option 4: Deepseek-V3
-
-To use Deepseek-V3:
-
-1. Obtain a Deepseek API key and set it in the `DEEPSEEK_API_KEY` environment variable in `.env`.
-2. Set `LLM_PROVIDER=deepseek` in your `.env` file.
-
-### Option 5: Local LLM via Ollama (not recommended)
-
-To use a local LLM with Ollama:
-
-1. Install [Ollama](https://ollama.com) and the [Qwen2.5 14B](https://ollama.com/library/qwen2.5) model.
-   - Run `ollama run ` to start the model. Note that this model is about 9GB to download.
-   - Example: `ollama run qwen2.5:14b`
-
-2. Set `LLM_PROVIDER=ollama` in your `.env` file and `OLLAMA_MODEL_NAME` to the name of the model you installed.
-
-Note: I found the other (hosted) LLMs to be MUCH more reliable for this use case. However, you can switch to Ollama if desired, and choose a suitably large model if your computer has the resources.
-
-## Configuring Temporal Connection
-
-By default, this application will connect to a local Temporal server (`localhost:7233`) in the default namespace, using the `agent-task-queue` task queue. You can override these settings in your `.env` file.
-
-### Use Temporal Cloud
-
-See [.env.example](.env.example) for details on connecting to Temporal Cloud using mTLS or API key authentication.
-
-[Sign up for Temporal Cloud](https://temporal.io/get-cloud)
-
-### Use a local Temporal Dev Server
-
-On a Mac
-```bash
-brew install temporal
-temporal server start-dev
-```
-See the [Temporal documentation](https://learn.temporal.io/getting_started/python/dev_environment/) for other platforms.
-
-
-## Running the Application
-
-### Python Backend
-
-Requires [Poetry](https://python-poetry.org/) to manage dependencies.
-
-1. `python -m venv venv`
-
-2. `source venv/bin/activate`
-
-3. `poetry install`
-
-Run the following commands in separate terminal windows:
-
-1. Start the Temporal worker:
-```bash
-poetry run python scripts/run_worker.py
-```
-
-2. Start the API server:
-```bash
-poetry run uvicorn api.main:app --reload
-```
-Access the API at `/docs` to see the available endpoints.
-
-### React UI
-Start the frontend:
-```bash
-cd frontend
-npm install
-npx vite
-```
-Access the UI at `http://localhost:5173`
-
-### Python Search Trains API
-> Agent Goal: goal_match_train_invoice only
-
-Required to search and book trains!
-```bash
-poetry run python thirdparty/train_api.py
-
-# example url
-# http://localhost:8080/api/search?from=london&to=liverpool&outbound_time=2025-04-18T09:00:00&inbound_time=2025-04-20T09:00:00
-```
-
-### .NET (enterprise) Backend ;)
-> Agent Goal: goal_match_train_invoice only
-
-We have activities written in C# to call the train APIs.
-```bash
-cd enterprise
-dotnet build # ensure you brew install dotnet@8 first!
-dotnet run
-```
-If you're running your train API above on a different host/port then change the API URL in `Program.cs`. Otherwise, be sure to run it using `python thirdparty/train_api.py`.
-
-## Customizing the Agent
-- `tool_registry.py` contains the mapping of tool names to tool definitions (so the AI understands how to use them)
-- `goal_registry.py` contains descriptions of goals and the tools used to achieve them
-- The tools themselves are defined in their own files in `/tools`
-- Note the mapping in `tools/__init__.py` to each tool
-
-## TODO
+## Productionalization & Adding Features
 - In a prod setting, I would need to ensure that payload data is stored separately (e.g. in S3 or a noSQL db - the claim-check pattern), or otherwise 'garbage collected'. Without these techniques, long conversations will fill up the workflow's conversation history, and start to breach Temporal event history payload limits.
 - Continue-as-new shouldn't be a big consideration for this use case (as it would take many conversational turns to trigger). Regardless, I should ensure that it's able to carry the agent state over to the new workflow execution.
 - Perhaps the UI should show when the LLM response is being retried (i.e. activity retry attempt because the LLM provided bad output)
-- Tests would be nice!
\ No newline at end of file
+- Tests would be nice!
+See [the todo](./todo.md) for more details.
+
+See Customization for more details. <-- TODO
+
+## For Temporal SAs
+Check out the [slides](https://docs.google.com/presentation/d/1wUFY4v17vrtv8llreKEBDPLRtZte3FixxBUn0uWy5NU/edit#slide=id.g3333e5deaa9_0_0) here and the enablement guide here (TODO).
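The claim-check note in the productionalization list above is straightforward to sketch. The following is a minimal illustration only, with assumed names (a plain dict stands in for S3 or a NoSQL table; none of this is the repo's actual code):

```python
import hashlib
import json

# Stand-in for an external blob store (S3, DynamoDB, ...) -- an assumption
# for illustration, not part of this repo.
_blob_store: dict[str, str] = {}

def check_in(payload: dict) -> str:
    """Store a large payload externally; return only a small claim-check key."""
    body = json.dumps(payload, sort_keys=True)
    key = hashlib.sha256(body.encode()).hexdigest()
    _blob_store[key] = body
    return key  # only this fixed-size key would enter the workflow's event history

def check_out(key: str) -> dict:
    """Redeem the claim-check key for the original payload."""
    return json.loads(_blob_store[key])
```

In a real setup, an activity would `check_in` a large tool result before returning it, and the consumer would `check_out` the key, keeping Temporal event history payloads well under the size limits mentioned above.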
diff --git a/architecture.md b/architecture.md
new file mode 100644
index 0000000..f1a5d7b
--- /dev/null
+++ b/architecture.md
@@ -0,0 +1,12 @@
+# Elements
+![Architecture Elements](./assets/Architecture_elements.png "Architecture Elements")
+
+talk through the pieces
+
+# Architecture Model
+![Architecture](./assets/ai_agent_architecture_model.png "Architecture Model")
+
+explain elements
+
+# Adding features
+link to how to LLM interactions/how to change
\ No newline at end of file
diff --git a/assets/Architecture_elements.png b/assets/Architecture_elements.png
new file mode 100644
index 0000000..a1f7b61
Binary files /dev/null and b/assets/Architecture_elements.png differ
diff --git a/agent-youtube-screenshot.jpeg b/assets/agent-youtube-screenshot.jpeg
similarity index 100%
rename from agent-youtube-screenshot.jpeg
rename to assets/agent-youtube-screenshot.jpeg
diff --git a/assets/ai_agent_architecture_model.png b/assets/ai_agent_architecture_model.png
new file mode 100644
index 0000000..e38f19b
Binary files /dev/null and b/assets/ai_agent_architecture_model.png differ
diff --git a/setup.md b/setup.md
new file mode 100644
index 0000000..5f20618
--- /dev/null
+++ b/setup.md
@@ -0,0 +1,176 @@
+## Configuration
+
+This application uses `.env` files for configuration. Copy the [.env.example](.env.example) file to `.env` and update the values:
+
+```bash
+cp .env.example .env
+```
+
+### Agent Goal Configuration
+
+The agent can be configured to pursue different goals using the `AGENT_GOAL` environment variable in your `.env` file.
+
+#### Goal: Find an event in Australia / New Zealand, book flights to it and invoice the user for the cost
+- `AGENT_GOAL=goal_event_flight_invoice` (default) - Helps users find events, book flights, and arrange train travel with invoice generation
+  - This is the scenario in the video above
+
+#### Goal: Find a Premier League match, book train tickets to it and invoice the user for the cost
+- `AGENT_GOAL=goal_match_train_invoice` - Focuses on Premier League match attendance with train booking and invoice generation
+  - This is a new goal that is part of an upcoming conference talk
+
+If not specified, the agent defaults to `goal_event_flight_invoice`. Each goal comes with its own set of tools and conversation flows designed for specific use cases. You can examine `tools/goal_registry.py` to see the detailed configuration of each goal.
+
+See the next section for tool configuration for each goal.
+
+### Tool Configuration
+
+#### Agent Goal: goal_event_flight_invoice (default)
+* The agent uses a mock function to search for events. This requires zero configuration.
+* By default the agent uses a mock function to search for flights.
+  * If you want to use the real flights API, go to `tools/search_flights.py` and replace the `search_flights` function with `search_flights_real_api`, which exists in the same file.
+  * It's free to sign up at [RapidAPI](https://rapidapi.com/apiheya/api/sky-scrapper).
+  * This API might be slow to respond, so you may want to increase the start-to-close timeout, `TOOL_ACTIVITY_START_TO_CLOSE_TIMEOUT`, in `workflows/workflow_helpers.py`.
+* Requires a Stripe key for the `create_invoice` tool. Set this in the `STRIPE_API_KEY` environment variable in `.env`.
+  * It's free to sign up and get a key at [Stripe](https://stripe.com/).
+  * If you're lazy, go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
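The "replace the function" instructions above can be done without touching any call sites, by rebinding the module-level name. A sketch of the idea — the function bodies here are stand-ins for illustration, not the repo's real implementations:

```python
# Sketch of swapping a mock tool for a real one by rebinding the name.
def search_flights(origin: str, destination: str) -> list[dict]:
    # mock search: canned data, zero configuration
    return [{"carrier": "MockAir", "from": origin, "to": destination}]

def search_flights_real_api(origin: str, destination: str) -> list[dict]:
    # the real version would call the RapidAPI Sky Scrapper endpoint here
    return [{"carrier": "RealAir", "from": origin, "to": destination}]

# The swap itself: rebind the name the rest of the app imports,
# so nothing else needs to change.
search_flights = search_flights_real_api
```

The same rebinding trick applies to the `create_invoice` / `create_invoice_example` swap described above.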
+
+#### Agent Goal: goal_match_train_invoice
+
+* Finding a match requires a key from [Football Data](https://www.football-data.org). Sign up for a free account, then see the 'My Account' page to get your API token. Set `FOOTBALL_DATA_API_KEY` to this value.
+  * If you're lazy, go to `tools/search_fixtures.py` and replace the `search_fixtures` function with the mock `search_fixtures_example` that exists in the same file.
+* We use a mock function to search for trains. Start the train API server to use the real API: `python thirdparty/train_api.py`
+  * The train activity is 'enterprise', so it's written in C# and requires a .NET runtime. See the [.NET backend](#net-(enterprise)-backend) section for details on running it.
+* Requires a Stripe key for the `create_invoice` tool. Set this in the `STRIPE_API_KEY` environment variable in `.env`.
+  * It's free to sign up and get a key at [Stripe](https://stripe.com/).
+  * If you're lazy, go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
+
+### LLM Provider Configuration
+
+The agent can use OpenAI's GPT-4o, Google Gemini, Anthropic Claude, DeepSeek-V3, or a local LLM via Ollama. Set the `LLM_PROVIDER` environment variable in your `.env` file to choose the desired provider:
+
+- `LLM_PROVIDER=openai` for OpenAI's GPT-4o
+- `LLM_PROVIDER=google` for Google Gemini
+- `LLM_PROVIDER=anthropic` for Anthropic Claude
+- `LLM_PROVIDER=deepseek` for DeepSeek-V3
+- `LLM_PROVIDER=ollama` for running LLMs via [Ollama](https://ollama.ai) (not recommended for this use case)
+
+### Option 1: OpenAI
+
+If using OpenAI, ensure you have an OpenAI key for the GPT-4o model. Set this in the `OPENAI_API_KEY` environment variable in `.env`.
+
+### Option 2: Google Gemini
+
+To use Google Gemini:
+
+1. Obtain a Google API key and set it in the `GOOGLE_API_KEY` environment variable in `.env`.
+2. Set `LLM_PROVIDER=google` in your `.env` file.
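As a worked example of the settings above, a minimal `.env` for the Google Gemini provider could look like the following (the key value is a placeholder, and only variables named in this guide are used):

```bash
LLM_PROVIDER=google
GOOGLE_API_KEY=your-google-api-key-here
AGENT_GOAL=goal_event_flight_invoice
```

The same shape applies to the other providers — swap `LLM_PROVIDER` and the matching `*_API_KEY` variable.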
+
+### Option 3: Anthropic Claude (recommended)
+
+I find that Claude Sonnet 3.5 performs better than the other hosted LLMs for this use case.
+
+To use Anthropic:
+
+1. Obtain an Anthropic API key and set it in the `ANTHROPIC_API_KEY` environment variable in `.env`.
+2. Set `LLM_PROVIDER=anthropic` in your `.env` file.
+
+### Option 4: DeepSeek-V3
+
+To use DeepSeek-V3:
+
+1. Obtain a DeepSeek API key and set it in the `DEEPSEEK_API_KEY` environment variable in `.env`.
+2. Set `LLM_PROVIDER=deepseek` in your `.env` file.
+
+### Option 5: Local LLM via Ollama (not recommended)
+
+To use a local LLM with Ollama:
+
+1. Install [Ollama](https://ollama.com) and the [Qwen2.5 14B](https://ollama.com/library/qwen2.5) model.
+   - Run `ollama run <model-name>` to start the model. Note that this model is about 9GB to download.
+   - Example: `ollama run qwen2.5:14b`
+
+2. Set `LLM_PROVIDER=ollama` in your `.env` file and `OLLAMA_MODEL_NAME` to the name of the model you installed.
+
+Note: I found the other (hosted) LLMs to be MUCH more reliable for this use case. However, you can switch to Ollama if desired, and choose a suitably large model if your computer has the resources.
+
+## Configuring Temporal Connection
+
+By default, this application will connect to a local Temporal server (`localhost:7233`) in the default namespace, using the `agent-task-queue` task queue. You can override these settings in your `.env` file.
+
+### Use Temporal Cloud
+
+See [.env.example](.env.example) for details on connecting to Temporal Cloud using mTLS or API key authentication.
+
+[Sign up for Temporal Cloud](https://temporal.io/get-cloud)
+
+### Use a local Temporal Dev Server
+
+On a Mac:
+```bash
+brew install temporal
+temporal server start-dev
+```
+See the [Temporal documentation](https://learn.temporal.io/getting_started/python/dev_environment/) for other platforms.
+
+
+## Running the Application
+
+### Python Backend
+
+Requires [Poetry](https://python-poetry.org/) to manage dependencies.
+
+1.
`python -m venv venv`
+
+2. `source venv/bin/activate`
+
+3. `poetry install`
+
+Run the following commands in separate terminal windows:
+
+1. Start the Temporal worker:
+```bash
+poetry run python scripts/run_worker.py
+```
+
+2. Start the API server:
+```bash
+poetry run uvicorn api.main:app --reload
+```
+Access the API at `/docs` to see the available endpoints.
+
+### React UI
+Start the frontend:
+```bash
+cd frontend
+npm install
+npx vite
+```
+Access the UI at `http://localhost:5173`
+
+### Python Search Trains API
+> Agent Goal: goal_match_train_invoice only
+
+Required to search and book trains!
+```bash
+poetry run python thirdparty/train_api.py
+
+# example url
+# http://localhost:8080/api/search?from=london&to=liverpool&outbound_time=2025-04-18T09:00:00&inbound_time=2025-04-20T09:00:00
+```
+
+### .NET (enterprise) Backend ;)
+> Agent Goal: goal_match_train_invoice only
+
+We have activities written in C# to call the train APIs.
+```bash
+cd enterprise
+dotnet build # ensure you brew install dotnet@8 first!
+dotnet run
+```
+If you're running your train API above on a different host/port, change the API URL in `Program.cs`. Otherwise, be sure to run it using `python thirdparty/train_api.py`.
+
+## Customizing the Agent
+- `tool_registry.py` contains the mapping of tool names to tool definitions (so the AI understands how to use them)
+- `goal_registry.py` contains descriptions of goals and the tools used to achieve them
+- The tools themselves are defined in their own files in `/tools`
+- Note the mapping in `tools/__init__.py` to each tool
\ No newline at end of file
diff --git a/todo.md b/todo.md
index a63530a..46b59cb 100644
--- a/todo.md
+++ b/todo.md
@@ -1,36 +1,43 @@
 # todo list
-[x] multi-goal
- [x] set goal to list agents when done
- [x] make this better/smoother
- [ ] clean up workflow/make functions
[ ] make the debugging confirms optional
-[ ] grok integration
-[ ] document *why* temporal for ai agents - scalability, durability in the readme
+
+[ ] document *why* temporal for ai agents - scalability, durability, visibility in the readme
[ ] fix readme: move setup to its own page, demo to its own page, add the why /|\ section
[ ] add architecture to readme
+- elements of app
+- dive into llm interaction
+- workflow breakdown - interactive loop
+- why temporal
+ +[ ] setup readme, why readme, architecture readme, what this is in main readme with temporal value props and pictures
+[ ] how to add more scenarios, tools
+
+
[ ] create tests
[ ] create people management scenario
- -- check pay status - -- book work travel - -- check PTO levels - -- check insurance coverages - -- book PTO around a date (https://developers.google.com/calendar/api/guides/overview)? - -- scenario should use multiple tools - -- expense management - -- check in on the health of the team -[ ] demo the reasons why: - -- Orchestrate interactions across distributed data stores and tools - -- Hold state, potentially over long periods of time - -- Ability to ‘self-heal’ and retry until the (probabilistic) LLM returns valid data - -- Support for human intervention such as approvals - -- Parallel processing for efficiency of data retrieval and tool use - -- Insight into the agent’s performance +- check pay status
+- book work travel
+- check PTO levels
+- check insurance coverages
+- book PTO around a date (https://developers.google.com/calendar/api/guides/overview)?
+- scenario should use multiple tools
+- expense management
+- check in on the health of the team
+ +[ ] demo the reasons why:
+- Orchestrate interactions across distributed data stores and tools
+- Hold state, potentially over long periods of time
+- Ability to ‘self-heal’ and retry until the (probabilistic) LLM returns valid data
+- Support for human intervention such as approvals
+- Parallel processing for efficiency of data retrieval and tool use
+- Insight into the agent’s performance
+ - ask the ai agent how it did at the end of the conversation, was it efficient? successful? insert a search attribute to document that before return [ ] customize prompts in [workflow to manage scenario](./workflows/tool_workflow.py)
[ ] add in new tools?
-[ ] non-retry the api key error - "Invalid API Key provided: sk_test_**J..." and "AuthenticationError" -[ ] make it so you can yeet yourself out of a goal and pick a new one +[ ] non-retry the api key error - "Invalid API Key provided: sk_test_**J..." and "AuthenticationError"
+[ ] make it so you can yeet yourself out of a goal and pick a new one