Polishing before Webinar: Merge pull request #18 from joshmsmith/development

- updates to pyproject.toml to add contributors and update some pytest config
- updates to documentation - clarification cleanup
- defaulting to finserv goals
This commit is contained in:
Joshua Smith
2025-04-24 12:44:58 -04:00
committed by GitHub
9 changed files with 66 additions and 40 deletions

View File

@@ -43,7 +43,7 @@ AGENT_GOAL=goal_choose_agent_type # for multi-goal start
#Choose which category(ies) of goals you want to be listed by the Agent Goal picker if enabled above #Choose which category(ies) of goals you want to be listed by the Agent Goal picker if enabled above
# - options are system (always included), hr, travel, or all. # - options are system (always included), hr, travel, or all.
GOAL_CATEGORIES=hr,travel-flights,travel-trains,fin,ecommerce # default is all GOAL_CATEGORIES=fin # default is all
#GOAL_CATEGORIES=travel-flights #GOAL_CATEGORIES=travel-flights
# Set if the workflow should wait for the user to click a confirm button (and if the UI should show the confirm button and tool args) # Set if the workflow should wait for the user to click a confirm button (and if the UI should show the confirm button and tool args)

View File

@@ -8,19 +8,22 @@ It's really helpful to [watch the demo (5 minute YouTube video)](https://www.you
[![Watch the demo](./assets/agent-youtube-screenshot.jpeg)](https://www.youtube.com/watch?v=GEXllEH2XiQ) [![Watch the demo](./assets/agent-youtube-screenshot.jpeg)](https://www.youtube.com/watch?v=GEXllEH2XiQ)
### Multi-Agent Demo Video
See multi-agent execution in action [here](https://www.youtube.com/watch?v=8Dc_0dC14yY).
## Why Temporal? ## Why Temporal?
There are a lot of AI and Agentic AI tools out there, and more on the way. But why Temporal? Temporal gives this system reliablity, state management, a code-first approach that we really like, built-in observability and easy error handling. There are a lot of AI and Agentic AI tools out there, and more on the way. But why Temporal? Temporal gives this system reliablity, state management, a code-first approach that we really like, built-in observability and easy error handling.
For more, check out [architecture-decisions](./architecture-decisions.md). For more, check out [architecture-decisions](./architecture-decisions.md).
## What is "Agentic AI"? ## What is "Agentic AI"?
These are the key elements of an agentic framework: These are the key elements of an agentic framework:
1. Goals a human can get done, made up of tools that can execute individual steps 1. Goals that a system can accomplish, made up of tools that can execute individual steps
2. The "agent loop" - call LLM, either call tools or prompt human, repeat until goal(s) are done 2. Agent loops - executing an LLM, executing tools, and eliciting input from an external source such as a human: repeat until goal(s) are done
3. Support for tool calls that require human input and approval 3. Support for tool calls that require input and approval
4. Use of an LLM to check human input for relevance before calling the 'real' LLM 4. Use of an LLM to check human input for relevance before calling the 'real' LLM
5. use of an LLM to summarize and compact the conversation history 5. Use of an LLM to summarize and compact the conversation history
6. Prompt construction (made of system prompts, conversation history, and tool metadata - sent to the LLM to create user prompts) 6. Prompt construction made of system prompts, conversation history, and tool metadata - sent to the LLM to create user questions and confirmations
7. Bonus: durable tool execution via Temporal Activities 7. Ideally high durability (done in this system with Temporal Workflow and Activities)
For a deeper dive into this, check out the [architecture guide](./architecture.md). For a deeper dive into this, check out the [architecture guide](./architecture.md).
@@ -35,8 +38,7 @@ See [the architecture guide](./architecture.md).
## Productionalization & Adding Features ## Productionalization & Adding Features
- In a prod setting, I would need to ensure that payload data is stored separately (e.g. in S3 or a noSQL db - the claim-check pattern), or otherwise 'garbage collected'. Without these techniques, long conversations will fill up the workflow's conversation history, and start to breach Temporal event history payload limits. - In a prod setting, I would need to ensure that payload data is stored separately (e.g. in S3 or a noSQL db - the claim-check pattern), or otherwise 'garbage collected'. Without these techniques, long conversations will fill up the workflow's conversation history, and start to breach Temporal event history payload limits.
- A single worker can easily support many workflows - setting workflow ID differently would enable this. - A single worker can easily support many agent workflows (chats) running at the same time. Currently the workflow ID is the same each time, so it will only run one agent at a time. To run multiple agents, you can use a different workflow ID each time (e.g. by using a UUID or timestamp).
- Continue-as-new shouldn't be a big consideration for this use case (as it would take many conversational turns to trigger). Regardless, we should verify that it's able to carry the agent state over to the new workflow execution.
- Perhaps the UI should show when the LLM response is being retried (i.e. activity retry attempt because the LLM provided bad output) - Perhaps the UI should show when the LLM response is being retried (i.e. activity retry attempt because the LLM provided bad output)
- Tests would be nice! [See tests](./tests/). - Tests would be nice! [See tests](./tests/).
@@ -45,7 +47,7 @@ See [the todo](./todo.md) for more details.
See [the guide to adding goals and tools](./adding-goals-and-tools.md) for more ways you can add features. See [the guide to adding goals and tools](./adding-goals-and-tools.md) for more ways you can add features.
## For Temporal SAs ## Enablement Guide (internal resource for Temporal employees)
Check out the [slides](https://docs.google.com/presentation/d/1wUFY4v17vrtv8llreKEBDPLRtZte3FixxBUn0uWy5NU/edit#slide=id.g3333e5deaa9_0_0) here and the enablement guide here (TODO). Check out the [slides](https://docs.google.com/presentation/d/1wUFY4v17vrtv8llreKEBDPLRtZte3FixxBUn0uWy5NU/edit#slide=id.g3333e5deaa9_0_0) here and the [enablement guide](https://docs.google.com/document/d/14E0cEOibUAgHPBqConbWXgPUBY0Oxrnt6_AImdiheW4/edit?tab=t.0#heading=h.ajnq2v3xqbu1).

View File

@@ -1,5 +1,6 @@
# Customizing the Agent # Customizing the Agent
The agent is set up to allow for multiple goals and to switch back to choosing a new goal at the end of every successful goal. A goal is made up of a list of tools that the agent will guide the user through. The agent is set up to have multiple agents, each with their own goal. It supports switching back to choosing a new goal at the end of every successful goal (or even mid-goal).
A goal is made up of a list of tools that the agent will guide the user through.
It may be helpful to review the [architecture](./architecture.md) for a guide and definition of goals, tools, etc. It may be helpful to review the [architecture](./architecture.md) for a guide and definition of goals, tools, etc.

View File

@@ -1,58 +1,65 @@
# Elements # Elements
These are the main elements of this system. These are the main elements of this system. See [architecture decisions](./architecture-decisions.md) for information beind these choices.
![Architecture Elements](./assets/Architecture_elements.png "Architecture Elements") In this document we will explain each element and their interactions, and then connect them all at the end.
<img src="./assets/Architecture_elements.png" width="50%" alt="Architecture Elements">
## Workflow ## Workflow
This is a [Temporal Workflow](https://docs.temporal.io/workflows) - a durable straightforward description of the process to be executed. For our example see [agent_goal_workflow.py](./workflows/agent_goal_workflow.py). This is a [Temporal Workflow](https://docs.temporal.io/workflows) - a durable straightforward description of the process to be executed. See [agent_goal_workflow.py](./workflows/agent_goal_workflow.py).
Temporal is used to make the process scalable, durable, reliable, secure, and visible. Temporal is used to make the process scalable, durable, reliable, secure, and visible.
### Workflow Responsibilities: ### Workflow Responsibilities:
- Orchestrates interactive loop - Orchestrates interactive loops:
- Prompts LLM, Users - LLM Loop: Prompts LLM, durably executes LLM, stores responses
- Interactive Loop: Elicits responses from input (in our case a human) and validates input responses
- Tool Execution Loop: Durably executes Tools
- Keeps record of all interactions ([Signals, Queries, Updates](https://docs.temporal.io/develop/python/message-passing)) - Keeps record of all interactions ([Signals, Queries, Updates](https://docs.temporal.io/develop/python/message-passing))
- Executes LLM durably
- Executes Tools durably
- Handles failures gracefully - Handles failures gracefully
- Human, LLM and tool interaction history stored for debugging and analysis - Input, LLM and Tool interaction history stored for debugging and analysis
## Activities ## Activities
These are [Temporal Activities](https://docs.temporal.io/activities). Defined as simple functions, they are auto-retried async/event driven behind the scenes. Activities durably execute Tools and the LLM. See [a sample activity](./activities/tool_activities.py). These are [Temporal Activities](https://docs.temporal.io/activities). Defined as simple functions, they are auto-retried async/event driven behind the scenes. Activities durably execute Tools and the LLM. See [a sample activity](./activities/tool_activities.py).
## Tools ## Tools
Tools define the capabilities of the system. They are simple Python functions (could be in any language). Tools define the capabilities of the system. They are simple Python functions (could be in any language as Temporal supports multiple languages).
They are executed by Temporal Activities. They are “just code” - can connect to any API or system. They also are where the "hard" business logic is: you can validate and retry actions using code you write. They are executed by Temporal Activities. They are “just code” - can connect to any API or system. They also are where the deterministic business logic is: you can validate and retry actions using code you write.
Failures are handled gracefully by Temporal. Failures are handled gracefully by Temporal.
Activities + Tools turn the probabalistic input from the user and LLM into deterministic action. Activities + Tools turn the probabalistic input from the user and LLM into deterministic action.
## Prompts ## Prompts
Prompts are where the instructions to the LLM & users is. Prompts are made up of initial instructions, goal instructions, and tool instructions. Prompts are where the instructions to the LLM are. Prompts are made up of initial instructions, goal instructions, and tool instructions.
See [agent prompts](./prompts/agent_prompt_generators.py) and [goal & tool prompts](./tools/goal_registry.py). See [agent prompts](./prompts/agent_prompt_generators.py) and [goal & tool prompts](./tools/goal_registry.py).
This is where you can add probabalistic business logic, to control process flow, describe what to do, and give instruction and validation for the LLM. This is where you can add probabalistic business logic to
- to control process flow
- describe what to do
- give examples of interactions
- give instruction and validation for the LLM
## LLM ## LLM
Probabalistic execution: it will _probably_ do what you tell it to do. Probabalistic execution: it will _probably_ do what you tell it to do.
Turns the guidance from the prompts (see [agent prompts](./prompts/agent_prompt_generators.py) and [goal prompts](./tools/goal_registry.py)) into Turns the guidance from the prompts (see [agent prompts](./prompts/agent_prompt_generators.py) and [goal prompts](./tools/goal_registry.py)) into
You have a choice of providers - see [setup](./setup.md). You have a choice of providers - see [setup](./setup.md).
The LLM: The LLM:
- Validates user input for tools - Drives toward the initial Goal and any subsequent Goals selected by user
- Drives toward goal selected by user - Decides what to do based on input, such as:
- Decides when to execute tools - Validates user input for Tools
- Formats input and interprets output for tools - Decides when to execute Tools
- Decides on next step for Goal
- Formats input and interprets output for Tools
- is executed by Temporal Activities - is executed by Temporal Activities
- API failures and logical failures are handled transparently - API failures and logical failures are handled transparently
## Interaction ## Interaction
Interaction is managed with Temporal Signals and Queries. These are durably stored in Workflow History. Interaction is managed with Temporal Signals and Queries. These are durably stored in Workflow History.
Can be used for analysis and debugging. It's all “just code” so it's easy to add new Signals and Queries. History can be used for analysis and debugging. It's all “just code” so it's easy to add new Signals and Queries.
Input can be very dynamic, just needs to be serializable. Input can be very dynamic, just needs to be serializable.
The workflow executes in a loop: gathering input, validating input, executing tools, managing prompts, and then waiting for input. The Workflow executes the Interaction Loop: gathering input, validating input, and providing a response:
![Interaction Loop](./assets/interaction_loop.png) ![Interaction Loop](./assets/interaction_loop.png)
Here's a more detailed example for gathering parameters for tools: Here's a more detailed example for gathering inputs for Tools:
![Tool Gathering](./assets/argument_gathering_cycle.png) ![Tool Gathering](./assets/argument_gathering_cycle.png)
@@ -64,4 +71,4 @@ Now that we have the pieces and what they do, here is a more complete diagram of
# Adding features # Adding features
Want to add more tools, See [adding goals and tools](./adding-goals-and-tools.md). Want to add more Goals and Tools? See [adding goals and tools](./adding-goals-and-tools.md). Have fun!

Binary file not shown.

Before

Width:  |  Height:  |  Size: 39 KiB

After

Width:  |  Height:  |  Size: 38 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 144 KiB

After

Width:  |  Height:  |  Size: 124 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 125 KiB

After

Width:  |  Height:  |  Size: 129 KiB

View File

@@ -1,9 +1,13 @@
[tool.poetry] [tool.poetry]
name = "temporal_AI_agent" name = "temporal_AI_agent"
version = "0.1.0" version = "0.2.0"
description = "Temporal AI Agent" description = "Temporal AI Agent"
license = "MIT" license = "MIT"
authors = ["Steve Androulakis <steve.androulakis@temporal.io>"] authors = [
"Steve Androulakis <steve.androulakis@temporal.io>",
"Laine Smith <lainecaseysmith@gmail.com>",
"Joshua Smith <josh.smith@temporal.io>"
]
readme = "README.md" readme = "README.md"
# By default, Poetry will find packages automatically, # By default, Poetry will find packages automatically,
@@ -42,8 +46,8 @@ pandas = "^2.2.3"
gtfs-kit = "^10.1.1" gtfs-kit = "^10.1.1"
[tool.poetry.group.dev.dependencies] [tool.poetry.group.dev.dependencies]
pytest = "^7.3" pytest = ">=8.2"
pytest-asyncio = "^0.18.3" pytest-asyncio = "^0.26.0"
black = "^23.7" black = "^23.7"
isort = "^5.12" isort = "^5.12"
@@ -55,4 +59,5 @@ build-backend = "poetry.core.masonry.api"
asyncio_mode = "auto" asyncio_mode = "auto"
log_cli = true log_cli = true
log_cli_level = "INFO" log_cli_level = "INFO"
log_cli_format = "%(asctime)s [%(levelname)8s] %(message)s (%(filename)s:%(lineno)s)" log_cli_format = "%(asctime)s [%(levelname)8s] %(message)s (%(filename)s:%(lineno)s)"
asyncio_default_fixture_loop_scope = "function"

View File

@@ -18,7 +18,7 @@ SHOW_CONFIRM=True
The agent can be configured to pursue different goals using the `AGENT_GOAL` environment variable in your `.env` file. If unset, default is `goal_choose_agent_type`. The agent can be configured to pursue different goals using the `AGENT_GOAL` environment variable in your `.env` file. If unset, default is `goal_choose_agent_type`.
If the first goal is `goal_choose_agent_type` the agent will support multiple goals using goal categories defined by `GOAL_CATEGORIES` in your .env file. If unset, default is all. If the first goal is `goal_choose_agent_type` the agent will support multiple goals using goal categories defined by `GOAL_CATEGORIES` in your .env file. If unset, default is all. We recommend starting with `fin`.
```bash ```bash
GOAL_CATEGORIES=hr,travel-flights,travel-trains,fin GOAL_CATEGORIES=hr,travel-flights,travel-trains,fin
``` ```
@@ -206,7 +206,6 @@ By default it will _not_ make a real workflow, it'll just fake it. If you get th
FIN_START_REAL_WORKFLOW=FALSE #set this to true to start a real workflow FIN_START_REAL_WORKFLOW=FALSE #set this to true to start a real workflow
``` ```
#### Goals: HR/PTO #### Goals: HR/PTO
Make sure you have the mock users you want in (such as yourself) in [the PTO mock data file](./tools/data/employee_pto_data.json). Make sure you have the mock users you want in (such as yourself) in [the PTO mock data file](./tools/data/employee_pto_data.json).
@@ -220,4 +219,16 @@ Make sure you have the mock orders you want in (such as those with real tracking
- The tools themselves are defined in their own files in `/tools` - The tools themselves are defined in their own files in `/tools`
- Note the mapping in `tools/__init__.py` to each tool - Note the mapping in `tools/__init__.py` to each tool
For more details, check out [adding goals and tools guide](./adding-goals-and-tools.md). For more details, check out [adding goals and tools guide](./adding-goals-and-tools.md).
## Setup Checklist
[ ] copy `.env.example` to `.env` <br />
[ ] Select an LLM and add your API key to `.env` <br />
[ ] (Optional) set your starting goal and goal category in `.env` <br />
[ ] (Optional) configure your Temporal Cloud settings in `.env` <br />
[ ] `poetry run python scripts/run_worker.py` <br />
[ ] `poetry run uvicorn api.main:app --reload` <br />
[ ] `cd frontend`, `npm install`, `npx vite` <br />
[ ] Access the UI at `http://localhost:5173` <br />
And that's it! Happy AI Agent Exploring!