176 Commits

Author SHA1 Message Date
Mason Egger
c2785b79f4 Update API to use proper query (#44)
Looks like you changed the query handler in the Workflow but didn't update the API. Minor patch.
2025-06-16 10:28:05 -07:00
Steve Androulakis
861e55a8d0 Mcp enhancements (#43)
* reuses MCP connections in each worker for efficiency

* you can see your food

* you can see your food

* prompt eng around images
2025-06-16 08:37:32 -07:00
Steve Androulakis
49dd00ec3b removed defunct goal registry file 2025-06-13 09:09:01 -07:00
Steve Androulakis
5d55a9fe80 Model Context Protocol (MCP) support with new use case (#42)
* initial mcp

* food ordering with mcp

* prompt eng

* splitting out goals and updating docs

* a diff so I can get tests from codex

* a diff so I can get tests from codex

* oops, missing files

* tests, file formatting

* readme and setup updates

* setup.md link fixes

* readme change

* readme change

* readme change

* stripe food setup script

* single agent mode default

* prompt engineering for better multi agent performance

* performance should be greatly improved

* Update goals/finance.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update activities/tool_activities.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* co-pilot PR suggested this change, and now fixed it

* stronger wording around json format response

* formatting

* moved docs to dir

* moved image assets under docs

* cleanup env example, stripe guidance

* cleanup

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-09 16:39:57 -07:00
Steve Androulakis
1811e4cf59 README for contribs 2025-06-01 09:00:11 -07:00
Steve Androulakis
157c337d23 README for contribs 2025-06-01 08:59:52 -07:00
Steve Androulakis
e52ddd3e5e straggler linted file 2025-06-01 08:58:19 -07:00
Steve Androulakis
eb06cf5c8d Enhance Dev Experience and Code Quality (#41)
* Format codebase to satisfy linters

* fixing pylance and ruff-checked files

* contributing md, and type and formatting fixes

* setup file capitalization

* test fix
2025-06-01 08:54:59 -07:00
Steve Androulakis
e35181b5ad Temporal tests (#40)
* temporal tests

* codex setup env script to readme
2025-05-29 12:56:58 -07:00
Ka Wo Fong
f7ef2b1c7e fix(setup): add stripe to dep (#39) 2025-05-29 08:42:22 -07:00
Steve Androulakis
71e54b9ecd todo list (#38)
* Update todo.md
2025-05-29 08:26:16 -07:00
Steve Androulakis
a7a2002217 Update setup.md 2025-05-27 11:02:41 -07:00
Steve Androulakis
5a3bfbd848 mock football data if no key (#37) 2025-05-27 10:47:50 -07:00
Steve Androulakis
7bb6688797 Jonymusky litellm integration (#36)
* feat: LiteLLM integration

* update

* chore: make start-dev

feedback from: https://github.com/temporal-community/temporal-ai-agent/issues/31

* bump dependencies

* clean up setup.md

* setup update

---------

Co-authored-by: Jonathan Muszkat <muskys@gmail.com>
2025-05-26 14:37:14 -07:00
Steve Androulakis
847f4bbaef Review dallastexas92 nostripekey (#35)
* Update setup.md

Detail that the stripe key must be commented out in order to create a dummy invoice

* Update create_invoice.py

Remove the example invoice function as the 'else' statement already captures this

* Update setup.md

Edited verbiage for the create invoice explanation

* cover empty stripe api env

---------

Co-authored-by: Dallas Young <33672687+dallastexas92@users.noreply.github.com>
2025-05-26 14:13:59 -07:00
Steve Androulakis
f8e0dd3b2a Docker setup (#34)
* Add Docker for better DX from Znack's PR

* setup readme

---------

Co-authored-by: znack <scher56@gmail.com>
2025-05-26 14:02:22 -07:00
Mason Egger
2f3afd6954 relocking Poetry lock file to align with pyproject.toml (#32)
Approved, thanks Mason!
2025-05-08 11:12:38 -07:00
Steve Androulakis
edb7df5b3c 0.2.0 changelog 2025-04-28 09:16:34 -07:00
Steve Androulakis
05041f9433 prompt eng on money movement args 2025-04-25 07:25:29 -07:00
Steve Androulakis
0767533ca1 prompt goal fix: not money order 2025-04-25 07:17:17 -07:00
Steve Androulakis
611a6b6238 Merge pull request #29 from temporal-community/josh-multiagent-steve-fixes
Josh multiagent capability, Steve fixes. Original PR: https://github.com/temporal-community/temporal-ai-agent/pull/25
2025-04-24 20:28:37 -07:00
Steve Androulakis
7850af6be2 prompt engineering 2025-04-24 20:25:02 -07:00
Steve Androulakis
a29b100017 mucho fixes and prompt engineering 2025-04-24 20:16:24 -07:00
Steve Androulakis
a5fc6ad477 Josh multi-agent changes
Merge remote-tracking branch 'josh/main' into josh-merge-branch
2025-04-24 18:54:18 -07:00
Joshua Smith
c9ae8b29bf Polishing before Webinar: Merge pull request #18 from joshmsmith/development
- updates to pyproject.toml to add contributors and update some pytest config
- updates to documentation - clarification cleanup
- defaulting to finserv goals
2025-04-24 12:44:58 -04:00
Joshua Smith
7ef10e2481 - updates to pyproject.toml to add contributors and update some pytest config
- updates to documentation - clarification cleanup
- defaulting to finserv goals
2025-04-24 12:37:16 -04:00
Joshua Smith
183f834635 Merge pull request #17 from joshmsmith/development
testing complete
2025-04-23 11:21:15 -04:00
Joshua Smith
47c4b99f2c adjusting loan sample conversation 2025-04-22 15:16:33 -04:00
Joshua Smith
823208db3c - adding Steve's updated confirm box UI
- goal prompts and agent changes to smooth out that interaction and remove listagents duplication
adding extra confirmation for money movement tool
2025-04-22 12:22:42 -04:00
Steve Androulakis
b4aa929451 confirm box is pretty now 2025-04-21 09:00:06 -07:00
Joshua Smith
b4d57cfad6 Merge main changes into development 2025-04-21 09:56:58 -04:00
Joshua Smith
2a1624f621 back to claud 3.5
prompts and adding list-agents automatically in goal registry
fixing some finserv tool args
2025-04-21 09:55:45 -04:00
Joshua Smith
326a5a5beb Merge pull request #16 from joshmsmith/ecommerce
Fixes coming in delivered by the ecommerce branch:

new ecommerce scenarios
fixes for multi-goal:post first real goal goal switch: duplicate listagents behavior from the toolplanner
adding ecommerce initial guidance
fixed new-goal guidance prompts for multi-goal mode
(minor) fixed a bug in money movement so it won't connect to temporal cloud if it's not doing a real workflow
(minor) fixed a bug in loan application so it won't connect to temporal cloud if it's not doing a real workflow
some todo notes cleanup
2025-04-18 17:13:03 -04:00
Joshua Smith
163477c066 Merge branch 'main' into ecommerce 2025-04-18 17:11:26 -04:00
Joshua Smith
d48dafcaa5 - fixes for multi-goal:post first real goal goal switch: duplicate listagents behavior from the toolplanner
- adding ecommerce initial guidance
- fixed new-goal guidance prompts for multi-goal mode
- (minor) fixed a bug in money movement so it won't connect to temporal cloud if it's not doing a real workflow
- (minor) fixed a bug in loan application so it won't connect to temporal cloud if it's not doing a real workflow
- some todo notes cleanup
2025-04-18 17:08:44 -04:00
Joshua Smith
32e856e494 fixing second goal selection loop with prompts 2025-04-17 15:20:17 -04:00
Joshua Smith
a07af3e7bf Merge pull request #15 from joshmsmith/financial-services-demo-scenarios
merging commits/no change
2025-04-17 11:34:39 -04:00
Joshua Smith
fc07315358 Merge pull request #14 from joshmsmith/development
Development
2025-04-17 11:34:01 -04:00
Joshua Smith
ebb12feafa Merge pull request #13 from joshmsmith/main
sync development with changes from main
2025-04-17 10:58:19 -04:00
Joshua Smith
7b2c7cef8f Merge pull request #12 from joshmsmith/main
fixing some post-merge bugs
2025-04-17 10:56:14 -04:00
Laine
4410f30642 Default to dummy data for tracking package 2025-04-17 10:09:58 -04:00
Laine
77942b19a1 Merge branch 'ecommerce' of https://github.com/joshmsmith/temporal-ai-agent into ecommerce 2025-04-17 09:19:33 -04:00
Laine
247bac0a28 Try to fix multiple ListAgents steps 2025-04-17 09:19:29 -04:00
Joshua Smith
cf55f0eaee fixing some post-merge bugs 2025-04-17 08:59:15 -04:00
Joshua Smith
0c678a120a Merge pull request #11 from joshmsmith/main
pushing changes from main to finserv branch
2025-04-17 08:54:38 -04:00
Joshua Smith
6383614076 Merge branch 'financial-services-demo-scenarios' into main 2025-04-17 08:54:24 -04:00
Joshua Smith
92bce11151 Merge pull request #10 from joshmsmith/main
pushing main changes down to ecommerce
2025-04-17 08:52:32 -04:00
Joshua Smith
e5405907fa Merge branch 'ecommerce' into main 2025-04-17 08:52:17 -04:00
Joshua Smith
dddf7f4ccd Merge pull request #9 from joshmsmith/main
pushing main changes down to development
2025-04-17 08:48:26 -04:00
Joshua Smith
6f245a1998 Merge branch 'development' into main 2025-04-17 08:48:15 -04:00
Joshua Smith
86a6dfe991 renaming signal from confirmed to confirm 2025-04-17 06:00:17 -04:00
Joshua Smith
83c6a2454d 1. These aren't the tests you're looking for
2. fixing confirmed signal for now
2025-04-17 05:57:55 -04:00
Joshua Smith
463ae581ac adding .env* stuff to gitignore, fixing a minor docs formatting bug 2025-04-17 05:32:55 -04:00
Joshua Smith
e62b105872 adding to gitignore 2025-04-17 05:23:17 -04:00
Joshua Smith
7b52b8a817 adding to todo, gitignore 2025-04-16 16:26:37 -04:00
Joshua Smith
6f9079ba12 updates to todo 2025-04-16 16:23:31 -04:00
Joshua Smith
ac44d35acb changes to .gitignore 2025-04-15 16:46:09 -04:00
Joshua Smith
0bcf68d9fd Merge pull request #8 from joshmsmith/financial-services-demo-scenarios
Financial services demo scenarios
2025-04-15 16:40:40 -04:00
Joshua Smith
e96c5e068d Merge branch 'temporal-community:main' into financial-services-demo-scenarios 2025-04-15 16:34:42 -04:00
Joshua Smith
50463170d2 Merge branch 'temporal-community:main' into ecommerce 2025-04-15 16:34:32 -04:00
Joshua Smith
13d0085882 Merge branch 'temporal-community:main' into development 2025-04-15 16:34:23 -04:00
Joshua Smith
17f33a5094 Merge pull request #7 from temporal-community/main
.NET - Added logger factory to worker and logging to activities (#26)
2025-04-15 16:33:55 -04:00
Joshua Smith
e92e3f43c9 changes to make the project more python compliant and fix the bug tracker URL 2025-04-15 11:01:18 -04:00
Joshua Smith
4f953132e0 minor todo updates 2025-04-14 11:42:40 -04:00
Joshua Smith
812e295f3a switching to method activity calls 2025-04-14 09:27:33 -04:00
Joshua Smith
dddd2977b9 adding ecommerce to .env.example and some error handling in track_package 2025-04-13 17:03:50 -04:00
Joshua Smith
79dcd40dde well it kinda works 2025-04-12 15:41:47 -04:00
Joshua Smith
f0524f2b5f yeah this won't work 2025-04-11 17:45:46 -04:00
Joshua Smith
5b58f30e0d wip checkin 2025-04-11 17:43:34 -04:00
Laine
4e0ff0e535 Rename get_order_status to get_order, add ecommerce list orders goal 2025-04-11 17:14:10 -04:00
Joshua Smith
585791e826 todo updates 2025-04-11 09:36:46 -04:00
Keith Tenzer
2b2a5522e9 Added logger factory to worker and logging to activities (#26)
Signed-off-by: Keith Tenzer <ktenzer@keiths-mbp.lan>
Co-authored-by: Keith Tenzer <ktenzer@keiths-mbp.lan>
2025-04-10 14:18:16 -07:00
Laine
aba934e8b4 Initial add of ecommerce - order status goal and tools 2025-04-10 16:57:15 -04:00
Joshua Smith
2539436a79 adding testing config 2025-04-10 10:26:55 -04:00
Joshua Smith
6517b624ee Merge pull request #6 from joshmsmith/main
syncing dev
2025-04-10 09:48:29 -04:00
Joshua Smith
1e22f3ee4c changes to gitignore 2025-04-10 09:41:14 -04:00
Joshua Smith
ef45ca0451 work on tests 2025-04-10 09:38:13 -04:00
Joshua Smith
c18a40b502 - dynamic agent prompt based on multi goal or not
- made choose_agent_goal be dynamically included
- made tool selection not be required in all toolchains
- changes to get env vars easier in workflow
- Updated docs/guides, todo based on above
2025-04-08 15:01:11 -04:00
Joshua Smith
f567583b3a todo updates 2025-04-03 15:57:35 -04:00
Joshua Smith
87b5699dc1 documentation & guidance updates, getting things done, fixing a possible NDE if you change env vars, changes to enable user picking "done", minor test changes, minor goal selection prompt improvements 2025-04-03 15:54:44 -04:00
Joshua Smith
40bd76e80f changes to be specific about travel scenarios, setup guidance about goal_categories, fixed a bug about llm selection in tool_activities.py, better comments 2025-03-28 15:33:56 -04:00
Joshua Smith
86da6a1c74 updating docs about goal changing when adding new goals 2025-03-28 15:01:54 -04:00
Joshua Smith
4eab280d81 fixing docs - duplicates, clarifying 2025-03-28 14:57:56 -04:00
Joshua Smith
5787c181a5 Merge branch 'main' into main 2025-03-28 14:01:02 -04:00
Joshua Smith
1de0a92fa0 adding in Steve's changes to setup 2025-03-28 13:59:26 -04:00
Joshua Smith
460896e68c Merge pull request #5 from joshmsmith/development
adding money movement scenario
2025-03-27 11:24:24 -04:00
Joshua Smith
50d3e3d638 setup instructions for money movement/real workflow or fake, and some minor code cleanup 2025-03-27 11:22:17 -04:00
Joshua Smith
64f8a34d19 improved money movement scenario 2025-03-27 09:24:44 -04:00
Joshua Smith
a3ec7b045a adding move money scenario - still a bit rough but it works 2025-03-26 13:21:13 -04:00
Steve Androulakis
10acca513f better readme around failures 2025-03-24 19:18:10 -07:00
Joshua Smith
82078f217a Merge pull request #4 from joshmsmith/development
Finishing HR, adding fin category, polishing
2025-03-20 16:15:04 -04:00
Joshua Smith
4eeab32cb2 Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-20 13:37:57 -04:00
Joshua Smith
c3084eec41 adding fin goals and tools 2025-03-20 13:37:51 -04:00
Laine Smith
8f802819bf Add slides 2025-03-20 12:51:40 -04:00
Laine
d8a8fe44f9 Add additional hints 2025-03-19 15:28:20 -04:00
Laine
3debef5781 Add clarification re: format of start and end dates, and (probably?) fix non-determinism error caused by SHOW_CONFIRM 2025-03-19 12:48:49 -04:00
Joshua Smith
850404e0d5 updates to readme, docs, added some logging in case goal setting goes bad 2025-03-19 08:08:38 -04:00
Joshua Smith
c20f5d796f updating docs and todo 2025-03-19 07:49:06 -04:00
Joshua Smith
bd1cfbad01 two more HR scenarios added 2025-03-18 15:44:03 -04:00
Joshua Smith
4bbdda934f Merge pull request #3 from joshmsmith/development
Major improvements to docs, adding more tools/scenarios for pirate mode
2025-03-18 09:36:19 -04:00
Laine
f2ab6c03e8 Remove sample conversation re: conflict checking 2025-03-18 09:18:16 -04:00
Laine
c1b662090d Add pirate treasure goal and more info to documentation re: how to make goals and tools 2025-03-17 16:01:30 -04:00
Steve Androulakis
dfb80f7723 Update README.md 2025-03-15 03:47:29 -07:00
Joshua Smith
d20c6c53a5 more stuff done 2025-03-14 12:00:49 -04:00
Joshua Smith
ee2328fec6 Merge pull request #2 from joshmsmith/development
more changes to scenarios, workflow simplification, docs improvements...more stuff, logging

It's cool
2025-03-14 11:56:27 -04:00
Joshua Smith
87d2320b6a architecture section done 2025-03-14 11:43:51 -04:00
Laine
8d2099fa8e Remove one extra print() statement 2025-03-14 10:26:58 -04:00
Laine
2472558f0c Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-14 10:20:14 -04:00
Laine
5f8f81a15d Move HR-related tools to their own folder, add print statement for BookPTO functionality, and add SILLY_MODE 2025-03-14 10:20:11 -04:00
Joshua Smith
9ead007849 change logging to info unless there needs to be a warning 2025-03-14 10:16:16 -04:00
Joshua Smith
36894c91f9 cleaning up workflow code 2025-03-14 10:13:07 -04:00
Joshua Smith
c8a0feaa1b logging level to WARN 2025-03-14 10:12:27 -04:00
Joshua Smith
7153c5308a set logging level (to Info) 2025-03-14 10:12:09 -04:00
Joshua Smith
f767cfdc51 todo/logging 2025-03-14 10:11:45 -04:00
Joshua Smith
72fe638485 more notes about tools 2025-03-14 10:11:33 -04:00
Joshua Smith
621e811aa8 Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-14 10:10:59 -04:00
Joshua Smith
4cfe472ca0 notes about tools 2025-03-14 10:10:43 -04:00
Laine
ece3ac1d3c Add the category tag to the goals and example env file, and filter the results based on tags in list_agents 2025-03-13 14:53:03 -04:00
Laine
134414f647 Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-13 14:19:15 -04:00
Laine
a7a90c3289 Add functionality to future_pto_calc, remove calendar_conflict step from goal 2025-03-13 14:19:13 -04:00
Joshua Smith
232d901054 adding why temporal section to readme 2025-03-13 12:15:47 -04:00
Joshua Smith
42641fe124 doc updates 2025-03-13 11:55:33 -04:00
Joshua Smith
943f8dc187 readme updates 2025-03-13 11:54:04 -04:00
Laine
5ac2a6eb0a Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-13 11:33:41 -04:00
Laine
5c3bfcf957 Add source mocked data file, make current_pto tool functional, rename future_pto to future_pto_calc 2025-03-13 11:33:38 -04:00
Joshua Smith
c723e2f6d8 Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-13 11:29:20 -04:00
Joshua Smith
02473bb49e todo and readme updates 2025-03-13 11:29:13 -04:00
Laine
ea1ad383bb Add in bare bones yet functional HR goal: goal_hr_schedule_pto 2025-03-12 16:54:13 -04:00
Joshua Smith
291bace53d formatting readme 2025-03-12 13:50:40 -04:00
Joshua Smith
4ca9c60aab Merge pull request #1 from joshmsmith/development
confirming
2025-03-12 13:50:02 -04:00
Joshua Smith
3206f81e31 Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-12 13:38:25 -04:00
Joshua Smith
d807e9893d updates to readme, docs, guides 2025-03-12 13:37:04 -04:00
Laine
380581b0d9 Part two of making confirmation optional - add flag to ToolData so the button won't show in the UI 2025-03-12 13:22:04 -04:00
Laine
a488bbac23 Use False, not Off 2025-03-12 12:50:44 -04:00
Laine
1a270fa917 Forgot the env.example... 2025-03-12 12:50:02 -04:00
Laine
02a63917b2 Part of one of making confirmation optional - auto-confirm but still show everything 2025-03-12 12:49:00 -04:00
Laine
504361a5a7 Add a bunch of logging and comments re: what's happenin' 2025-03-12 11:25:57 -04:00
Laine
697244e970 Move AGENT_GOAL back to env file 2025-03-12 10:30:42 -04:00
Laine
e0b3a31ea8 Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-12 10:20:35 -04:00
Laine
0306a5d726 Auto-start workflow if one isn't found to get rid of startup error 2025-03-12 10:20:27 -04:00
Joshua Smith
f969098dc8 finishing grok support 2025-03-12 09:55:25 -04:00
Laine
b52cef0d05 Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-12 09:40:59 -04:00
Laine
df58eee9d4 Change to use goal_list in the api code, add list_agents to the other goal as the last tool 2025-03-12 09:40:56 -04:00
Joshua Smith
b2d6f789d9 updated todo list 2025-03-12 09:37:20 -04:00
Laine
e872c9381d Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-12 09:20:12 -04:00
Laine
947c5cd0f7 Take out specific goals, add back in elif done so the workflow ends 2025-03-12 09:20:09 -04:00
Joshua Smith
c418c185db added test 2025-03-12 09:13:47 -04:00
Laine
fdf5550ea3 Add "done" back in for prompts, remove argument from ListAgents tool def 2025-03-12 09:01:31 -04:00
Joshua Smith
56cccd660d todo updates 2025-03-11 15:53:46 -04:00
Joshua Smith
c0a874b90e added some workflow debugging, converted from "done" to pick-new-goal and updated prompts 2025-03-11 15:52:47 -04:00
Joshua Smith
bb733bc966 updated todo 2025-03-11 15:05:03 -04:00
Laine
3ff3b60b5e Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-11 14:50:30 -04:00
Laine
8db1dcd4a7 Dynamically generate list of agents, try to fix goal changing flow 2025-03-11 14:48:39 -04:00
Joshua Smith
39dabaa81b Merge branch 'development' of https://github.com/joshmsmith/temporal-ai-agent into development 2025-03-11 14:02:22 -04:00
Joshua Smith
ae334a2cae adding grok to .env.example and updating todo 2025-03-11 14:01:23 -04:00
Joshua Smith
b2e4999562 adding and clarifying comments 2025-03-11 13:02:08 -04:00
Laine
f13ed70bfe Change instructions to AI to handle switching back to ListAgents when done with tool chain 2025-03-11 12:02:26 -04:00
Laine
804568e366 Rename ChooseAgent tool to ListAgents 2025-03-11 10:41:22 -04:00
Joshua Smith
64ffe7f635 clean up logging and comments 2025-03-11 10:32:28 -04:00
Joshua Smith
6939e3f942 log less chatgpt stuff and actually change the goal 2025-03-11 10:03:45 -04:00
Laine
8fafe4b090 Change agent goal to be an element of the workflow, including query 2025-03-11 09:07:25 -04:00
Laine
4117d5d62d Add new goal to choose agent type - only kind of working 2025-03-07 16:12:21 -05:00
Joshua Smith
64d2a92630 more understanding 2025-03-07 09:58:25 -05:00
Joshua Smith
4c933b5052 making plans 2025-03-07 09:46:22 -05:00
Laine
d09db9f11f Move where goal is set, make dummy data default for create_invoice 2025-03-05 17:24:18 -05:00
Steve Androulakis
6accc1f2e6 Merge pull request #22 from steveandroulakis/keynote-main
better error handling for workers down.
2025-03-03 01:45:21 -06:00
Steve Androulakis
ac05e8f60b Merge pull request #21 from steveandroulakis/keynote-main
pre-warm ollama local model on initialization
2025-02-28 07:32:55 -06:00
Steve Androulakis
f0a76e42cd Merge pull request #20 from steveandroulakis/keynote-main
readme sync
2025-02-28 07:12:20 -06:00
Steve Androulakis
d9480612fa Merge pull request #19 from steveandroulakis/keynote-main
license
2025-02-28 07:11:02 -06:00
Steve Androulakis
32f76eacb4 Merge pull request #18 from steveandroulakis/keynote-main
Keynote main
2025-02-28 07:09:05 -06:00
Steve Androulakis
31b0b9ff0a Create LICENSE 2025-02-27 13:37:01 -06:00
Steve Androulakis
ecbb66523f Merge pull request #16 from steveandroulakis/keynote-main
Keynote main
2025-02-25 05:18:01 -08:00
Steve Androulakis
7d23e42def Merge pull request #14 from steveandroulakis/keynote-main
upgrade claude sonnet to 3.7 and prompt eng
2025-02-24 13:44:20 -08:00
Steve Androulakis
98f3de3bb4 Merge pull request #13 from steveandroulakis/keynote-main
more realistic train times because demo will be to manchester
2025-02-24 10:59:12 -08:00
Steve Androulakis
079e0a12e5 Update README.md 2025-02-20 17:15:45 -08:00
Steve Androulakis
70bd11b1a9 Merge pull request #12 from steveandroulakis/keynote-main
Merged keynote-main branch in for dual agent functionality
2025-02-20 15:34:19 -08:00
99 changed files with 9920 additions and 1555 deletions


@@ -1,36 +1,48 @@
RAPIDAPI_KEY=9df2cb5...
RAPIDAPI_HOST=sky-scrapper.p.rapidapi.com
FOOTBALL_DATA_API_KEY=....
# Example environment configuration
### LLM configuration
LLM_MODEL=openai/gpt-4o
LLM_KEY=sk-proj-...
# LLM_MODEL=anthropic/claude-3-5-sonnet-20240620
# LLM_KEY=${ANTHROPIC_API_KEY}
# LLM_MODEL=gemini/gemini-2.5-flash-preview-04-17
# LLM_KEY=${GOOGLE_API_KEY}
### Tool API keys
# RAPIDAPI_KEY=9df2cb5... # Optional - if unset flight search generates realistic mock data
# RAPIDAPI_HOST_FLIGHTS=sky-scrapper.p.rapidapi.com # For real travel flight information (optional)
RAPIDAPI_HOST_PACKAGE=trackingpackage.p.rapidapi.com # For eCommerce order status package tracking tool
FOOTBALL_DATA_API_KEY=
# Leave blank to use the built-in mock fixtures generator
STRIPE_API_KEY=sk_test_51J...
# Optional for `goal_event_flight_invoice`; if unset, a mock invoice is created.
# Sign up for a free Stripe account and get a test key at https://dashboard.stripe.com/test/apikeys
LLM_PROVIDER=openai # default
OPENAI_API_KEY=sk-proj-...
# or
# LLM_PROVIDER=ollama
# OLLAMA_MODEL_NAME=qwen2.5:14b
# or
# LLM_PROVIDER=google
# GOOGLE_API_KEY=your-google-api-key
# or
# LLM_PROVIDER=anthropic
# ANTHROPIC_API_KEY=your-anthropic-api-key
# or
# LLM_PROVIDER=deepseek
# DEEPSEEK_API_KEY=your-deepseek-api-key
# uncomment and unset these environment variables to connect to the local dev server
### Temporal connection (optional)
# Uncomment and update these values to connect to a non-default Temporal server
# TEMPORAL_ADDRESS=namespace.acct.tmprl.cloud:7233
# TEMPORAL_NAMESPACE=default
# TEMPORAL_TASK_QUEUE=agent-task-queue
# Uncomment if using mTLS (not needed for local dev server)
# TEMPORAL_TLS_CERT='path/to/cert.pem'
# TEMPORAL_TLS_KEY='path/to/key.pem'
# Uncomment if using API key (not needed for local dev server)
# TEMPORAL_API_KEY=abcdef1234567890
# Agent Goal Configuration
# AGENT_GOAL=goal_event_flight_invoice # (default) or goal_match_train_invoice
### Agent goal configuration
# Set starting goal of agent - if unset default is goal_event_flight_invoice (single agent mode)
#AGENT_GOAL=goal_choose_agent_type # for multi-goal mode (experimental)
AGENT_GOAL=goal_event_flight_invoice
#AGENT_GOAL=goal_match_train_invoice # for replay goal
# Choose which goal categories are listed by the Agent Goal picker if enabled above
# Options: system (always included), hr, travel-flights, travel-trains, fin, ecommerce, mcp-integrations, food, all
GOAL_CATEGORIES=all
#GOAL_CATEGORIES=travel-flights
### Other settings
SHOW_CONFIRM=True
# Money Scenarios:
# Set if you want it to really start workflows - otherwise it'll fake it
# if you want it to be real you'll need moneytransfer and early return workers running
FIN_START_REAL_WORKFLOW=FALSE

6
.gitignore vendored

@@ -31,4 +31,8 @@ coverage.xml
# PyCharm / IntelliJ settings
.idea/
.env
.env
.env*
# Cursor
.cursor

201
AGENTS.md Normal file

@@ -0,0 +1,201 @@
# Temporal AI Agent Contribution Guide
## Repository Layout
- `workflows/` - Temporal workflows including the main AgentGoalWorkflow for multi-turn AI conversations
- `activities/` - Temporal activities for tool execution and LLM interactions
- `tools/` - Native AI agent tool implementations organized by category (finance, HR, ecommerce, travel, etc.)
- `goals/` - Agent goal definitions organized by category, supporting both native and MCP tools
- `shared/` - Shared configuration including MCP server definitions
- `models/` - Data types and tool definitions used throughout the system
- `prompts/` - Agent prompt generators and templates
- `api/` - FastAPI server that exposes REST endpoints to interact with workflows
- `frontend/` - React-based web UI for chatting with the AI agent
- `tests/` - Comprehensive test suite for workflows and activities using Temporal's testing framework
- `enterprise/` - .NET worker implementation for enterprise activities (train booking)
- `scripts/` - Utility scripts for running workers and testing tools
## Running the Application
### Quick Start with Docker
```bash
# Start all services with development hot-reload
docker compose up -d
# Quick rebuild without infrastructure
docker compose up -d --no-deps --build api worker frontend
```
Default URLs:
- Temporal UI: http://localhost:8080
- API: http://localhost:8000
- Frontend: http://localhost:5173
### Local Development Setup
1. **Prerequisites:**
```bash
# Install Poetry for Python dependency management
curl -sSL https://install.python-poetry.org | python3 -
# Start Temporal server (Mac)
brew install temporal
temporal server start-dev
```
2. **Backend (Python):**
```bash
# Quick setup using Makefile
make setup # Creates venv and installs dependencies
make run-worker # Starts the Temporal worker
make run-api # Starts the API server
# Or manually:
poetry install
poetry run python scripts/run_worker.py # In one terminal
poetry run uvicorn api.main:app --reload # In another terminal
```
3. **Frontend (React):**
```bash
make run-frontend # Using Makefile
# Or manually:
cd frontend
npm install
npx vite
```
4. **Enterprise .NET Worker (optional):**
```bash
make run-enterprise # Using Makefile
# Or manually:
cd enterprise
dotnet build
dotnet run
```
### Environment Configuration
Copy `.env.example` to `.env` and configure:
```bash
# Required: LLM Configuration
LLM_MODEL=openai/gpt-4o
LLM_KEY=your-api-key-here
# LLM_MODEL=anthropic/claude-3-5-sonnet-20240620
# LLM_KEY=${ANTHROPIC_API_KEY}
# LLM_MODEL=gemini/gemini-2.5-flash-preview-04-17
# LLM_KEY=${GOOGLE_API_KEY}
# Optional: Agent Goals and Categories
AGENT_GOAL=goal_choose_agent_type
GOAL_CATEGORIES=hr,travel-flights,travel-trains,fin,ecommerce,mcp-integrations,food
# Optional: Tool-specific APIs
STRIPE_API_KEY=sk_test_... # For invoice creation
# `goal_event_flight_invoice` works without this key; it falls back to a mock invoice if unset
FOOTBALL_DATA_API_KEY=... # For real football fixtures
```
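A minimal sketch of how these settings might be read at startup, assuming the variable names shown above. The `Settings` dataclass and `load_settings` helper are hypothetical names used for illustration, not the project's actual code. The `AGENT_GOAL` default follows the comment in `.env.example`; treating a missing `GOAL_CATEGORIES` as `all` is an assumption.

```python
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values read from .env."""
    llm_model: str
    agent_goal: str
    goal_categories: List[str]
    use_mock_invoice: bool


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build Settings from an environment-like mapping such as os.environ."""
    # Assumption: a missing GOAL_CATEGORIES behaves like "all".
    raw_categories = env.get("GOAL_CATEGORIES", "all")
    categories = [c.strip().lower() for c in raw_categories.split(",") if c.strip()]
    return Settings(
        # LLM_MODEL is listed as required, so a missing key raises KeyError.
        llm_model=env["LLM_MODEL"],
        # Per .env.example, the default goal is goal_event_flight_invoice.
        agent_goal=env.get("AGENT_GOAL", "goal_event_flight_invoice"),
        goal_categories=categories or ["all"],
        # An empty or missing Stripe key means a mock invoice is created.
        use_mock_invoice=not env.get("STRIPE_API_KEY", "").strip(),
    )
```

In an application this would typically be called as `load_settings(os.environ)` after the `.env` file has been loaded into the process environment.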
## Testing
The project includes comprehensive tests using Temporal's testing framework:
```bash
# Install test dependencies
poetry install --with dev
# Run all tests
poetry run pytest
# Run with time-skipping for faster execution
poetry run pytest --workflow-environment=time-skipping
# Run specific test categories
poetry run pytest tests/test_tool_activities.py -v # Activity tests
poetry run pytest tests/test_agent_goal_workflow.py -v # Workflow tests
# Run with coverage
poetry run pytest --cov=workflows --cov=activities
```
**Test Coverage:**
- ✅ **Workflow Tests**: AgentGoalWorkflow signals, queries, state management
- ✅ **Activity Tests**: ToolActivities, LLM integration (mocked), environment configuration
- ✅ **Integration Tests**: End-to-end workflow and activity execution
**Documentation:**
- **Quick Start**: [testing.md](docs/testing.md) - Simple commands to run tests
- **Comprehensive Guide**: [tests/README.md](tests/README.md) - Detailed testing patterns and best practices
## Linting and Code Quality
```bash
# Using Poetry tasks
poetry run poe format # Format code with black and isort
poetry run poe lint # Check code style and types
poetry run poe test # Run test suite
# Manual commands
poetry run black .
poetry run isort .
poetry run mypy --check-untyped-defs --namespace-packages .
```
## Agent Customization
### Adding New Goals and Tools
#### For Native Tools:
1. Create tool implementation in `tools/` directory
2. Add tool function mapping in `tools/__init__.py`
3. Register tool definition in `tools/tool_registry.py`
4. Add tool names to static tools list in `workflows/workflow_helpers.py`
5. Create or update goal definition in appropriate file in `goals/` directory
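The first two steps can be illustrated with a rough sketch. Everything here is hypothetical: the tool name `GetOrderDemo`, the `get_order_demo` function, and the `TOOL_HANDLERS` mapping are invented for illustration, and the real signatures in `tools/__init__.py` and `tools/tool_registry.py` may differ.

```python
from typing import Any, Callable, Dict

# Assumed shape: a tool takes a dict of arguments and returns a dict result.
ToolHandler = Callable[[Dict[str, Any]], Dict[str, Any]]


def get_order_demo(args: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical native tool: return mock status data for an order."""
    order_id = args.get("order_id")
    if not order_id:
        return {"error": "order_id is required"}
    return {"order_id": order_id, "status": "shipped"}


# Name-to-function mapping, analogous to the mapping described in step 2.
TOOL_HANDLERS: Dict[str, ToolHandler] = {
    "GetOrderDemo": get_order_demo,
}


def get_handler(tool_name: str) -> ToolHandler:
    """Look up a tool by name and fail loudly on unknown names."""
    try:
        return TOOL_HANDLERS[tool_name]
    except KeyError:
        raise ValueError(f"Unknown tool: {tool_name}") from None
```

Returning an error dict for bad arguments, rather than raising, lets the caller report the problem back in the conversation; whether the project's tools do this is not confirmed by this guide.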
#### For MCP Tools:
1. Configure MCP server definition in `shared/mcp_config.py` (for reusable servers)
2. Create or update goal definition in appropriate file in `goals/` directory with `mcp_server_definition`
3. Set required environment variables (API keys, etc.)
#### For Goals:
1. Create goal file in `goals/` directory (e.g., `goals/my_category.py`)
2. Import and extend the goal list in `goals/__init__.py`
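As a sketch of these two steps, assuming goals carry a category tag that the goal picker filters on (with `system` always included, as noted in `.env.example`): the `AgentGoalSketch` class, its fields, and the `visible_goals` helper are hypothetical stand-ins, not the types defined in `models/`.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class AgentGoalSketch:
    """Hypothetical, simplified stand-in for the project's goal type."""
    id: str
    category_tag: str
    description: str
    tool_names: List[str] = field(default_factory=list)


# Step 1: what a file such as goals/my_category.py might define.
my_category_goals = [
    AgentGoalSketch(
        id="goal_my_demo",
        category_tag="my-category",
        description="Look up an order and report its status.",
        tool_names=["GetOrderDemo"],
    )
]

# Step 2: what goals/__init__.py might do to register them.
goal_list: List[AgentGoalSketch] = [
    AgentGoalSketch(id="goal_choose_agent_type", category_tag="system",
                    description="Let the user pick a goal."),
]
goal_list.extend(my_category_goals)


def visible_goals(goals: Iterable[AgentGoalSketch],
                  categories: Iterable[str]) -> List[AgentGoalSketch]:
    """Filter goals by category; 'all' shows everything, 'system' always shows."""
    wanted = {c.strip().lower() for c in categories}
    if "all" in wanted:
        return list(goals)
    return [g for g in goals if g.category_tag == "system" or g.category_tag in wanted]
```

For example, `visible_goals(goal_list, ["hr"])` would return only the system goal here, because no goal in this sketch is tagged `hr`.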
### Configuring Goals
The agent supports multiple goal categories organized in `goals/`:
- **Financial**: Money transfers, loan applications (`goals/finance.py`)
- **HR**: PTO booking, payroll status (`goals/hr.py`)
- **Travel**: Flight/train booking, event finding (`goals/travel.py`)
- **Ecommerce**: Order tracking, package management (`goals/ecommerce.py`)
- **Food**: Restaurant ordering and cart management (`goals/food.py`)
- **MCP Integrations**: External service integrations like Stripe (`goals/stripe_mcp.py`)
Goals can use:
- **Native Tools**: Custom implementations in `/tools/` directory
- **MCP Tools**: External tools via Model Context Protocol servers (configured in `shared/mcp_config.py`)
See [adding-goals-and-tools.md](docs/adding-goals-and-tools.md) for detailed customization guide.
## Architecture
This system implements agentic AI—autonomous systems that pursue goals through iterative tool use and human feedback—with these key components:
1. **Goals** - High-level objectives accomplished through tool sequences (organized in `/goals/` by category)
2. **Native & MCP Tools** - Custom implementations and external service integrations
3. **Agent Loops** - LLM execution → tool calls → human input → repeat until goal completion
4. **Tool Approval** - Human confirmation for sensitive operations
5. **Conversation Management** - LLM-powered input validation and history summarization
6. **Durability** - Temporal workflows ensure reliable execution across failures
For detailed architecture information, see [architecture.md](docs/architecture.md).
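The agent loop and tool approval components listed above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the project's workflow code: `run_agent_loop`, the `"question"` / `"run_tool"` / `"done"` step values, and the scripted planner standing in for the LLM are all hypothetical. Durability (which Temporal provides in the real system) is not modeled here.

```python
from typing import Any, Callable, Dict, List

Step = Dict[str, Any]


def run_agent_loop(
    plan: Callable[[List[Step]], Step],
    tools: Dict[str, Callable[..., Any]],
    ask_human: Callable[[str], str],
    approve: Callable[[str, Dict[str, Any]], bool],
    max_turns: int = 20,
) -> List[Step]:
    """Plan -> (tool call | human input) -> repeat until the goal is done."""
    history: List[Step] = []
    for _ in range(max_turns):
        step = plan(history)  # stands in for the LLM call
        history.append({"actor": "agent", "step": step})
        if step["next"] == "done":
            break
        if step["next"] == "run_tool":
            name, args = step["tool"], step.get("args", {})
            if approve(name, args):  # human confirmation gate
                result = tools[name](**args)
                history.append({"actor": "tool", "tool": name, "result": result})
            else:
                history.append({"actor": "human", "text": f"declined {name}"})
        else:  # "question": ask the human for missing information
            answer = ask_human(step["response"])
            history.append({"actor": "human", "text": answer})
    return history


def scripted_planner(history: List[Step]) -> Step:
    """Deterministic stand-in for an LLM planner (hypothetical)."""
    if not any(h["actor"] == "human" for h in history):
        return {"next": "question", "response": "Which city?"}
    if not any(h["actor"] == "tool" for h in history):
        city = next(h["text"] for h in history if h["actor"] == "human")
        return {"next": "run_tool", "tool": "find_events", "args": {"city": city}}
    return {"next": "done", "response": "All set."}


history = run_agent_loop(
    plan=scripted_planner,
    tools={"find_events": lambda city: [f"{city} Jazz Night"]},
    ask_human=lambda question: "Melbourne",
    approve=lambda name, args: True,
)
```

Passing the planner, tools, and human callbacks in as parameters keeps the loop itself easy to test without a live model.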
## Commit Messages and Pull Requests
- Use clear commit messages describing the change purpose
- Reference specific files and line numbers when relevant (e.g., `workflows/agent_goal_workflow.py:125`)
- Open PRs describing **what changed** and **why**
- Ensure tests pass before submitting: `poetry run pytest --workflow-environment=time-skipping`
## Additional Resources
- **Setup Guide**: [setup.md](docs/setup.md) - Detailed configuration instructions
- **Architecture Decisions**: [architecture-decisions.md](docs/architecture-decisions.md) - Why Temporal for AI agents
- **Demo Video**: [5-minute YouTube overview](https://www.youtube.com/watch?v=GEXllEH2XiQ)
- **Multi-Agent Demo**: [Advanced multi-agent execution](https://www.youtube.com/watch?v=8Dc_0dC14yY)

30
Dockerfile Normal file

@@ -0,0 +1,30 @@
FROM python:3.10-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc build-essential && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
# Install Poetry for dependency management
RUN pip install --no-cache-dir poetry
# Copy dependency manifests first for better layer caching,
# then install Python dependencies without creating a virtualenv
COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.create false \
    && poetry install --without dev --no-interaction --no-ansi --no-root
# Copy application code
COPY . .
# Set Python to run in unbuffered mode (recommended for Docker)
ENV PYTHONUNBUFFERED=1
ENV PYTHONPATH=/app
# Expose the port the app will run on
EXPOSE 8000
# Default to running only the API server; worker and train-api are separate Compose services
CMD ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "8000"]

63
Makefile Normal file

@@ -0,0 +1,63 @@
.PHONY: setup install run-worker run-api run-frontend run-train-api run-legacy-worker run-enterprise setup-venv check-python run-dev
# Setup commands
setup: check-python setup-venv install
check-python:
	@which python3 >/dev/null 2>&1 || (echo "Python 3 is required. Please install it first." && exit 1)
	@which poetry >/dev/null 2>&1 || (echo "Poetry is required. Please install it first." && exit 1)
setup-venv:
	python3 -m venv venv
	@echo "Virtual environment created. Don't forget to activate it with 'source venv/bin/activate'"
install:
	poetry install
	cd frontend && npm install
# Run commands
run-worker:
	poetry run python scripts/run_worker.py
run-api:
	poetry run uvicorn api.main:app --reload
run-frontend:
	cd frontend && npx vite
run-train-api:
	poetry run python thirdparty/train_api.py
run-legacy-worker:
	poetry run python scripts/run_legacy_worker.py
run-enterprise:
	cd enterprise && dotnet build && dotnet run
# Development environment setup
setup-temporal-mac:
	brew install temporal
	temporal server start-dev
# Run all development services
run-dev:
	@echo "Starting all development services..."
	@make run-worker & \
	make run-api & \
	make run-frontend & \
	wait
# Help command
help:
	@echo "Available commands:"
	@echo " make setup - Create virtual environment and install dependencies"
	@echo " make setup-venv - Create virtual environment only"
	@echo " make install - Install all dependencies"
	@echo " make run-worker - Start the Temporal worker"
	@echo " make run-api - Start the API server"
	@echo " make run-frontend - Start the frontend development server"
	@echo " make run-train-api - Start the train API server"
	@echo " make run-legacy-worker - Start the legacy worker"
	@echo " make run-enterprise - Build and run the enterprise .NET worker"
	@echo " make setup-temporal-mac - Install and start Temporal server on Mac"
	@echo " make run-dev - Start all development services (worker, API, frontend) in parallel"

240
README.md

@@ -1,192 +1,104 @@
# Temporal AI Agent
This demo shows a multi-turn conversation with an AI agent running inside a Temporal workflow. The purpose of the agent is to collect information towards a goal, running tools along the way. There's a simple DSL input for collecting information (currently set up to use mock functions to search for public events, search for flights around those events, then create a test Stripe invoice for the trip).
This demo shows a multi-turn conversation with an AI agent running inside a Temporal workflow. The purpose of the agent is to collect information towards a goal, running tools along the way. The agent supports both native tools and Model Context Protocol (MCP) tools, allowing it to interact with external services.
The AI will respond with clarifications and ask for any missing information to that goal. You can configure it to use [ChatGPT 4o](https://openai.com/index/hello-gpt-4o/), [Anthropic Claude](https://www.anthropic.com/claude), [Google Gemini](https://gemini.google.com), [Deepseek-V3](https://www.deepseek.com/) or a local LLM of your choice using [Ollama](https://ollama.com).
The agent operates in single-agent mode by default, focusing on one specific goal. It also supports experimental multi-agent/multi-goal mode where users can choose between different agent types and switch between them during conversations.
[Watch the demo (5 minute YouTube video)](https://www.youtube.com/watch?v=GEXllEH2XiQ)
Goals are organized in the `/goals/` directory by category (finance, HR, travel, ecommerce, etc.) and can leverage both native and MCP tools.
[![Watch the demo](./agent-youtube-screenshot.jpeg)](https://www.youtube.com/watch?v=GEXllEH2XiQ)
The AI will respond with clarifications and ask for any missing information to that goal. You can configure it to use any LLM supported by [LiteLLM](https://docs.litellm.ai/docs/providers), including:
- OpenAI models (GPT-4, GPT-3.5)
- Anthropic Claude models
- Google Gemini models
- Deepseek models
- Ollama models (local)
- And many more!
## Configuration
It's really helpful to [watch the demo (5 minute YouTube video)](https://www.youtube.com/watch?v=GEXllEH2XiQ) to understand how interaction works.
This application uses `.env` files for configuration. Copy the [.env.example](.env.example) file to `.env` and update the values:
[![Watch the demo](./assets/agent-youtube-screenshot.jpeg)](https://www.youtube.com/watch?v=GEXllEH2XiQ)
### Multi-Agent Demo Video
See multi-agent execution in action [here](https://www.youtube.com/watch?v=8Dc_0dC14yY).
## Why Temporal?
There are a lot of AI and Agentic AI tools out there, and more on the way. But why Temporal? Temporal gives this system reliability, state management, a code-first approach that we really like, built-in observability, and easy error handling.
For more, check out [architecture-decisions](docs/architecture-decisions.md).
## What is "Agentic AI"?
These are the key elements of an agentic framework:
1. Goals that a system can accomplish, made up of tools that can execute individual steps
2. Agent loops - executing an LLM, executing tools, and eliciting input from an external source such as a human: repeat until goal(s) are done
3. Support for tool calls that require input and approval
4. Use of an LLM to check human input for relevance before calling the 'real' LLM
5. Use of an LLM to summarize and compact the conversation history
6. Prompt construction made of system prompts, conversation history, and tool metadata - sent to the LLM to create user questions and confirmations
7. Ideally high durability (done in this system with Temporal Workflow and Activities)
For a deeper dive into this, check out the [architecture guide](docs/architecture.md).
## 🔧 MCP Tool Calling Support
This agent acts as an **MCP (Model Context Protocol) client**, enabling seamless integration with external services and tools. The system supports two types of tools:
- **Native Tools**: Custom tools implemented directly in the codebase (in `/tools/`)
- **MCP Tools**: External tools accessed via Model Context Protocol (MCP) servers like Stripe, databases, or APIs. Configuration is covered in [the Setup guide](docs/setup.md)
- Set `AGENT_GOAL=goal_food_ordering` with `SHOW_CONFIRM=False` in `.env` for an example of a goal that calls MCP Tools (Stripe).
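A minimal sketch of how flags like `SHOW_CONFIRM` and `AGENT_GOAL` could be interpreted, modeled on the `get_wf_env_vars` activity elsewhere in this change set. The function names are illustrative, and the default of `True` for confirmation is an assumption.

```python
from typing import Optional


def parse_show_confirm(value: Optional[str], default: bool = True) -> bool:
    """Unset falls back to the default; only the string 'false' disables it."""
    if value is None:
        return default
    # Any other value (including "0" or "no") is treated as True here,
    # matching the activity's behavior of only special-casing "false".
    return value.lower() != "false"


def is_multi_goal_mode(agent_goal: Optional[str]) -> bool:
    """Multi-goal mode is on only for the agent-chooser goal; unset means single-agent."""
    return agent_goal is not None and agent_goal.lower() == "goal_choose_agent_type"


food_ordering_confirm = parse_show_confirm("False")
food_ordering_multi = is_multi_goal_mode("goal_food_ordering")
```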
## Setup and Configuration
See [the Setup guide](docs/setup.md) for detailed instructions. The basic configuration requires just two environment variables:
```bash
cp .env.example .env
LLM_MODEL=openai/gpt-4o # or any other model supported by LiteLLM
LLM_KEY=your-api-key-here
```
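As a rough illustration of how these variables feed the LLM call, the sketch below builds the keyword arguments that the activity code in this change passes to LiteLLM's `completion(**kwargs)`. The helper name `build_completion_kwargs` is hypothetical; the variable names and the `openai/gpt-4` fallback are taken from the `ToolActivities` constructor, and `LLM_BASE_URL` is optional.

```python
from typing import Any, Dict, List, Mapping


def build_completion_kwargs(
    env: Mapping[str, str], messages: List[Dict[str, str]]
) -> Dict[str, Any]:
    """Assemble arguments for a LiteLLM-style completion call from env settings."""
    kwargs: Dict[str, Any] = {
        "model": env.get("LLM_MODEL", "openai/gpt-4"),
        "messages": messages,
        "api_key": env.get("LLM_KEY"),
    }
    # Only include base_url when a custom endpoint is configured.
    base_url = env.get("LLM_BASE_URL")
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs


example_kwargs = build_completion_kwargs(
    {"LLM_MODEL": "openai/gpt-4o", "LLM_KEY": "your-api-key-here"},
    [{"role": "user", "content": "Hello"}],
)
```

In real use, `env` would typically be `os.environ` after the `.env` file has been loaded.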
### Agent Goal Configuration
## Customizing Interaction & Tools
See [the guide to adding goals and tools](docs/adding-goals-and-tools.md).
The agent can be configured to pursue different goals using the `AGENT_GOAL` environment variable in your `.env` file.
The system supports MCP (Model Context Protocol) for easy integration with external services. MCP server configurations are managed in `shared/mcp_config.py`, and goals are organized by category in the `/goals/` directory.
#### Goal: Find an event in Australia / New Zealand, book flights to it and invoice the user for the cost
- `AGENT_GOAL=goal_event_flight_invoice` (default) - Helps users find events, book flights, and arrange train travel with invoice generation
- This is the scenario in the video above
## Architecture
See [the architecture guide](docs/architecture.md).
#### Goal: Find a Premier League match, book train tickets to it and invoice the user for the cost
- `AGENT_GOAL=goal_match_train_invoice` - Focuses on Premier League match attendance with train booking and invoice generation
- This is a new goal that is part of an upcoming conference talk
## Testing
If not specified, the agent defaults to `goal_event_flight_invoice`. Each goal comes with its own set of tools and conversation flows designed for specific use cases. You can examine `tools/goal_registry.py` to see the detailed configuration of each goal.
The project includes comprehensive tests for workflows and activities using Temporal's testing framework:
See the next section for tool configuration for each goal.
### Tool Configuration
#### Agent Goal: goal_event_flight_invoice (default)
* The agent uses a mock function to search for events. This has zero configuration.
* By default the agent uses a mock function to search for flights.
* If you want to use the real flights API, go to `tools/search_flights.py` and replace the `search_flights` function with `search_flights_real_api` that exists in the same file.
* It's free to sign up at [RapidAPI](https://rapidapi.com/apiheya/api/sky-scrapper)
* This api might be slow to respond, so you may want to increase the start to close timeout, `TOOL_ACTIVITY_START_TO_CLOSE_TIMEOUT` in `workflows/workflow_helpers.py`
* Requires a Stripe key for the `create_invoice` tool. Set this in the `STRIPE_API_KEY` environment variable in .env
* It's free to sign up and get a key at [Stripe](https://stripe.com/)
* If you're lazy go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
#### Agent Goal: goal_match_train_invoice
* Finding a match requires a key from [Football Data](https://www.football-data.org). Sign up for a free account, then see the 'My Account' page to get your API token. Set `FOOTBALL_DATA_API_KEY` to this value.
* If you're lazy go to `tools/search_fixtures.py` and replace the `search_fixtures` function with the mock `search_fixtures_example` that exists in the same file.
* We use a mock function to search for trains. Start the train API server to use the real API: `python thirdparty/train_api.py`
* The train activity is 'enterprise' so it's written in C# and requires a .NET runtime. See the [.NET backend](#net-(enterprise)-backend) section for details on running it.
* Requires a Stripe key for the `create_invoice` tool. Set this in the `STRIPE_API_KEY` environment variable in .env
* It's free to sign up and get a key at [Stripe](https://stripe.com/)
* If you're lazy go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
### LLM Provider Configuration
The agent can use OpenAI's GPT-4o, Google Gemini, Anthropic Claude, or a local LLM via Ollama. Set the `LLM_PROVIDER` environment variable in your `.env` file to choose the desired provider:
- `LLM_PROVIDER=openai` for OpenAI's GPT-4o
- `LLM_PROVIDER=google` for Google Gemini
- `LLM_PROVIDER=anthropic` for Anthropic Claude
- `LLM_PROVIDER=deepseek` for DeepSeek-V3
- `LLM_PROVIDER=ollama` for running LLMs via [Ollama](https://ollama.ai) (not recommended for this use case)
### Option 1: OpenAI
If using OpenAI, ensure you have an OpenAI key for the GPT-4o model. Set this in the `OPENAI_API_KEY` environment variable in `.env`.
### Option 2: Google Gemini
To use Google Gemini:
1. Obtain a Google API key and set it in the `GOOGLE_API_KEY` environment variable in `.env`.
2. Set `LLM_PROVIDER=google` in your `.env` file.
### Option 3: Anthropic Claude (recommended)
I find that Claude Sonnet 3.5 performs better than the other hosted LLMs for this use case.
To use Anthropic:
1. Obtain an Anthropic API key and set it in the `ANTHROPIC_API_KEY` environment variable in `.env`.
2. Set `LLM_PROVIDER=anthropic` in your `.env` file.
### Option 4: Deepseek-V3
To use Deepseek-V3:
1. Obtain a Deepseek API key and set it in the `DEEPSEEK_API_KEY` environment variable in `.env`.
2. Set `LLM_PROVIDER=deepseek` in your `.env` file.
### Option 5: Local LLM via Ollama (not recommended)
To use a local LLM with Ollama:
1. Install [Ollama](https://ollama.com) and the [Qwen2.5 14B](https://ollama.com/library/qwen2.5) model.
- Run `ollama run <OLLAMA_MODEL_NAME>` to start the model. Note that this model is about 9GB to download.
- Example: `ollama run qwen2.5:14b`
2. Set `LLM_PROVIDER=ollama` in your `.env` file and `OLLAMA_MODEL_NAME` to the name of the model you installed.
Note: I found the other (hosted) LLMs to be MUCH more reliable for this use case. However, you can switch to Ollama if desired, and choose a suitably large model if your computer has the resources.
## Configuring Temporal Connection
By default, this application will connect to a local Temporal server (`localhost:7233`) in the default namespace, using the `agent-task-queue` task queue. You can override these settings in your `.env` file.
### Use Temporal Cloud
See [.env.example](.env.example) for details on connecting to Temporal Cloud using mTLS or API key authentication.
[Sign up for Temporal Cloud](https://temporal.io/get-cloud)
### Use a local Temporal Dev Server
On a Mac
```bash
brew install temporal
temporal server start-dev
```
See the [Temporal documentation](https://learn.temporal.io/getting_started/python/dev_environment/) for other platforms.
# Install dependencies including test dependencies
poetry install --with dev
# Run all tests
poetry run pytest
## Running the Application
### Python Backend
Requires [Poetry](https://python-poetry.org/) to manage dependencies.
1. `python -m venv venv`
2. `source venv/bin/activate`
3. `poetry install`
Run the following commands in separate terminal windows:
1. Start the Temporal worker:
```bash
poetry run python scripts/run_worker.py
# Run with time-skipping for faster execution
poetry run pytest --workflow-environment=time-skipping
```
2. Start the API server:
```bash
poetry run uvicorn api.main:app --reload
```
Access the API at `/docs` to see the available endpoints.
**Test Coverage:**
- **Workflow Tests**: AgentGoalWorkflow signals, queries, state management
- **Activity Tests**: ToolActivities, LLM integration (mocked), environment configuration
- **Integration Tests**: End-to-end workflow and activity execution
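As an illustration of the kind of activity-level check the list above describes, the sketch below tests a standalone copy of the fence-stripping logic from `sanitize_json_response`. The names `strip_json_fences` and `test_strip_json_fences_round_trip` are hypothetical and are not taken from the project's test suite.

```python
import json

FENCE = "`" * 3  # a markdown code fence, built here to keep this example readable


def strip_json_fences(response_content: str) -> str:
    """Remove markdown code-fence markers and surrounding whitespace."""
    cleaned = response_content.replace(FENCE + "json", "").replace(FENCE, "")
    return cleaned.strip()


def test_strip_json_fences_round_trip() -> None:
    raw = FENCE + "json\n" + '{"next": "question", "response": "Which city?"}' + "\n" + FENCE
    parsed = json.loads(strip_json_fences(raw))
    assert parsed == {"next": "question", "response": "Which city?"}


test_strip_json_fences_round_trip()
```

Under pytest, the function would be discovered by its `test_` prefix, so the explicit call on the last line would not be needed.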
### React UI
Start the frontend:
```bash
cd frontend
npm install
npx vite
```
Access the UI at `http://localhost:5173`
- **Quick Start**: [testing.md](docs/testing.md) - Simple commands to run tests
- **Comprehensive Guide**: [tests/README.md](tests/README.md) - Detailed testing documentation, patterns, and best practices
### Python Search Trains API
> Agent Goal: goal_match_train_invoice only
## Development
Required to search and book trains!
```bash
poetry run python thirdparty/train_api.py
To contribute to this project, see [contributing.md](docs/contributing.md).
# example url
# http://localhost:8080/api/search?from=london&to=liverpool&outbound_time=2025-04-18T09:00:00&inbound_time=2025-04-20T09:00:00
```
Start the Temporal Server and API server, see [setup](docs/setup.md)
### .NET (enterprise) Backend ;)
> Agent Goal: goal_match_train_invoice only
We have activities written in C# to call the train APIs.
```bash
cd enterprise
dotnet build # ensure you brew install dotnet@8 first!
dotnet run
```
If you're running your train API above on a different host/port then change the API URL in `Program.cs`. Otherwise, be sure to run it using `python thirdparty/train_api.py`.
## Customizing the Agent
- `tool_registry.py` contains the mapping of tool names to tool definitions (so the AI understands how to use them)
- `goal_registry.py` contains descriptions of goals and the tools used to achieve them
- The tools themselves are defined in their own files in `/tools`
- Note the mapping in `tools/__init__.py` to each tool
## TODO
## Productionalization & Adding Features
- In a prod setting, I would need to ensure that payload data is stored separately (e.g. in S3 or a noSQL db - the claim-check pattern), or otherwise 'garbage collected'. Without these techniques, long conversations will fill up the workflow's conversation history, and start to breach Temporal event history payload limits.
- Continue-as-new shouldn't be a big consideration for this use case (as it would take many conversational turns to trigger). Regardless, I should ensure that it's able to carry the agent state over to the new workflow execution.
- A single worker can easily support many agent workflows (chats) running at the same time. Currently the workflow ID is the same each time, so it will only run one agent at a time. To run multiple agents, you can use a different workflow ID each time (e.g. by using a UUID or timestamp).
- Perhaps the UI should show when the LLM response is being retried (i.e. activity retry attempt because the LLM provided bad output)
- Tests would be nice!
- The project now includes comprehensive tests for workflows and activities! [See testing guide](docs/testing.md).
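The first point above mentions the claim-check pattern. A minimal sketch of the idea, assuming an in-memory dictionary as a stand-in for S3 or a NoSQL store; `InMemoryPayloadStore` and its methods are hypothetical names, not part of this project or the Temporal SDK.

```python
import hashlib
import json
from typing import Any, Dict


class InMemoryPayloadStore:
    """Hypothetical blob store; a real one might wrap S3 or a database."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, payload: Dict[str, Any]) -> str:
        """Store the payload and return a short reference key (the claim check)."""
        data = json.dumps(payload, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> Dict[str, Any]:
        """Fetch the original payload back using its reference key."""
        return json.loads(self._blobs[key].decode("utf-8"))


store = InMemoryPayloadStore()
large_turn = {"role": "agent", "text": "long tool output " * 1000}
claim_check = store.put(large_turn)
# Only the short key would need to travel through workflow history.
restored = store.get(claim_check)
```

Using a content hash as the key means identical payloads are stored once, though a real store would also need an expiry or cleanup policy.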
See [the todo](docs/todo.md) for more details on things we want to do (or that you could contribute!).
See [the guide to adding goals and tools](docs/adding-goals-and-tools.md) for more ways you can add features.
## Enablement Guide (internal resource for Temporal employees)
Check out the [slides](https://docs.google.com/presentation/d/1wUFY4v17vrtv8llreKEBDPLRtZte3FixxBUn0uWy5NU/edit#slide=id.g3333e5deaa9_0_0) here and the [enablement guide](https://docs.google.com/document/d/14E0cEOibUAgHPBqConbWXgPUBY0Oxrnt6_AImdiheW4/edit?tab=t.0#heading=h.ajnq2v3xqbu1).


@@ -1,128 +1,51 @@
from temporalio import activity
from ollama import chat, ChatResponse
from openai import OpenAI
import inspect
import json
from typing import Sequence, Optional
from temporalio.common import RawValue
import os
from contextlib import asynccontextmanager
from datetime import datetime
import google.generativeai as genai
import anthropic
import deepseek
from typing import Any, Dict, List, Optional, Sequence
from dotenv import load_dotenv
from models.data_types import ValidationInput, ValidationResult, ToolPromptInput
from litellm import completion
from temporalio import activity
from temporalio.common import RawValue
from temporalio.exceptions import ApplicationError
from models.data_types import (
EnvLookupInput,
EnvLookupOutput,
ToolPromptInput,
ValidationInput,
ValidationResult,
)
from models.tool_definitions import MCPServerDefinition
from shared.mcp_client_manager import MCPClientManager
# Import MCP client libraries
try:
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
except ImportError:
# Fallback if MCP not installed
ClientSession = None
StdioServerParameters = None
stdio_client = None
load_dotenv(override=True)
print(
"Using LLM provider: "
+ os.environ.get("LLM_PROVIDER", "openai")
+ " (set LLM_PROVIDER in .env to change)"
)
if os.environ.get("LLM_PROVIDER") == "ollama":
print(
"Using Ollama (local) model: "
+ os.environ.get("OLLAMA_MODEL_NAME", "qwen2.5:14b")
)
class ToolActivities:
def __init__(self):
"""Initialize LLM clients based on environment configuration."""
self.llm_provider = os.environ.get("LLM_PROVIDER", "openai").lower()
print(f"Initializing ToolActivities with LLM provider: {self.llm_provider}")
# Initialize client variables (all set to None initially)
self.openai_client: Optional[OpenAI] = None
self.anthropic_client: Optional[anthropic.Anthropic] = None
self.genai_configured: bool = False
self.deepseek_client: Optional[deepseek.DeepSeekAPI] = None
self.ollama_model_name: Optional[str] = None
self.ollama_initialized: bool = False
# Only initialize the client specified by LLM_PROVIDER
if self.llm_provider == "openai":
if os.environ.get("OPENAI_API_KEY"):
self.openai_client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
print("Initialized OpenAI client")
else:
print("Warning: OPENAI_API_KEY not set but LLM_PROVIDER is 'openai'")
elif self.llm_provider == "anthropic":
if os.environ.get("ANTHROPIC_API_KEY"):
self.anthropic_client = anthropic.Anthropic(
api_key=os.environ.get("ANTHROPIC_API_KEY")
)
print("Initialized Anthropic client")
else:
print(
"Warning: ANTHROPIC_API_KEY not set but LLM_PROVIDER is 'anthropic'"
)
elif self.llm_provider == "google":
api_key = os.environ.get("GOOGLE_API_KEY")
if api_key:
genai.configure(api_key=api_key)
self.genai_configured = True
print("Configured Google Generative AI")
else:
print("Warning: GOOGLE_API_KEY not set but LLM_PROVIDER is 'google'")
elif self.llm_provider == "deepseek":
if os.environ.get("DEEPSEEK_API_KEY"):
self.deepseek_client = deepseek.DeepSeekAPI(
api_key=os.environ.get("DEEPSEEK_API_KEY")
)
print("Initialized DeepSeek client")
else:
print(
"Warning: DEEPSEEK_API_KEY not set but LLM_PROVIDER is 'deepseek'"
)
# For Ollama, we store the model name but actual initialization happens in warm_up_ollama
elif self.llm_provider == "ollama":
self.ollama_model_name = os.environ.get("OLLAMA_MODEL_NAME", "qwen2.5:14b")
print(
f"Using Ollama model: {self.ollama_model_name} (will be loaded on worker startup)"
)
else:
print(
f"Warning: Unknown LLM_PROVIDER '{self.llm_provider}', defaulting to OpenAI"
)
def warm_up_ollama(self):
"""Pre-load the Ollama model to avoid cold start latency on first request"""
if self.llm_provider != "ollama" or self.ollama_initialized:
return False # No need to warm up if not using Ollama or already warmed up
try:
print(
f"Pre-loading Ollama model '{self.ollama_model_name}' - this may take 30+ seconds..."
)
start_time = datetime.now()
# Make a simple request to load the model into memory
chat(
model=self.ollama_model_name,
messages=[
{"role": "system", "content": "You are an AI assistant"},
{
"role": "user",
"content": "Hello! This is a warm-up message to load the model.",
},
],
)
elapsed_time = (datetime.now() - start_time).total_seconds()
print(f"✅ Ollama model loaded successfully in {elapsed_time:.2f} seconds")
self.ollama_initialized = True
return True
except Exception as e:
print(f"❌ Error pre-loading Ollama model: {str(e)}")
print(
"The worker will continue, but the first actual request may experience a delay."
)
return False
def __init__(self, mcp_client_manager: Optional[MCPClientManager] = None):
"""Initialize LLM client using LiteLLM and optional MCP client manager"""
self.llm_model = os.environ.get("LLM_MODEL", "openai/gpt-4")
self.llm_key = os.environ.get("LLM_KEY")
self.llm_base_url = os.environ.get("LLM_BASE_URL")
self.mcp_client_manager = mcp_client_manager
print(f"Initializing ToolActivities with LLM model: {self.llm_model}")
if self.llm_base_url:
print(f"Using custom base URL: {self.llm_base_url}")
if self.mcp_client_manager:
print("MCP client manager enabled for connection pooling")
@activity.defn
async def agent_validatePrompt(
@@ -178,7 +101,7 @@ class ToolActivities:
prompt=validation_prompt, context_instructions=context_instructions
)
result = self.agent_toolPlanner(prompt_input)
result = await self.agent_toolPlanner(prompt_input)
return ValidationResult(
validationResult=result.get("validationResult", False),
@@ -186,17 +109,49 @@ class ToolActivities:
)
@activity.defn
def agent_toolPlanner(self, input: ToolPromptInput) -> dict:
if self.llm_provider == "ollama":
return self.prompt_llm_ollama(input)
elif self.llm_provider == "google":
return self.prompt_llm_google(input)
elif self.llm_provider == "anthropic":
return self.prompt_llm_anthropic(input)
elif self.llm_provider == "deepseek":
return self.prompt_llm_deepseek(input)
else:
return self.prompt_llm_openai(input)
async def agent_toolPlanner(self, input: ToolPromptInput) -> dict:
messages = [
{
"role": "system",
"content": input.context_instructions
+ ". The current date is "
+ datetime.now().strftime("%B %d, %Y"),
},
{
"role": "user",
"content": input.prompt,
},
]
try:
completion_kwargs = {
"model": self.llm_model,
"messages": messages,
"api_key": self.llm_key,
}
# Add base_url if configured
if self.llm_base_url:
completion_kwargs["base_url"] = self.llm_base_url
response = completion(**completion_kwargs)
response_content = response.choices[0].message.content
activity.logger.info(f"Raw LLM response: {repr(response_content)}")
activity.logger.info(f"LLM response content: {response_content}")
activity.logger.info(f"LLM response type: {type(response_content)}")
activity.logger.info(
f"LLM response length: {len(response_content) if response_content else 'None'}"
)
# Use the new sanitize function
response_content = self.sanitize_json_response(response_content)
activity.logger.info(f"Sanitized response: {repr(response_content)}")
return self.parse_json_response(response_content)
except Exception as e:
print(f"Error in LLM completion: {str(e)}")
raise
def parse_json_response(self, response_content: str) -> dict:
"""
@@ -209,247 +164,349 @@ class ToolActivities:
print(f"Invalid JSON: {e}")
raise
def prompt_llm_openai(self, input: ToolPromptInput) -> dict:
if not self.openai_client:
api_key = os.environ.get("OPENAI_API_KEY")
if not api_key:
raise ValueError(
"OPENAI_API_KEY is not set in the environment variables but LLM_PROVIDER is 'openai'"
)
self.openai_client = OpenAI(api_key=api_key)
print("Initialized OpenAI client on demand")
messages = [
{
"role": "system",
"content": input.context_instructions
+ ". The current date is "
+ datetime.now().strftime("%B %d, %Y"),
},
{
"role": "user",
"content": input.prompt,
},
]
chat_completion = self.openai_client.chat.completions.create(
model="gpt-4o", messages=messages # was gpt-4-0613
)
response_content = chat_completion.choices[0].message.content
print(f"ChatGPT response: {response_content}")
# Use the new sanitize function
response_content = self.sanitize_json_response(response_content)
return self.parse_json_response(response_content)
def prompt_llm_ollama(self, input: ToolPromptInput) -> dict:
# If not yet initialized, try to do so now (this is a backup if warm_up_ollama wasn't called or failed)
if not self.ollama_initialized:
print(
"Ollama model not pre-loaded. Loading now (this may take 30+ seconds)..."
)
try:
self.warm_up_ollama()
except Exception:
# We already logged the error in warm_up_ollama, continue with the actual request
pass
model_name = self.ollama_model_name or os.environ.get(
"OLLAMA_MODEL_NAME", "qwen2.5:14b"
)
messages = [
{
"role": "system",
"content": input.context_instructions
+ ". The current date is "
+ get_current_date_human_readable(),
},
{
"role": "user",
"content": input.prompt,
},
]
try:
response: ChatResponse = chat(model=model_name, messages=messages)
print(f"Chat response: {response.message.content}")
# Use the new sanitize function
response_content = self.sanitize_json_response(response.message.content)
return self.parse_json_response(response_content)
except (json.JSONDecodeError, ValueError) as e:
# Re-raise JSON-related exceptions to let Temporal retry the activity
print(f"JSON parsing error with Ollama response: {str(e)}")
raise
except Exception as e:
# Log and raise other exceptions that may need retrying
print(f"Error in Ollama chat: {str(e)}")
raise
def prompt_llm_google(self, input: ToolPromptInput) -> dict:
if not self.genai_configured:
api_key = os.environ.get("GOOGLE_API_KEY")
if not api_key:
raise ValueError(
"GOOGLE_API_KEY is not set in the environment variables but LLM_PROVIDER is 'google'"
)
genai.configure(api_key=api_key)
self.genai_configured = True
print("Configured Google Generative AI on demand")
model = genai.GenerativeModel(
"models/gemini-1.5-flash",
system_instruction=input.context_instructions
+ ". The current date is "
+ datetime.now().strftime("%B %d, %Y"),
)
response = model.generate_content(input.prompt)
response_content = response.text
print(f"Google Gemini response: {response_content}")
# Use the new sanitize function
response_content = self.sanitize_json_response(response_content)
return self.parse_json_response(response_content)
def prompt_llm_anthropic(self, input: ToolPromptInput) -> dict:
if not self.anthropic_client:
api_key = os.environ.get("ANTHROPIC_API_KEY")
if not api_key:
raise ValueError(
"ANTHROPIC_API_KEY is not set in the environment variables but LLM_PROVIDER is 'anthropic'"
)
self.anthropic_client = anthropic.Anthropic(api_key=api_key)
print("Initialized Anthropic client on demand")
response = self.anthropic_client.messages.create(
model="claude-3-5-sonnet-20241022", # todo try claude-3-7-sonnet-20250219
max_tokens=1024,
system=input.context_instructions
+ ". The current date is "
+ get_current_date_human_readable(),
messages=[
{
"role": "user",
"content": input.prompt,
}
],
)
response_content = response.content[0].text
print(f"Anthropic response: {response_content}")
# Use the new sanitize function
response_content = self.sanitize_json_response(response_content)
return self.parse_json_response(response_content)
def prompt_llm_deepseek(self, input: ToolPromptInput) -> dict:
if not self.deepseek_client:
api_key = os.environ.get("DEEPSEEK_API_KEY")
if not api_key:
raise ValueError(
"DEEPSEEK_API_KEY is not set in the environment variables but LLM_PROVIDER is 'deepseek'"
)
self.deepseek_client = deepseek.DeepSeekAPI(api_key=api_key)
print("Initialized DeepSeek client on demand")
messages = [
{
"role": "system",
"content": input.context_instructions
+ ". The current date is "
+ datetime.now().strftime("%B %d, %Y"),
},
{
"role": "user",
"content": input.prompt,
},
]
response = self.deepseek_client.chat_completion(prompt=messages)
response_content = response
print(f"DeepSeek response: {response_content}")
# Use the new sanitize function
response_content = self.sanitize_json_response(response_content)
return self.parse_json_response(response_content)
def sanitize_json_response(self, response_content: str) -> str:
"""
Extracts the JSON block from the response content as a string.
Supports:
- JSON surrounded by ```json and ```
- Raw JSON input
- JSON preceded or followed by extra text
Rejects invalid input that doesn't contain JSON.
Sanitizes the response content to ensure it's valid JSON.
"""
# Remove any markdown code block markers
response_content = response_content.replace("```json", "").replace("```", "")
# Remove any leading/trailing whitespace
response_content = response_content.strip()
return response_content
@activity.defn
async def get_wf_env_vars(self, input: EnvLookupInput) -> EnvLookupOutput:
"""Get env vars for the workflow as an activity result so the workflow stays deterministic.

Handles unset (None) values by falling back to defaults.
"""
output: EnvLookupOutput = EnvLookupOutput(
show_confirm=input.show_confirm_default, multi_goal_mode=False
)
show_confirm_value = os.getenv(input.show_confirm_env_var_name)
if show_confirm_value is None:
output.show_confirm = input.show_confirm_default
elif show_confirm_value is not None and show_confirm_value.lower() == "false":
output.show_confirm = False
else:
output.show_confirm = True
first_goal_value = os.getenv("AGENT_GOAL")
if first_goal_value is None:
output.multi_goal_mode = False # default to single agent mode if unset
elif (
first_goal_value is not None
and first_goal_value.lower() == "goal_choose_agent_type"
):
output.multi_goal_mode = True
else:
output.multi_goal_mode = False
return output
@activity.defn
async def mcp_tool_activity(
self, tool_name: str, tool_args: Dict[str, Any]
) -> Dict[str, Any]:
"""MCP Tool - now using pooled connections"""
activity.logger.info(f"Executing MCP tool: {tool_name} with args: {tool_args}")
# Extract server definition
server_definition = tool_args.pop("server_definition", None)
if self.mcp_client_manager:
# Use pooled connection
return await self._execute_mcp_tool_pooled(
tool_name, tool_args, server_definition
)
else:
# Fallback to original implementation
return await _execute_mcp_tool(tool_name, tool_args, server_definition)
async def _execute_mcp_tool_pooled(
self,
tool_name: str,
tool_args: Dict[str, Any],
server_definition: MCPServerDefinition | Dict[str, Any] | None,
) -> Dict[str, Any]:
"""Execute MCP tool using pooled client connection"""
activity.logger.info(f"Executing MCP tool with pooled connection: {tool_name}")
# Convert argument types for MCP tools
converted_args = _convert_args_types(tool_args)
try:
start_marker = "```json"
end_marker = "```"
# Get pooled client
client = await self.mcp_client_manager.get_client(server_definition)
json_str = None
# Call the tool using existing client session
result = await client.call_tool(tool_name, arguments=converted_args)
normalized_result = _normalize_result(result)
# Case 1: JSON surrounded by markers
if start_marker in response_content and end_marker in response_content:
json_start = response_content.index(start_marker) + len(start_marker)
json_end = response_content.index(end_marker, json_start)
json_str = response_content[json_start:json_end].strip()
# Case 2: Text with valid JSON
else:
# Try to locate the JSON block by scanning for the first `{` and last `}`
json_start = response_content.find("{")
json_end = response_content.rfind("}")
if json_start != -1 and json_end != -1 and json_start < json_end:
json_str = response_content[json_start : json_end + 1].strip()
# Validate and ensure the extracted JSON is valid
if json_str:
json.loads(json_str) # This will raise an error if the JSON is invalid
return json_str
# If no valid JSON found, raise an error
raise ValueError("Response does not contain valid JSON.")
except json.JSONDecodeError:
# Invalid JSON
print(f"Invalid JSON detected in response: {response_content}")
raise ValueError("Response does not contain valid JSON.")
return {
"tool": tool_name,
"success": True,
"content": normalized_result,
}
except Exception as e:
# Other errors
print(f"Error processing response: {str(e)}")
print(f"Full response: {response_content}")
raise
def get_current_date_human_readable():
"""
Returns the current date in a human-readable format.
Example: Wednesday, January 1, 2025
"""
from datetime import datetime
return datetime.now().strftime("%A, %B %d, %Y")
activity.logger.error(f"MCP tool {tool_name} failed: {str(e)}")
return {
"tool": tool_name,
"success": False,
"error": str(e),
"error_type": type(e).__name__,
}
@activity.defn(dynamic=True)
def dynamic_tool_activity(args: Sequence[RawValue]) -> dict:
async def dynamic_tool_activity(args: Sequence[RawValue]) -> dict:
from tools import get_handler
tool_name = activity.info().activity_type # e.g. "FindEvents"
tool_args = activity.payload_converter().from_payload(args[0].payload, dict)
activity.logger.info(f"Running dynamic tool '{tool_name}' with args: {tool_args}")
# Delegate to the relevant function
handler = get_handler(tool_name)
result = handler(tool_args)
# Check if this is an MCP tool call by looking for server_definition in args
server_definition = tool_args.pop("server_definition", None)
# Optionally log or augment the result
activity.logger.info(f"Tool '{tool_name}' result: {result}")
if server_definition:
# This is an MCP tool call - handle it directly
activity.logger.info(f"Executing MCP tool: {tool_name}")
return await _execute_mcp_tool(tool_name, tool_args, server_definition)
else:
# This is a regular tool - delegate to the relevant function
handler = get_handler(tool_name)
if inspect.iscoroutinefunction(handler):
result = await handler(tool_args)
else:
result = handler(tool_args)
# Optionally log or augment the result
activity.logger.info(f"Tool '{tool_name}' result: {result}")
return result
# MCP Client Activities
def _build_connection(
server_definition: MCPServerDefinition | Dict[str, Any] | None,
) -> Dict[str, Any]:
"""Build connection parameters from MCPServerDefinition or dict"""
if server_definition is None:
# Default to stdio connection with the main server
return {"type": "stdio", "command": "python", "args": ["server.py"], "env": {}}
# Handle both MCPServerDefinition objects and dicts (from Temporal serialization)
if isinstance(server_definition, dict):
return {
"type": server_definition.get("connection_type", "stdio"),
"command": server_definition.get("command", "python"),
"args": server_definition.get("args", ["server.py"]),
"env": server_definition.get("env", {}) or {},
}
return {
"type": server_definition.connection_type,
"command": server_definition.command,
"args": server_definition.args,
"env": server_definition.env or {},
}
def _normalize_result(result: Any) -> Any:
"""Normalize MCP tool result for serialization"""
if hasattr(result, "content"):
# Handle MCP result objects
if hasattr(result.content, "__iter__") and not isinstance(result.content, str):
return [
item.text if hasattr(item, "text") else str(item)
for item in result.content
]
return str(result.content)
return result
def _convert_args_types(tool_args: Dict[str, Any]) -> Dict[str, Any]:
"""Convert string arguments to appropriate types for MCP tools"""
converted_args = {}
for key, value in tool_args.items():
if key == "server_definition":
# Skip server_definition - it's metadata
continue
if isinstance(value, str):
# Try to convert string values to appropriate types
if value.isdigit():
# Convert numeric strings to integers
converted_args[key] = int(value)
elif value.replace(".", "").isdigit() and value.count(".") == 1:
# Convert decimal strings to floats
converted_args[key] = float(value)
elif value.lower() in ("true", "false"):
# Convert boolean strings
converted_args[key] = value.lower() == "true"
else:
# Keep as string
converted_args[key] = value
else:
# Keep non-string values as-is
converted_args[key] = value
return converted_args
async def _execute_mcp_tool(
tool_name: str,
tool_args: Dict[str, Any],
server_definition: MCPServerDefinition | Dict[str, Any] | None,
) -> Dict[str, Any]:
"""Execute an MCP tool with the given arguments and server definition"""
activity.logger.info(f"Executing MCP tool: {tool_name}")
# Convert argument types for MCP tools
converted_args = _convert_args_types(tool_args)
connection = _build_connection(server_definition)
try:
if connection["type"] == "stdio":
# Handle stdio connection
async with _stdio_connection(
command=connection.get("command", "python"),
args=connection.get("args", ["server.py"]),
env=connection.get("env", {}),
) as (read, write):
async with ClientSession(read, write) as session:
# Initialize the session
activity.logger.info(f"Initializing MCP session for {tool_name}")
await session.initialize()
activity.logger.info(f"MCP session initialized for {tool_name}")
# Call the tool
activity.logger.info(
f"Calling MCP tool {tool_name} with args: {converted_args}"
)
try:
result = await session.call_tool(
tool_name, arguments=converted_args
)
activity.logger.info(
f"MCP tool {tool_name} returned result: {result}"
)
except Exception as tool_exc:
activity.logger.error(
f"MCP tool {tool_name} call failed: {type(tool_exc).__name__}: {tool_exc}"
)
raise
normalized_result = _normalize_result(result)
activity.logger.info(f"MCP tool {tool_name} completed successfully")
return {
"tool": tool_name,
"success": True,
"content": normalized_result,
}
elif connection["type"] == "tcp":
# Handle TCP connection (placeholder for future implementation)
raise ApplicationError("TCP connections not yet implemented")
else:
raise ApplicationError(f"Unsupported connection type: {connection['type']}")
except Exception as e:
activity.logger.error(f"MCP tool {tool_name} failed: {str(e)}")
# Return error information
return {
"tool": tool_name,
"success": False,
"error": str(e),
"error_type": type(e).__name__,
}
@asynccontextmanager
async def _stdio_connection(command: str, args: list, env: dict):
"""Create stdio connection to MCP server"""
if stdio_client is None:
raise ApplicationError("MCP client libraries not available")
# Create server parameters
server_params = StdioServerParameters(command=command, args=args, env=env)
async with stdio_client(server_params) as (read, write):
yield read, write
@activity.defn
async def mcp_list_tools(
server_definition: MCPServerDefinition, include_tools: Optional[List[str]] = None
) -> Dict[str, Any]:
"""List available MCP tools from the specified server"""
activity.logger.info(f"Listing MCP tools for server: {server_definition.name}")
connection = _build_connection(server_definition)
try:
if connection["type"] == "stdio":
async with _stdio_connection(
command=connection.get("command", "python"),
args=connection.get("args", ["server.py"]),
env=connection.get("env", {}),
) as (read, write):
async with ClientSession(read, write) as session:
# Initialize the session
await session.initialize()
# List available tools
tools_response = await session.list_tools()
# Process tools based on include_tools filter
tools_info = {}
for tool in tools_response.tools:
# If include_tools is specified, only include those tools
if include_tools is None or tool.name in include_tools:
tools_info[tool.name] = {
"name": tool.name,
"description": tool.description,
"inputSchema": (
tool.inputSchema.model_dump()
if hasattr(tool.inputSchema, "model_dump")
else str(tool.inputSchema)
),
}
activity.logger.info(
f"Found {len(tools_info)} tools for server {server_definition.name}"
)
return {
"server_name": server_definition.name,
"success": True,
"tools": tools_info,
"total_available": len(tools_response.tools),
"filtered_count": len(tools_info),
}
elif connection["type"] == "tcp":
raise ApplicationError("TCP connections not yet implemented")
else:
raise ApplicationError(f"Unsupported connection type: {connection['type']}")
except Exception as e:
activity.logger.error(
f"Failed to list tools for server {server_definition.name}: {str(e)}"
)
return {
"server_name": server_definition.name,
"success": False,
"error": str(e),
"error_type": type(e).__name__,
}


@@ -1,18 +1,18 @@
from fastapi import FastAPI
from typing import Optional
from temporalio.client import Client
from temporalio.exceptions import TemporalError
from temporalio.api.enums.v1 import WorkflowExecutionStatus
from fastapi import HTTPException
from dotenv import load_dotenv
import asyncio
import os
from typing import Optional
from workflows.agent_goal_workflow import AgentGoalWorkflow
from models.data_types import CombinedInput, AgentGoalWorkflowParams
from tools.goal_registry import goal_match_train_invoice, goal_event_flight_invoice
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from shared.config import get_temporal_client, TEMPORAL_TASK_QUEUE
from temporalio.api.enums.v1 import WorkflowExecutionStatus
from temporalio.client import Client
from temporalio.exceptions import TemporalError
from goals import goal_list
from models.data_types import AgentGoalWorkflowParams, CombinedInput
from shared.config import TEMPORAL_TASK_QUEUE, get_temporal_client
from workflows.agent_goal_workflow import AgentGoalWorkflow
app = FastAPI()
temporal_client: Optional[Client] = None
@@ -21,14 +21,14 @@ temporal_client: Optional[Client] = None
load_dotenv()
def get_agent_goal():
def get_initial_agent_goal():
"""Get the agent goal from environment variables."""
goal_name = os.getenv("AGENT_GOAL", "goal_match_train_invoice")
goals = {
"goal_match_train_invoice": goal_match_train_invoice,
"goal_event_flight_invoice": goal_event_flight_invoice,
}
return goals.get(goal_name, goal_event_flight_invoice)
env_goal = os.getenv(
"AGENT_GOAL", "goal_event_flight_invoice"
) # if no goal is set in the env file, default to single agent mode
for listed_goal in goal_list:
if listed_goal.id == env_goal:
return listed_goal
@app.on_event("startup")
@@ -65,7 +65,7 @@ async def get_tool_data():
return {}
# Query the workflow
tool_data = await handle.query("get_tool_data")
tool_data = await handle.query("get_latest_tool_data")
return tool_data
except TemporalError as e:
# Workflow not found; return an empty response
@@ -113,10 +113,36 @@ async def get_conversation_history():
status_code=404, detail="Workflow worker unavailable or not found."
)
# For other Temporal errors, return a 500
raise HTTPException(
status_code=500, detail="Internal server error while querying workflow."
)
if "workflow not found" in error_message:
await start_workflow()
return []
else:
# For other Temporal errors, return a 500
raise HTTPException(
status_code=500, detail="Internal server error while querying workflow."
)
@app.get("/agent-goal")
async def get_agent_goal():
"""Calls the workflow's 'get_agent_goal' query."""
try:
# Get workflow handle
handle = temporal_client.get_workflow_handle("agent-workflow")
# Check if the workflow is completed
workflow_status = await handle.describe()
if workflow_status.status == 2:
# Workflow is completed; return an empty response
return {}
# Query the workflow
agent_goal = await handle.query("get_agent_goal")
return agent_goal
except TemporalError as e:
# Workflow not found; return an empty response
print(e)
return {}
@app.post("/send-prompt")
@@ -124,7 +150,8 @@ async def send_prompt(prompt: str):
# Create combined input with goal from environment
combined_input = CombinedInput(
tool_params=AgentGoalWorkflowParams(None, None),
agent_goal=get_agent_goal(),
agent_goal=get_initial_agent_goal(),
# change to get from workflow query
)
workflow_id = "agent-workflow"
@@ -168,13 +195,12 @@ async def end_chat():
@app.post("/start-workflow")
async def start_workflow():
# Get the configured goal
agent_goal = get_agent_goal()
initial_agent_goal = get_initial_agent_goal()
# Create combined input
combined_input = CombinedInput(
tool_params=AgentGoalWorkflowParams(None, None),
agent_goal=agent_goal,
agent_goal=initial_agent_goal,
)
workflow_id = "agent-workflow"
@@ -186,9 +212,9 @@ async def start_workflow():
id=workflow_id,
task_queue=TEMPORAL_TASK_QUEUE,
start_signal="user_prompt",
start_signal_args=["### " + agent_goal.starter_prompt],
start_signal_args=["### " + initial_agent_goal.starter_prompt],
)
return {
"message": f"Workflow started with goal's starter prompt: {agent_goal.starter_prompt}."
"message": f"Workflow started with goal's starter prompt: {initial_agent_goal.starter_prompt}."
}

BIN
assets/0.2.0_changes.jpeg Normal file

Binary file not shown (size: 398 KiB).

Binary file not shown (size before: 376 KiB, after: 376 KiB).

@@ -1,50 +0,0 @@
{
"$schema": "https://cdn.statically.io/gh/nguyenngoclongdev/cdn/main/schema/v10/terminal-keeper.json",
"theme": "tribe",
"active": "default",
"activateOnStartup": false,
"keepExistingTerminals": false,
"sessions": {
"default": [
[
{
"name": "frontend",
"autoExecuteCommands": true,
"commands": [
"cd frontend && npx vite"
]
},
{
"name": "uvicorn",
"autoExecuteCommands": true,
"commands": [
"poetry run uvicorn api.main:app --reload"
]
}
],
[
{
"name": "agent worker",
"autoExecuteCommands": true,
"commands": [
"poetry run python scripts/run_worker.py"
]
},
{
"name": "trains worker",
"autoExecuteCommands": true,
"commands": [
"poetry run python scripts/run_legacy_worker.py"
]
}
],
{
"name": "trains_api",
"autoExecuteCommands": true,
"commands": [
"poetry run python thirdparty/train_api.py"
]
}
]
}
}


@@ -0,0 +1,20 @@
services:
api:
volumes:
- ./:/app:cached
command: uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload
worker:
volumes:
- ./:/app:cached
command: python scripts/run_worker.py
train-api:
volumes:
- ./:/app:cached
command: python thirdparty/train_api.py
frontend:
volumes:
- ./frontend:/app:cached
command: sh -c "apk update && apk add --no-cache xdg-utils && npm install && npx vite --host 0.0.0.0 --port 5173"

120
docker-compose.yml Normal file

@@ -0,0 +1,120 @@
services:
# Database service
postgresql:
image: postgres:14
container_name: temporal-postgresql
environment:
POSTGRES_USER: temporal
POSTGRES_PASSWORD: temporal
POSTGRES_DB: temporal
volumes:
- postgresql:/var/lib/postgresql/data
networks:
- temporal-network
# Temporal services
temporal:
image: temporalio/auto-setup:1.27.2
container_name: temporal
ports:
- "7233:7233"
environment:
- DB=postgres12
- DB_PORT=5432
- POSTGRES_USER=temporal
- POSTGRES_PWD=temporal
- POSTGRES_SEEDS=postgresql
depends_on:
- postgresql
networks:
- temporal-network
temporal-admin-tools:
image: temporalio/admin-tools:1.27
container_name: temporal-admin-tools
depends_on:
- temporal
environment:
- TEMPORAL_CLI_ADDRESS=temporal:7233
networks:
- temporal-network
temporal-ui:
image: temporalio/ui:2.37.2
container_name: temporal-ui
ports:
- "8080:8080"
environment:
- TEMPORAL_ADDRESS=temporal:7233
- TEMPORAL_CORS_ORIGINS=http://localhost:8080
depends_on:
- temporal
networks:
- temporal-network
api:
build:
context: .
dockerfile: Dockerfile
container_name: temporal-ai-agent-api
ports:
- "8000:8000"
depends_on:
- temporal
networks:
- temporal-network
env_file:
- .env
environment:
- TEMPORAL_ADDRESS=temporal:7233
worker:
build:
context: .
dockerfile: Dockerfile
container_name: temporal-ai-agent-worker
depends_on:
- temporal
env_file:
- .env
environment:
- TEMPORAL_ADDRESS=temporal:7233
command: python scripts/run_worker.py
networks:
- temporal-network
train-api:
build:
context: .
dockerfile: Dockerfile
container_name: temporal-ai-agent-train-api
depends_on:
- temporal
env_file:
- .env
environment:
- TEMPORAL_ADDRESS=temporal:7233
command: python thirdparty/train_api.py
networks:
- temporal-network
frontend:
image: node:18-alpine
container_name: temporal-ai-agent-frontend
working_dir: /app
volumes:
- ./frontend:/app
command: sh -c "apk update && apk add --no-cache xdg-utils && npm install && npx vite --host 0.0.0.0"
ports:
- "5173:5173"
depends_on:
- api
networks:
- temporal-network
networks:
temporal-network:
driver: bridge
volumes:
postgresql:

10
docs/README.md Normal file

@@ -0,0 +1,10 @@
# Documentation Index
- **architecture.md** - Overview of system components and how they interact.
- **architecture-decisions.md** - Rationale behind key design choices.
- **changelog.md** - Project history and notable changes.
- **contributing.md** - How to contribute and run tests.
- **setup.md** - Installation and configuration instructions.
- **testing.md** - Commands for running the test suite.
- **adding-goals-and-tools.md** - Guide to extending the agent with new goals and tools.
- **todo.md** - Planned enhancements and future work.


@@ -0,0 +1,177 @@
# Customizing the Agent
The agent operates in single-agent mode by default, focusing on one specific goal. It also supports an experimental multi-agent mode, where users can have multiple agents, each with its own goal, and can switch back to choosing a new goal after every successfully completed goal (or even mid-goal).
A goal can use two types of tools:
- **Native Tools**: Custom tools implemented directly in the codebase (in `/tools/`)
- **MCP Tools**: External tools accessed via Model Context Protocol (MCP) servers
It may be helpful to review the [architecture](./architecture.md) for a guide and definition of goals, tools, etc.
## Adding a New Goal Category
Goal categories let you pick which groups of goals to show in multi-agent mode. Set them with the `GOAL_CATEGORIES` setting in your .env file.
Even if you don't intend to use the goal in a multi-agent scenario, goal categories are useful for organization and discovery.
1. Pick a unique category name that has some business meaning
2. Use it in your [.env](./.env) file
3. Add to [.env.example](./.env.example)
4. Use it in your Goal definition, see below.
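As a rough illustration of how a category setting can narrow the goals on offer, here is a minimal sketch. It assumes `GOAL_CATEGORIES` holds a comma-separated list of category tags; the `Goal` class, the `goals_for_categories` helper, and the `hr` and `travel` tags are hypothetical stand-ins, not the project's actual code.

```python
import os
from dataclasses import dataclass
from typing import List


@dataclass
class Goal:
    """Hypothetical stand-in for the project's goal type."""

    id: str
    category_tag: str


def goals_for_categories(goals: List[Goal], raw_categories: str) -> List[Goal]:
    """Keep only goals whose category_tag appears in a comma-separated setting."""
    wanted = {c.strip().lower() for c in raw_categories.split(",") if c.strip()}
    return [g for g in goals if g.category_tag.lower() in wanted]


all_goals = [
    Goal(id="goal_hr_schedule_pto", category_tag="hr"),
    Goal(id="goal_event_flight_invoice", category_tag="travel"),
]

# Assumed format in the .env file: GOAL_CATEGORIES=hr,travel
selected = goals_for_categories(all_goals, os.getenv("GOAL_CATEGORIES", "hr"))
print([g.id for g in selected])
```

Matching is case-insensitive and tolerant of stray spaces here, which is a design choice for the sketch rather than documented behavior.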
## Adding a Goal
1. Create a new Python file in the `/goals/` directory (e.g., `goals/my_category.py`) - these files contain descriptions of goals and the tools used to achieve them
2. Pick a name for your goal! (such as "goal_hr_schedule_pto")
3. Fill out the required elements:
- `id`: needs to be the same as the name
- `agent_name`: user-facing name for the agent/chatbot
- `category_tag`: category for the goal
- `agent_friendly_description`: user-facing description of what the agent/chatbot does
- `tools`: the list of **native tools** the goal uses. These are defined in [tools/tool_registry.py](tools/tool_registry.py) as `tool_registry.[name_of_tool]`
Example:
```python
tools=[
tool_registry.current_pto_tool,
tool_registry.future_pto_calc_tool,
tool_registry.book_pto_tool,
]
```
- `mcp_server_definition`: (Optional) MCP server configuration for external tools. Can use predefined configurations from `shared/mcp_config.py` or define custom ones. See [MCP Tools section](#adding-mcp-tools) below.
- `description`: LLM-facing description of the goal that lists all tools (native and MCP) by name and purpose.
- `starter_prompt`: LLM-facing first prompt given to begin the scenario. This field can contain instructions that are different from other goals, like "begin by providing the output of the first tool" rather than waiting on user confirmation. (See [goal_choose_agent_type](tools/goal_registry.py) for an example.)
- `example_conversation_history`: LLM-facing sample conversation/interaction regarding the goal. See the existing goals for how to structure this.
4. Add your new goal to a list variable (e.g., `my_category_goals: List[AgentGoal] = [your_super_sweet_new_goal]`)
5. Import and extend the goal list in `goals/__init__.py` by adding:
- Import: `from goals.my_category import my_category_goals`
- Extend: `goal_list.extend(my_category_goals)`
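Put together, the steps above might look roughly like the sketch below. The `AgentGoal` dataclass here is only a stand-in that mirrors the fields listed above (the project's real class may differ), the tool entries are string placeholders for `tool_registry` objects, and `find_goal` plus all prompt text are made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class AgentGoal:
    """Stand-in with the fields listed above; the project's real class may differ."""

    id: str
    category_tag: str
    agent_name: str
    agent_friendly_description: str
    tools: List[Any]
    description: str
    starter_prompt: str
    example_conversation_history: str
    mcp_server_definition: Optional[Any] = None


goal_hr_schedule_pto = AgentGoal(
    id="goal_hr_schedule_pto",  # same as the variable name
    category_tag="hr",  # hypothetical category
    agent_name="Schedule PTO",
    agent_friendly_description="Check your PTO balance and book time off.",
    tools=["current_pto_tool", "book_pto_tool"],  # placeholders for tool_registry entries
    description=(
        "Help the user gather args for these tools in order: "
        "1. CurrentPTO: Tell the user how much PTO they currently have "
        "2. BookPTO: Book PTO "
    ),
    starter_prompt="Welcome the user and ask how you can help with PTO.",
    example_conversation_history="user: I'd like to book some PTO\nagent: Sure! ...",
)

# Step 4: collect the goals defined in this file.
my_category_goals: List[AgentGoal] = [goal_hr_schedule_pto]

# Step 5: goals/__init__.py extends the shared list.
goal_list: List[AgentGoal] = []
goal_list.extend(my_category_goals)


def find_goal(goal_id: str) -> Optional[AgentGoal]:
    """Hypothetical helper: look a goal up by id."""
    for listed_goal in goal_list:
        if listed_goal.id == goal_id:
            return listed_goal
    return None
```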
## Adding Native Tools
Native tools are custom implementations that run directly in your codebase. Use these for business logic specific to your application.
### Note on Optional Tools
Tools can be optional. You can indicate this in the tool listing of the goal description (see the section above on adding a goal) by adding something like, "This step is optional and can be skipped by moving to the next tool." Here is an example from an older iteration of the `goal_hr_schedule_pto` goal, when it was going to have an optional step to check for existing calendar conflicts:
```
description="Help the user gather args for these tools in order: "
"1. CurrentPTO: Tell the user how much PTO they currently have "
"2. FuturePTO: Tell the user how much PTO they will have as of the prospective date "
"3. CalendarConflict: Tell the user what conflicts if any exist around the prospective date on a list of calendars. This step is optional and can be skipped by moving to the next tool. "
"4. BookPTO: Book PTO "
```
Tools should return meaningful information and be failsafe, returning a useful result for whatever input they receive.
If you're using a local data approach like the files in [./tools/data/](./tools/data/), it's good to document how to set them up to get a good result in the tool-specific [setup](./setup.md).
### Add to Tool Registry
1. Open [/tools/tool_registry.py](tools/tool_registry.py) - this file contains the mapping of tool names to tool definitions (so the AI understands how to use them)
2. Define the tool
- `name`: name of the tool, as defined in the goal description's list of tools. The registry variable should closely follow that name: if the description lists "CurrentPTO" as a tool, define it here as `current_pto_tool`.
- `description`: LLM-facing description of tool
- `arguments`: These are the _input_ arguments to the tool. Each input argument should be defined as a [ToolArgument](./models/tool_definitions.py). Tools don't have to have arguments but the arguments list has to be declared. If the tool you're creating doesn't have inputs, define arguments as `arguments=[]`
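A registry entry could be sketched as follows. The two dataclasses are simplified stand-ins for the project's `ToolDefinition` and `ToolArgument` types, the sketch assumes the `name` field carries the tool name used in the goal description, and the `email` argument and the `list_holidays_tool` entry are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToolArgument:
    """Stand-in for models/tool_definitions.py ToolArgument."""

    name: str
    type: str
    description: str


@dataclass
class ToolDefinition:
    """Stand-in for the project's tool definition type."""

    name: str
    description: str
    arguments: List[ToolArgument]


current_pto_tool = ToolDefinition(
    name="CurrentPTO",  # assumed to match the name listed in the goal description
    description="Find how much PTO a user currently has accrued.",
    arguments=[
        ToolArgument(
            name="email",  # hypothetical argument
            type="string",
            description="Email address of the user",
        ),
    ],
)

# A tool without inputs still declares an (empty) arguments list.
list_holidays_tool = ToolDefinition(
    name="ListHolidays",  # hypothetical tool
    description="List company holidays for the current year.",
    arguments=[],
)
```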
### Create Each Native Tool Implementation
- The tools themselves are defined in their own files in `/tools` - you can add a subfolder to organize them, see the hr tools for an example.
- The file name and function name will be the same as each other and should also be the same as the name of the tool, without "tool" - so `current_pto_tool` would be `current_pto.py` with a function named `current_pto` within it.
- The function should have `args: dict` as the input and also return a `dict`
- The return dict should match the output format you specified in the goal's `example_conversation_history`
- Tools are where the user input and model output become deterministic. Add validation here to make sure what the system is doing is valid and acceptable
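For instance, a `tools/current_pto.py` following these conventions might look like the sketch below. The sample data, the `email` argument, and the shape of the returned dict are made up for illustration; a real tool would read from its own data source and match the format used in the goal's `example_conversation_history`.

```python
# Made-up sample data for illustration only.
_PTO_HOURS = {"alice@example.com": 80}


def current_pto(args: dict) -> dict:
    """Return the PTO balance for the given email, validating input first."""
    email = args.get("email")
    if not isinstance(email, str) or "@" not in email:
        # Fail safely with a useful message instead of raising.
        return {"error": "A valid 'email' argument is required."}

    normalized = email.strip().lower()
    return {
        "email": normalized,
        "pto_hours_available": _PTO_HOURS.get(normalized, 0),
    }
```

Returning an error dict rather than raising keeps the result usable by the LLM, in line with the advice that tools should be failsafe.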
### Add to `tools/__init__.py` and the tool get_handler()
- In [tools/__init__.py](./tools/__init__.py), add an import statement for each new native tool as well as an applicable return statement in `get_handler`. The tool name here should match the tool name as described in the goal's `description` field.
Example:
```python
if tool_name == "CurrentPTO":
return current_pto
```
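A slightly fuller sketch of the lookup, together with a caller that handles both regular and `async` handlers, could look like this. The two handlers are trivial placeholders, `run_tool` is a hypothetical helper, and raising `ValueError` for unknown names is an assumption rather than the project's confirmed behavior.

```python
import asyncio
import inspect
from typing import Callable


def current_pto(args: dict) -> dict:
    """Placeholder synchronous tool."""
    return {"tool": "CurrentPTO", "ok": True}


async def book_pto(args: dict) -> dict:
    """Placeholder asynchronous tool."""
    return {"tool": "BookPTO", "ok": True}


def get_handler(tool_name: str) -> Callable:
    """Map the tool name from the goal description to its function."""
    if tool_name == "CurrentPTO":
        return current_pto
    if tool_name == "BookPTO":
        return book_pto
    raise ValueError(f"Unknown tool: {tool_name}")  # assumed error handling


async def run_tool(tool_name: str, tool_args: dict) -> dict:
    """Call the handler, awaiting it only if it is a coroutine function."""
    handler = get_handler(tool_name)
    if inspect.iscoroutinefunction(handler):
        return await handler(tool_args)
    return handler(tool_args)


print(asyncio.run(run_tool("CurrentPTO", {})))
```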
### Update workflow_helpers.py
- Add your new native tool to the static tools list in [workflows/workflow_helpers.py](workflows/workflow_helpers.py) so it's correctly identified as a native tool rather than an MCP tool.
## Adding MCP Tools
MCP (Model Context Protocol) tools are external tools provided by MCP servers. They're useful for integrating with third-party services like Stripe, databases, or APIs without implementing custom code.
### Configure MCP Server Definition
You can either use predefined MCP server configurations from `shared/mcp_config.py` or define custom ones.
#### Using Predefined Configurations
```python
from shared.mcp_config import get_stripe_mcp_server_definition
# In your goal definition:
mcp_server_definition=get_stripe_mcp_server_definition(included_tools=["list_products", "create_customer"])
```
#### Custom MCP Server Definition
Add an `mcp_server_definition` to your goal:
```python
mcp_server_definition=MCPServerDefinition(
name="stripe-mcp",
command="npx",
args=[
"-y",
"@stripe/mcp",
"--tools=all",
f"--api-key={os.getenv('STRIPE_API_KEY')}",
],
env=None,
included_tools=[
"list_products",
"list_prices",
"create_customer",
"create_invoice",
"create_payment_link",
],
)
```
### MCP Tool Configuration
- `name`: Identifier for the MCP server
- `command`: Command to start the MCP server (e.g., "npx", "python")
- `args`: Arguments to pass to the command
- `env`: Environment variables for the server (optional)
- `included_tools`: List of specific tools to use from the server (optional - if omitted, all tools are included)
### How MCP Tools Work
- MCP tools are automatically loaded when the workflow starts
- They're dynamically converted to `ToolDefinition` objects
- The system automatically routes MCP tool calls to the appropriate MCP server
- No additional code implementation needed - just configuration
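To give a feel for the conversion step, here is a minimal sketch that turns a tool listing (name, description, and input schema per tool) into tool definitions. The dataclasses are simplified stand-ins, `mcp_tools_to_definitions` is a hypothetical helper, and the sample schema is invented; the project's actual conversion may differ.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ToolArgument:
    name: str
    type: str
    description: str


@dataclass
class ToolDefinition:
    name: str
    description: str
    arguments: List[ToolArgument]


def mcp_tools_to_definitions(
    tools_info: Dict[str, Dict[str, Any]],
) -> List[ToolDefinition]:
    """Build one ToolDefinition per listed MCP tool."""
    definitions = []
    for info in tools_info.values():
        schema = info.get("inputSchema")
        # Only a dict-shaped JSON schema can be inspected for arguments.
        properties = schema.get("properties", {}) if isinstance(schema, dict) else {}
        arguments = [
            ToolArgument(
                name=arg_name,
                type=str(spec.get("type", "string")),
                description=str(spec.get("description", "")),
            )
            for arg_name, spec in properties.items()
            if isinstance(spec, dict)
        ]
        definitions.append(
            ToolDefinition(
                name=info["name"],
                description=info.get("description") or "",
                arguments=arguments,
            )
        )
    return definitions


# Invented sample listing for illustration.
sample_tools = {
    "list_products": {
        "name": "list_products",
        "description": "List products",
        "inputSchema": {
            "type": "object",
            "properties": {
                "limit": {"type": "integer", "description": "Max results"}
            },
        },
    },
    "create_customer": {
        "name": "create_customer",
        "description": None,
        "inputSchema": "unparsed schema",
    },
}

for definition in mcp_tools_to_definitions(sample_tools):
    print(definition.name, [a.name for a in definition.arguments])
```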
## Tool Confirmation
There are three ways to manage confirmation of tool runs:
1. Arguments confirmation box - confirm tool arguments and execution with a button click
- Can be disabled by env setting: `SHOW_CONFIRM=FALSE`
2. Soft prompt confirmation: ask the model to prompt for confirmation in the goal definition, e.g. “Are you ready to be invoiced for the total cost of the train tickets?” (see the [goal_registry](./tools/goal_registry.py)).
3. Hard confirmation requirement as a tool argument. See for example the PTO Scheduling Tool:
```Python
ToolArgument(
name="userConfirmation",
type="string",
description="Indication of user's desire to book PTO",
),
```
If you really want to wait for user confirmation, use option #3: it records the confirmation on the workflow (as a Signal) rather than relying on the LLM to probably get it right.
I recommend exploring all three. For a demo, I would decide if you want the Arguments confirmation in the UI, and if not I'd generally go with option #2 but use #3 for tools that make business sense to confirm, e.g. those tools that take action/write data.
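On the tool side, a hard confirmation check can be as simple as the sketch below. The accepted confirmation values and the returned fields are assumptions made for illustration, not the project's actual PTO tool.

```python
# Assumed set of answers that count as a confirmation.
_AFFIRMATIVE = {"yes", "y", "confirm", "confirmed"}


def book_pto(args: dict) -> dict:
    """Book PTO only when the userConfirmation argument is affirmative."""
    confirmation = str(args.get("userConfirmation", "")).strip().lower()
    if confirmation not in _AFFIRMATIVE:
        return {
            "status": "not_booked",
            "reason": "User confirmation is required before booking PTO.",
        }
    return {"status": "booked"}
```

Because the check lives in the tool rather than in the prompt, a missing or ambiguous confirmation can never result in a booking.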
## Add a Goal & Tools Checklist
### For All Goals:
- [ ] Create goal file in `/goals/` directory (e.g., `goals/my_category.py`)
- [ ] Add goal to the category's goal list in the file
- [ ] Import and extend the goal list in `goals/__init__.py`
- [ ] If a new category, add Goal Category to [.env](./.env) and [.env.example](./.env.example)
### For Native Tools:
- [ ] Add native tools to [tool_registry.py](tools/tool_registry.py)
- [ ] Implement tool functions in `/tools/` directory
- [ ] Add tools to [tools/__init__.py](tools/__init__.py) in the `get_handler()` function
- [ ] Add tool names to static tools list in [workflows/workflow_helpers.py](workflows/workflow_helpers.py)
### For MCP Tools:
- [ ] Add `mcp_server_definition` to your goal configuration (use `shared/mcp_config.py` for common servers)
- [ ] Ensure MCP server is available and properly configured
- [ ] Set required environment variables (API keys, etc.)
- [ ] Test MCP server connectivity before running the agent
- [ ] If creating new MCP server configs, add them to `shared/mcp_config.py` for reuse
And that's it! Happy AI Agent building!


@@ -0,0 +1,33 @@
# Architecture Decisions
This documents some of the "why" behind the [architecture](./architecture.md).
## AI Models
We wanted to have flexibility to use different models, because this space is changing rapidly and models get better regularly.
We also wanted to let you pick your model of choice. The system is designed to make swapping models simple. To learn how, check out the [setup guide](./setup.md).
## Temporal
We asked one of the AI models used in this demo to answer this question (lightly edited):
### Reliability and State Management:
Temporal ensures durability and fault tolerance, which are critical for agentic AI systems that involve long-running, complex workflows. For example, it preserves application state across failures, allowing AI agents to resume from where they left off without losing progress. Major AI companies use this for research experiments and agentic flows, where reliability is essential for continuous exploration.
### Handling Complex, Dynamic Workflows:
Agentic AI often involves unpredictable, multi-step processes like web crawling or data searching. Temporal's workflow orchestration simplifies managing these tasks by abstracting complexity, providing features like retries, timeouts, and signals/queries. Temporal makes it simple to observe complex experiments and deep searches and to resume them after failure.
### Scalability and Speed:
Temporal enables rapid development and scaling, crucial for AI systems handling large-scale experiments or production workloads. AI model deployment and SRE teams use it to get code to production quickly with scale as a focus, while research teams can (and do!) run hundreds of experiments daily. Temporal customers report a significant reduction in development time (e.g., 20 weeks to 2 weeks for a feature).
### Observability and Debugging:
Agentic AI systems need insight into where processes succeed or fail. Temporal provides end-to-end visibility and durable workflow history, which Temporal customers are using to track agentic flows and understand failure points.
### Simplified Error Handling:
Temporal abstracts failure management (e.g., retries, rollbacks) so developers can focus on AI logic rather than "plumbing" code. This is vital for agentic AI, where external interactions (e.g., APIs, data sources) are prone to failure.
### Flexibility for Experimentation:
For research-heavy agentic AI, Temporal supports dynamic, code-first workflows and easy integration of new signals/queries, aligning with researchers' needs to iterate quickly on experimental paths.
In essence, Temporal's value lies in its ability to make agentic AI systems more reliable, scalable, and easier to develop by handling the underlying complexity of distributed workflows for both research and applied AI tasks.
Temporal was built to solve the problems of distributed computing, including scalability, reliability, security, visibility, and complexity. Agentic AI systems are complex distributed systems, so Temporal should fit well. Scaling, security, and productionalization are major pain points in March 2025 for building agentic systems.
In this system Temporal lets you:
- Orchestrate interactions across distributed data stores and tools <br />
- Hold state, potentially over long periods of time <br />
- Self-heal and retry until the (probabilistic) LLM returns valid data <br />
- Support human intervention such as approvals <br />
- Process data retrieval and tool use in parallel for efficiency <br />
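
The retry-until-valid item above can be pictured with a small plain-Python sketch. It is only an illustration of the idea, not Temporal's API: in this system the retrying is done by Temporal, and the helper name `call_until_valid_json` and the stand-in "LLM" are hypothetical.

```python
import json
from typing import Callable, Optional


def call_until_valid_json(call_llm: Callable[[], str], max_attempts: int = 5) -> dict:
    """Call `call_llm` until it returns a JSON object, or give up."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = call_llm()
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as err:
            last_error = err  # not JSON at all: try again
            continue
        if isinstance(parsed, dict):
            return parsed
        last_error = ValueError("expected a JSON object")  # valid JSON, wrong shape
    raise RuntimeError(f"no valid response after {max_attempts} attempts") from last_error


# Stand-in "LLM" (hypothetical): two bad replies, then a valid one.
replies = iter(["not json", "[1, 2]", '{"next": "confirm"}'])
result = call_until_valid_json(lambda: next(replies))
print(result)
```

With Temporal, the loop itself would typically disappear: the Activity raises on invalid output and the retry policy re-runs it.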

74
docs/architecture.md Normal file

@@ -0,0 +1,74 @@
# Elements
These are the main elements of this system. See [architecture decisions](./architecture-decisions.md) for information behind these choices.
In this document we will explain each element and their interactions, and then connect them all at the end.
<img src="./assets/Architecture_elements.png" width="50%" alt="Architecture Elements">
## Workflow
This is a [Temporal Workflow](https://docs.temporal.io/workflows) - a durable, straightforward description of the process to be executed. See [agent_goal_workflow.py](./workflows/agent_goal_workflow.py).
Temporal is used to make the process scalable, durable, reliable, secure, and visible.
### Workflow Responsibilities:
- Orchestrates interactive loops:
  - LLM Loop: Prompts LLM, durably executes LLM, stores responses
  - Interactive Loop: Elicits responses from input (in our case a human) and validates input responses
  - Tool Execution Loop: Durably executes Tools
- Keeps record of all interactions ([Signals, Queries, Updates](https://docs.temporal.io/develop/python/message-passing))
- Handles failures gracefully
- Input, LLM and Tool interaction history stored for debugging and analysis
## Activities
These are [Temporal Activities](https://docs.temporal.io/activities). They are defined as simple functions; behind the scenes, they run asynchronously (event-driven) and are retried automatically. Activities durably execute Tools and the LLM. See [a sample activity](./activities/tool_activities.py).
## Tools
Tools define the capabilities of the system. They are simple Python functions (could be in any language as Temporal supports multiple languages).
They are executed by Temporal Activities. They are “just code”, so they can connect to any API or system. They are also where the deterministic business logic lives: you can validate and retry actions using code you write.
Failures are handled gracefully by Temporal.
Activities + Tools turn the probabilistic input from the user and LLM into deterministic action.
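
As a rough illustration of a Tool as "just code" with deterministic validation, here is a minimal sketch. The tool name `book_pto`, its argument names, and its return shape are hypothetical and are not taken from this repository.

```python
from datetime import date


def book_pto(args: dict) -> dict:
    """Hypothetical tool: validate the arguments, then return a deterministic result."""
    try:
        start = date.fromisoformat(args["start_date"])
        end = date.fromisoformat(args["end_date"])
    except (KeyError, ValueError) as err:
        # Missing key or malformed date: report it instead of acting.
        return {"status": "error", "message": f"invalid or missing date: {err}"}
    if end < start:
        return {"status": "error", "message": "end_date is before start_date"}
    # Inclusive day count between the two dates.
    return {"status": "booked", "days": (end - start).days + 1}


ok = book_pto({"start_date": "2025-07-01", "end_date": "2025-07-03"})
bad = book_pto({"start_date": "2025-07-03", "end_date": "2025-07-01"})
print(ok)
print(bad)
```

Whatever the LLM proposes, the same arguments always produce the same outcome here, which is the point of keeping this logic in code rather than in a prompt.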
## Prompts
Prompts are where the instructions to the LLM are. Prompts are made up of initial instructions, goal instructions, and tool instructions.
See [agent prompts](./prompts/agent_prompt_generators.py) and [goal & tool prompts](./tools/goal_registry.py).
This is where you can add probabilistic business logic to:
- control process flow
- describe what to do
- give examples of interactions
- give instruction and validation for the LLM
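
To make the three-part prompt structure concrete, here is a minimal sketch of assembling initial, goal, and tool instructions into one prompt. The function `build_prompt`, the section labels, and the sample strings are assumptions for illustration; the real generators live in the files linked above.

```python
def build_prompt(initial: str, goal: str, tools: dict) -> str:
    """Join initial, goal, and tool instructions into a single prompt string."""
    tool_lines = [f"- {name}: {description}" for name, description in tools.items()]
    sections = [
        initial,
        "Goal:\n" + goal,
        "Tools:\n" + "\n".join(tool_lines),
    ]
    return "\n\n".join(sections)


prompt = build_prompt(
    initial="You are an agent. Respond only with JSON.",
    goal="Help the user check their PTO balance.",
    tools={"current_pto": "Return the PTO hours available for an email address."},
)
print(prompt)
```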
## LLM
Probabilistic execution: it will _probably_ do what you tell it to do.
Turns the guidance from the prompts (see [agent prompts](./prompts/agent_prompt_generators.py) and [goal prompts](./tools/goal_registry.py)) into
You have a choice of providers - see [setup](./setup.md).
The LLM:
- Drives toward the initial Goal and any subsequent Goals selected by user
- Decides what to do based on input, such as:
  - Validates user input for Tools
  - Decides when to execute Tools
  - Decides on next step for Goal
- Formats input and interprets output for Tools
- Is executed by Temporal Activities
  - API failures and logical failures are handled transparently
## Interaction
Interaction is managed with Temporal Signals and Queries. These are durably stored in Workflow History.
History can be used for analysis and debugging. It's all “just code” so it's easy to add new Signals and Queries.
Input can be very dynamic; it just needs to be serializable.
The Workflow executes the Interaction Loop: gathering input, validating input, and providing a response:
![Interaction Loop](./assets/interaction_loop.png)
Here's a more detailed example for gathering inputs for Tools:
![Tool Gathering](./assets/argument_gathering_cycle.png)
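
The argument-gathering cycle can be sketched in plain Python as follows. This is a simplified model under stated assumptions: each user turn is represented as a dict of already-extracted arguments, and the names `missing_args` and `gather_arguments` are hypothetical. In the real system the LLM does the extraction and the Workflow holds the state.

```python
def missing_args(required, collected):
    """Return the required argument names that have no usable value yet."""
    return [name for name in required if collected.get(name) in (None, "")]


def gather_arguments(required, user_turns):
    """Merge arguments turn by turn; stop as soon as nothing is missing."""
    collected = {}
    for turn in user_turns:
        collected.update(turn)
        if not missing_args(required, collected):
            break  # ready to confirm and run the tool
    return collected, missing_args(required, collected)


required = ["origin", "destination", "date"]
turns = [{"origin": "SYD"}, {"destination": "AKL", "date": "2025-08-01"}]
collected, still_missing = gather_arguments(required, turns)
print(collected, still_missing)
```

When `still_missing` is non-empty after the last turn, the agent would ask the user for those values and continue the loop.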
# Architecture Model
Now that we have the pieces and what they do, here is a more complete diagram of how the pieces work together:
![Architecture](./assets/ai_agent_architecture_model.png "Architecture Model")
# Adding features
Want to add more Goals and Tools? See [adding goals and tools](./adding-goals-and-tools.md). Have fun!


30
docs/changelog.md Normal file

@@ -0,0 +1,30 @@
# Changelog
All notable changes to this project will be documented in this file.
## [0.2.0] - 2025-04-24
![0.2.0 Changes Screenshot](./assets/0.2.0_changes.jpeg)
### Added
- **Multigoal agent architecture** with dynamic goal switching (`goal_choose_agent_type`, `ListAgents`, `ChangeGoal`).
- See [the architecture guide](./architecture.md) and [setup guide](./setup.md).
- **New goal categories & agents**: HR PTO scheduling/checking, paycheck integration, Financial (balances, money movement, loan application), Ecommerce order tracking.
- See [the guide for adding goals and tools](./adding-goals-and-tools.md).
- **Force Confirmation**: `SHOW_CONFIRM` will show a confirmation box before allowing the agent to run a tool.
- **Grok (`x.ai`) LLM provider** support via `GROK_API_KEY`.
- Extensive **docs**: `setup.md`, `architecture.md`, `architecture-decisions.md`, `adding-goals-and-tools.md`, plus new diagrams & assets.
### Changed
- **UI Confirmation Box** is less 'debug' looking and prettier.
- Package renamed to **`temporal_AI_agent`** and version bumped to **0.2.0** in `pyproject.toml`.
- Environment variables changed (see `.env_example`): (`RAPIDAPI_HOST_*`, `AGENT_GOAL` defaults, `GOAL_CATEGORIES`, `SHOW_CONFIRM`, `FIN_START_REAL_WORKFLOW`).
## [0.1.0] - 2025-01-04
### Added
- **Initial release** of the Temporal AI Agent demo.
- **Single goal agent** architecture with a single goal and agent type.
- This is the agent demoed in the [YouTube video](https://www.youtube.com/watch?v=GEXllEH2XiQ).
[0.2.0]: https://github.com/temporal-community/temporal-ai-agent/pull/29

106
docs/contributing.md Normal file

@@ -0,0 +1,106 @@
# Contributing to the Temporal AI Agent Project
This document provides guidelines for contributing to `temporal-ai-agent`. All setup and installation instructions can be found in [setup.md](./setup.md).
## Getting Started
### Code Style & Formatting
We use `black` for code formatting and `isort` for import sorting to maintain a consistent codebase.
- **Format code:**
```bash
poetry run poe format
```
Or manually:
```bash
poetry run black .
poetry run isort .
```
Please format your code before committing.
### Linting & Type Checking
We use `mypy` for static type checking and other linters configured via `poe the poet`.
- **Run linters and type checks:**
```bash
poetry run poe lint
```
Or manually for type checking:
```bash
poetry run mypy --check-untyped-defs --namespace-packages .
```
Ensure all linting and type checks pass before submitting a pull request.
## Testing
Comprehensive testing is crucial for this project. We use `pytest` and Temporal's testing framework.
- **Install test dependencies** (if not already done with `poetry install --with dev`):
```bash
poetry install --with dev
```
- **Run all tests:**
```bash
poetry run pytest
```
- **Run tests with time-skipping (recommended for faster execution, especially in CI):**
```bash
poetry run pytest --workflow-environment=time-skipping
```
For detailed information on test categories, running specific tests, test environments, coverage, and troubleshooting, please refer to:
- [testing.md](./testing.md) (Quick Start and overview)
- [tests/README.md](../tests/README.md) (Comprehensive guide, patterns, and best practices)
**Ensure all tests pass before submitting a pull request.**
## Making Changes
### Adding New Tools or Goals
If you're looking to extend the agent's capabilities:
1. Create your tool implementation in the `tools/` directory.
2. Register your tool and associate it with relevant goals.
For detailed instructions, please see:
- [Agent Customization in AGENTS.md](../AGENTS.md#agent-customization)
- [Adding Goals and Tools Guide](./adding-goals-and-tools.md)
### General Code Changes
- Follow the existing code style and patterns.
- Ensure any new code is well-documented with comments.
- Write new tests for new functionality or bug fixes.
- Update existing tests if necessary.
## Submitting Contributions
### Pull Requests
When you're ready to submit your changes:
1. Push your branch to the remote repository.
2. Open a Pull Request (PR) against the `main` branch.
3. **Describe your changes:** Clearly explain what you changed and why. Reference any related issues.
4. **Ensure tests pass:** All CI checks, including tests and linters, must pass. The command `poetry run pytest --workflow-environment=time-skipping` is a good one to run locally.
5. **Request review:** Request a review from one or more maintainers.
## Reporting Bugs
If you encounter a bug, please:
1. **Search existing issues:** Check if the bug has already been reported.
2. **Open a new issue:** If not, create a new issue.
- Provide a clear and descriptive title.
- Include steps to reproduce the bug.
- Describe the expected behavior and what actually happened.
- Provide details about your environment (OS, Python version, Temporal server version, etc.).
- Include any relevant logs or screenshots.
## Suggesting Enhancements
We welcome suggestions for new features or improvements!
1. **Search existing issues/discussions:** See if your idea has already been discussed.
2. **Open a new issue:**
- Use a clear and descriptive title.
- Provide a detailed explanation of the enhancement and its benefits.
- Explain the use case or problem it solves.
- Include any potential implementation ideas if you have them.
## Key Resources
- **Project Overview**: [README.md](../README.md)
- **Detailed Contribution & Development Guide**: [AGENTS.md](../AGENTS.md)
- **Setup Instructions**: [setup.md](./setup.md)
- **Comprehensive Testing Guide**: [testing.md](./testing.md) and [tests/README.md](../tests/README.md)
- **System Architecture**: [architecture.md](./architecture.md)
- **Architecture Decisions**: [architecture-decisions.md](./architecture-decisions.md)
- **Customizing Agent Tools and Goals**: [adding-goals-and-tools.md](./adding-goals-and-tools.md)
- **To-Do List / Future Enhancements**: [todo.md](./todo.md)

336
docs/setup.md Normal file

@@ -0,0 +1,336 @@
# Setup Guide
## Initial Configuration
This application uses `.env` files for configuration. Copy the [.env.example](.env.example) file to `.env` and update the values:
```bash
cp .env.example .env
```
Then add API keys and other configuration as needed.
If you want to show confirmations/enable the debugging UI that shows tool args, set
```bash
SHOW_CONFIRM=True
```
We recommend setting this to `False` in most cases, as it can clutter the conversation with confirmation messages.
### Quick Start with Makefile
We've provided a Makefile to simplify the setup and running of the application. Here are the main commands:
```bash
# Initial setup
make setup # Creates virtual environment and installs dependencies
make setup-venv # Creates virtual environment only
make install # Installs all dependencies
# Running the application
make run-worker # Starts the Temporal worker
make run-api # Starts the API server
make run-frontend # Starts the frontend development server
# Additional services
make run-train-api # Starts the train API server
make run-legacy-worker # Starts the legacy worker
make run-enterprise # Builds and runs the enterprise .NET worker
# Development environment setup
make setup-temporal-mac # Installs and starts Temporal server on Mac
# View all available commands
make help
```
### Manual Setup (Alternative to Makefile)
If you prefer to run commands manually, see the sections below for detailed instructions on setting up the backend, frontend, and other components.
### Agent Goal Configuration
The agent can be configured to pursue different goals using the `AGENT_GOAL` environment variable in your `.env` file.
**Single Agent Mode (Default)**
By default, the agent operates in single-agent mode using a specific goal. If unset, the default is `goal_event_flight_invoice`.
To set a specific single goal:
```bash
AGENT_GOAL=goal_event_flight_invoice
```
**Multi-Agent Mode (Experimental)**
The agent also supports an experimental multi-agent mode where users can choose between different agent types during the conversation. To enable this mode:
```bash
AGENT_GOAL=goal_choose_agent_type
```
When using multi-agent mode, you can control which agent categories are available using `GOAL_CATEGORIES` in your `.env` file. If unset, all categories are shown. Available categories include `hr`, `travel-flights`, `travel-trains`, `fin`, `ecommerce`, `mcp-integrations`, and `food`.
We recommend starting with `fin`. The example below enables `fin` alongside the HR and travel categories:
```bash
GOAL_CATEGORIES=hr,travel-flights,travel-trains,fin
```
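
As a sketch of how a comma-separated `GOAL_CATEGORIES` value could be interpreted, consider the following. The helper names and the goal-to-category mapping are assumptions for illustration, as is the case-insensitive matching; the application's own parsing may differ.

```python
def parse_goal_categories(raw):
    """Return a set of category names, or None when unset (meaning: show all)."""
    if not raw or not raw.strip():
        return None
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def visible_goals(goals, raw):
    """Filter a {goal_name: category} mapping by the configured categories."""
    wanted = parse_goal_categories(raw)
    return sorted(
        goal for goal, category in goals.items()
        if wanted is None or category in wanted
    )


# Hypothetical mapping of goal names to categories.
goals = {
    "goal_fin_move_money": "fin",
    "goal_fin_loan_application": "fin",
    "goal_food_ordering": "food",
}
print(visible_goals(goals, "fin"))
print(visible_goals(goals, None))
```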
**Note:** Multi-agent mode is experimental and allows switching between different agents mid-conversation, but single-agent mode provides a more focused experience.
MCP (Model Context Protocol) tools are available for enhanced integration with external services. See the [MCP Tools Configuration](#mcp-tools-configuration) section for setup details.
See the section Goal-Specific Tool Configuration below for tool configuration for specific goals.
### LLM Configuration
Note: We recommend using OpenAI's GPT-4o or Claude 3.5 Sonnet for the best results. There can be significant differences in performance and capabilities between models, especially for complex tasks.
The agent uses LiteLLM to interact with various LLM providers. Configure the following environment variables in your `.env` file:
- `LLM_MODEL`: The model to use (e.g., "openai/gpt-4o", "anthropic/claude-3-sonnet", "google/gemini-pro", etc.)
- `LLM_KEY`: Your API key for the selected provider
- `LLM_BASE_URL`: (Optional) Custom base URL for the LLM provider. Useful for:
- Using Ollama with a custom endpoint
- Using a proxy or custom API gateway
- Testing with different API versions
LiteLLM will automatically detect the provider based on the model name. For example:
- For OpenAI models: `openai/gpt-4o` or `openai/gpt-3.5-turbo`
- For Anthropic models: `anthropic/claude-3-sonnet`
- For Google models: `google/gemini-pro`
- For Ollama models: `ollama/mistral` (requires `LLM_BASE_URL` set to your Ollama server)
Example configurations:
```bash
# For OpenAI
LLM_MODEL=openai/gpt-4o
LLM_KEY=your-api-key-here
# For Anthropic
LLM_MODEL=anthropic/claude-3-sonnet
LLM_KEY=your-api-key-here
# For Ollama with custom URL
LLM_MODEL=ollama/mistral
LLM_BASE_URL=http://localhost:11434
```
For a complete list of supported models and providers, visit the [LiteLLM documentation](https://docs.litellm.ai/docs/providers).
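
The three variables above could be read into a settings dictionary along these lines. This is a minimal sketch: the function name `llm_settings` and the output keys (`provider`, `api_key`, `base_url`) are assumptions chosen for illustration, not the application's actual code.

```python
import os


def llm_settings(env=os.environ) -> dict:
    """Collect LLM settings from environment-style variables."""
    model = env.get("LLM_MODEL", "").strip()
    if not model:
        raise ValueError("LLM_MODEL is not set")
    # The provider is the prefix before the first slash, e.g. "openai/gpt-4o".
    settings = {"model": model, "provider": model.split("/", 1)[0]}
    if env.get("LLM_KEY"):
        settings["api_key"] = env["LLM_KEY"]
    if env.get("LLM_BASE_URL"):
        settings["base_url"] = env["LLM_BASE_URL"]  # optional override
    return settings


settings = llm_settings(
    {"LLM_MODEL": "ollama/mistral", "LLM_BASE_URL": "http://localhost:11434"}
)
print(settings)
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real environment.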
## Configuring Temporal Connection
By default, this application will connect to a local Temporal server (`localhost:7233`) in the default namespace, using the `agent-task-queue` task queue. You can override these settings in your `.env` file.
### Use Temporal Cloud
See [.env.example](.env.example) for details on connecting to Temporal Cloud using mTLS or API key authentication.
[Sign up for Temporal Cloud](https://temporal.io/get-cloud)
### Use a local Temporal Dev Server
On a Mac:
```bash
brew install temporal
temporal server start-dev
```
See the [Temporal documentation](https://learn.temporal.io/getting_started/python/dev_environment/) for other platforms.
You can also run a local Temporal server using Docker Compose. See the [Docker](#docker) section below.
## Running the Application
### Docker
- All services are defined in `docker-compose.yml` (includes a Temporal server).
- **Dev overrides** (mounted code, live-reload commands) live in `docker-compose.override.yml` and are **auto-merged** on `docker compose up`.
- To start **development** mode (with hot-reload):
```bash
docker compose up -d
# quick rebuild without infra:
docker compose up -d --no-deps --build api train-api worker frontend
```
- To run **production** mode (ignore dev overrides):
```bash
docker compose -f docker-compose.yml up -d
```
Default URLs:
* Temporal UI: [http://localhost:8080](http://localhost:8080)
* API: [http://localhost:8000](http://localhost:8000)
* Frontend: [http://localhost:5173](http://localhost:5173)
### Local Machine (no Docker)
**Python Backend**
Requires [Poetry](https://python-poetry.org/) to manage dependencies.
1. `python -m venv venv`
2. `source venv/bin/activate`
3. `poetry install`
Run the following commands in separate terminal windows:
1. Start the Temporal worker:
```bash
poetry run python scripts/run_worker.py
```
2. Start the API server:
```bash
poetry run uvicorn api.main:app --reload
```
Access the API at `/docs` to see the available endpoints.
**React UI**
Start the frontend:
```bash
cd frontend
npm install
npx vite
```
Access the UI at `http://localhost:5173`
## MCP Tools Configuration
MCP (Model Context Protocol) tools enable integration with external services without custom implementation. The system automatically handles MCP server lifecycle and tool discovery.
### Adding MCP Tools to Goals
Configure MCP servers in your goal definitions using either:
1. Predefined configurations from `shared/mcp_config.py`
2. Custom `MCPServerDefinition` objects
Example using Stripe MCP Server:
```python
from shared.mcp_config import get_stripe_mcp_server_definition

# Passed as a keyword argument when constructing an AgentGoal:
mcp_server_definition=get_stripe_mcp_server_definition(
    included_tools=["list_products", "create_customer", "create_invoice"]
)
```
See the file `goals/stripe_mcp.py` for an example of how to use MCP tools in an `AgentGoal`.
### MCP Environment Variables
Set required API keys and configuration in your `.env` file:
```bash
# For Stripe MCP Server
STRIPE_API_KEY=sk_test_your_stripe_key_here
```
`goal_event_flight_invoice` does not require a Stripe key. If `STRIPE_API_KEY` is unset, that scenario falls back to a mock invoice.
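
The fallback described above can be sketched as a simple branch on whether the key is present. This is an illustration only: the signature, the `real_invoice` callable, and the returned fields are hypothetical and do not mirror `tools/create_invoice.py`.

```python
from typing import Callable, Mapping


def create_invoice(
    amount: float,
    description: str,
    env: Mapping[str, str],
    real_invoice: Callable[[float, str], dict],
) -> dict:
    """Use the real invoicing call when a key is set, else return a mock invoice."""
    if env.get("STRIPE_API_KEY"):
        return real_invoice(amount, description)
    # No key configured: fall back to a mock invoice.
    return {"mock": True, "amount": amount, "description": description}


def fake_stripe(amount: float, description: str) -> dict:
    """Stand-in for a real Stripe call (hypothetical)."""
    return {"mock": False, "amount": amount, "description": description}


mock_result = create_invoice(120.0, "Flight SYD-AKL", {}, fake_stripe)
real_result = create_invoice(
    120.0, "Flight SYD-AKL", {"STRIPE_API_KEY": "sk_test_placeholder"}, fake_stripe
)
print(mock_result["mock"], real_result["mock"])
```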
#### Accessing Your Test API Keys
It's free to sign up for a Stripe account and generate test keys (no real money is involved). Use the Developers Dashboard to create, reveal, delete, and rotate API keys. Navigate to the API Keys tab in your dashboard or visit [https://dashboard.stripe.com/test/apikeys](https://dashboard.stripe.com/test/apikeys) directly.
For detailed guidance on adding MCP tools, see [adding-goals-and-tools.md](./adding-goals-and-tools.md).
## Goal-Specific Tool Configuration
Here is configuration guidance for specific goals. Travel and financial goals have configuration & setup as below.
### Goal: Find an event in Australia / New Zealand, book flights to it and invoice the user for the cost
- `AGENT_GOAL=goal_event_flight_invoice` - Helps users find events, book flights, and arrange train travel with invoice generation
- This is the scenario in the [original video](https://www.youtube.com/watch?v=GEXllEH2XiQ)
#### Configuring Agent Goal: goal_event_flight_invoice
* The agent uses a mock function to search for events. This has zero configuration.
* **Flight Search**: The agent intelligently handles flight searches:
* **Default behavior**: If no `RAPIDAPI_KEY` is set, the agent generates realistic flight data with smart pricing based on route type (domestic, international, trans-Pacific)
* **Real API (optional)**: To use live flight data, set `RAPIDAPI_KEY` in your `.env` file
* It's free to sign up at [RapidAPI](https://rapidapi.com/apiheya/api/sky-scrapper)
* This API might be slow to respond, so you may want to increase the start-to-close timeout, `TOOL_ACTIVITY_START_TO_CLOSE_TIMEOUT` in `workflows/workflow_helpers.py`
* The smart generation creates realistic pricing (e.g., US-Australia routes $1200-1800, domestic flights $200-800) with appropriate airlines for each region
* A Stripe key is optional for the `create_invoice` tool. To create real test-mode invoices, set `STRIPE_API_KEY` in `.env`
* It's free to sign up and get a key at [Stripe](https://stripe.com/) (test mode only, no real money)
* Set permissions for read-write on: `Credit Notes, Invoices, Customers and Customer Sessions`
* If you don't have a Stripe key, comment out `STRIPE_API_KEY` in the `.env` file, and a dummy invoice will be created rather than a Stripe invoice. This is the default behavior for `goal_event_flight_invoice`; the function can be found in `tools/create_invoice.py`.
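
The route-based pricing mentioned in the flight search notes above might look roughly like this. It is a sketch under assumptions: only the two price ranges stated in this guide are included (the international range is not given here, so it is left out), and the names `PRICE_RANGES` and `mock_flight_price` are hypothetical.

```python
import random

# Price ranges in USD, taken from the examples in this guide.
PRICE_RANGES = {
    "trans-pacific": (1200, 1800),  # e.g. US-Australia routes
    "domestic": (200, 800),
}


def mock_flight_price(route_type: str, rng: random.Random) -> int:
    """Pick a price inside the range for the given route type."""
    if route_type not in PRICE_RANGES:
        raise ValueError(f"no price range defined for route type: {route_type}")
    low, high = PRICE_RANGES[route_type]
    return rng.randint(low, high)


rng = random.Random(42)  # seeded so repeated runs give the same prices
domestic_price = mock_flight_price("domestic", rng)
pacific_price = mock_flight_price("trans-pacific", rng)
print(domestic_price, pacific_price)
```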
### Goal: Find a Premier League match, book train tickets to it and invoice the user for the cost (Replay 2025 Keynote)
- `AGENT_GOAL=goal_match_train_invoice` - Focuses on Premier League match attendance with train booking and invoice generation
- This goal was part of [Temporal's Replay 2025 conference keynote demo](https://www.youtube.com/watch?v=YDxAWrIBQNE)
- Note, there is failure built in to this demo (the train booking step) to show how the agent can handle failures and retry. See Tool Configuration below for details.
#### Configuring Agent Goal: goal_match_train_invoice
NOTE: This goal was developed for an on-stage demo and has failure (and its resolution) built in to show how the agent can handle failures and retry.
* Omit `FOOTBALL_DATA_API_KEY` from .env for the `SearchFixtures` tool to automatically return mock Premier League fixtures. Finding a real match requires a key from [Football Data](https://www.football-data.org). Sign up for a free account, then see the 'My Account' page to get your API token.
* We use a mock function to search for trains. Start the train API server to use the real API: `python thirdparty/train_api.py`
* The train activity is 'enterprise', so it's written in C# and requires a .NET runtime. See the .NET (enterprise) Worker section below for details on running it.
* Requires a Stripe key for the `create_invoice` tool. Set this in the `STRIPE_API_KEY` environment variable in `.env`
* It's free to sign up and get a key at [Stripe](https://stripe.com/) (test mode only)
* If the key is missing, this goal won't generate a real invoice; only `goal_event_flight_invoice` falls back to a mock invoice.
* If you're lazy, go to `tools/create_invoice.py` and replace the `create_invoice` function with the mock `create_invoice_example` that exists in the same file.
##### Python Search Trains API
> Agent Goal: goal_match_train_invoice only
Required to search and book trains!
```bash
poetry run python thirdparty/train_api.py
# example url
# http://localhost:8080/api/search?from=london&to=liverpool&outbound_time=2025-04-18T09:00:00&inbound_time=2025-04-20T09:00:00
```
##### Python Train Legacy Worker
> Agent Goal: goal_match_train_invoice only
These are Python activities that fail (raise NotImplemented) to show how Temporal handles a failure. You can run these activities with:
```bash
poetry run python scripts/run_legacy_worker.py
```
The activity will fail and be retried indefinitely. To rescue the activity (and its corresponding workflows), kill the worker and run the .NET one in the section below.
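
To get a feel for how the waits between retries grow, here is a small sketch that computes a backoff schedule. It assumes Temporal's default retry policy values (1 second initial interval, backoff coefficient 2.0, maximum interval 100 seconds, unlimited attempts); the legacy worker in this repository may configure something different, and `retry_delays` is a hypothetical helper.

```python
def retry_delays(attempts, initial=1.0, coefficient=2.0, maximum=100.0):
    """Seconds to wait before each retry, with exponential backoff and a cap."""
    return [min(initial * coefficient ** n, maximum) for n in range(attempts)]


delays = retry_delays(9)
print(delays)
```

Under these assumptions the delay doubles until it reaches the cap and then stays there, so a permanently failing activity is retried about every 100 seconds until a working worker picks it up.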
##### .NET (enterprise) Worker ;)
We have activities written in C# to call the train APIs.
```bash
cd enterprise
dotnet build # ensure you brew install dotnet@8 first!
dotnet run
```
If you're running your train API above on a different host/port then change the API URL in `Program.cs`. Otherwise, be sure to run it using `python thirdparty/train_api.py`.
#### Goals: FIN - Money Movement and Loan Application
Make sure you have the mock users you want (such as yourself) in [the account mock data file](./tools/data/customer_account_data.json).
- `AGENT_GOAL=goal_fin_move_money` - This scenario _can_ initiate a secondary workflow to move money. Check out [this repo](https://github.com/temporal-sa/temporal-money-transfer-java) - you'll need to get the worker running and connected to the same account as the agentic worker.
By default it will _not_ make a real workflow, it'll just fake it. If you get the worker running and want to start a workflow, in your [.env](./.env):
```bash
FIN_START_REAL_WORKFLOW=FALSE #set this to true to start a real workflow
```
- `AGENT_GOAL=goal_fin_loan_application` - This scenario _can_ initiate a secondary workflow to apply for a loan. Check out [this repo](https://github.com/temporal-sa/temporal-latency-optimization-scenarios) - you'll need to get the worker running and connected to the same account as the agentic worker.
By default it will _not_ make a real workflow, it'll just fake it. If you get the worker running and want to start a workflow, in your [.env](./.env):
```bash
FIN_START_REAL_WORKFLOW=FALSE #set this to true to start a real workflow
```
#### Goals: HR/PTO
Make sure you have the mock users you want (such as yourself) in [the PTO mock data file](./tools/data/employee_pto_data.json).
#### Goals: Ecommerce
Make sure you have the mock orders you want (such as those with real tracking numbers) in [the mock orders file](./tools/data/customer_order_data.json).
### Goal: Food Ordering with MCP Integration (Stripe Payment Processing)
- `AGENT_GOAL=goal_food_ordering` - Demonstrates food ordering with Stripe payment processing via MCP
- Uses Stripe's MCP Server ([Agent Toolkit](https://github.com/stripe/agent-toolkit/tree/main/modelcontextprotocol)) for payment operations
- Requires `STRIPE_API_KEY` in your `.env` file
- Requires products in Stripe with metadata key `use_case=food_ordering_demo`. Run `tools/food/setup/create_stripe_products.py` to set up pizza menu items
- Example of MCP tool integration without custom implementation
- This is an excellent demonstration of MCP (Model Context Protocol) capabilities
## Customizing the Agent Further
- `tool_registry.py` contains the mapping of tool names to tool definitions (so the AI understands how to use them)
- `goals/` contains descriptions of goals and the tools used to achieve them
- The tools themselves are defined in their own files in `/tools`
For more details, check out [adding goals and tools guide](./adding-goals-and-tools.md).
## Setup Checklist
[ ] copy `.env.example` to `.env` <br />
[ ] Select an LLM and add your API key to `.env` <br />
[ ] (Optional) set your starting goal and goal category in `.env` <br />
[ ] (Optional) configure your Temporal Cloud settings in `.env` <br />
[ ] `poetry run python scripts/run_worker.py` <br />
[ ] `poetry run uvicorn api.main:app --reload` <br />
[ ] `cd frontend`, `npm install`, `npx vite` <br />
[ ] Access the UI at `http://localhost:5173` <br />
And that's it! Happy AI Agent Exploring!

163
docs/testing.md Normal file

@@ -0,0 +1,163 @@
# Testing the Temporal AI Agent
This guide provides instructions for running the comprehensive test suite for the Temporal AI Agent project.
## Quick Start
1. **Install dependencies**:
```bash
poetry install --with dev
```
2. **Run all tests**:
```bash
poetry run pytest
```
3. **Run with time-skipping for faster execution**:
```bash
poetry run pytest --workflow-environment=time-skipping
```
## Test Categories
### Unit Tests
- **Activity Tests**: `tests/test_tool_activities.py`
- LLM integration (mocked)
- Environment configuration
- JSON processing
- Dynamic tool execution
### Integration Tests
- **Workflow Tests**: `tests/test_agent_goal_workflow.py`
- Full workflow execution
- Signal and query handling
- State management
- Error scenarios
## Running Specific Tests
```bash
# Run only activity tests
poetry run pytest tests/test_tool_activities.py -v
# Run only workflow tests
poetry run pytest tests/test_agent_goal_workflow.py -v
# Run a specific test
poetry run pytest tests/test_tool_activities.py::TestToolActivities::test_sanitize_json_response -v
# Run tests matching a pattern
poetry run pytest -k "validation" -v
```
## Test Environment Options
### Local Environment (Default)
```bash
poetry run pytest --workflow-environment=local
```
### Time-Skipping Environment (Recommended for CI)
```bash
poetry run pytest --workflow-environment=time-skipping
```
### External Temporal Server
```bash
poetry run pytest --workflow-environment=localhost:7233
```
## Environment Variables
Tests can be configured with these environment variables:
- `LLM_MODEL`: Model for LLM testing (default: "openai/gpt-4")
- `LLM_KEY`: API key for LLM service (mocked in tests)
- `LLM_BASE_URL`: Custom LLM endpoint (optional)
## Test Coverage
The test suite covers:
✅ **Workflows**
- AgentGoalWorkflow initialization and execution
- Signal handling (user_prompt, confirm, end_chat)
- Query methods (conversation history, agent goal, tool data)
- State management and conversation flow
- Validation and error handling
✅ **Activities**
- ToolActivities class methods
- LLM integration (mocked)
- Environment variable handling
- JSON response processing
- Dynamic tool activity execution
✅ **Integration**
- End-to-end workflow execution
- Activity registration in workers
- Temporal client interactions
## Test Output
Successful test run example:
```
============================== test session starts ==============================
platform darwin -- Python 3.11.3, pytest-8.3.5, pluggy-1.5.0
rootdir: /Users/steveandroulakis/Documents/Code/agentic/temporal-demo/temporal-ai-agent
configfile: pyproject.toml
plugins: anyio-4.5.2, asyncio-0.26.0
collected 21 items
tests/test_tool_activities.py::TestToolActivities::test_sanitize_json_response PASSED
tests/test_tool_activities.py::TestToolActivities::test_parse_json_response_success PASSED
tests/test_tool_activities.py::TestToolActivities::test_get_wf_env_vars_default_values PASSED
...
============================== 21 passed in 12.5s ==============================
```
## Troubleshooting
### Common Issues
1. **Module not found errors**: Run `poetry install --with dev`
2. **Async warnings**: These are expected with pytest-asyncio and can be ignored
3. **Test timeouts**: Use `--workflow-environment=time-skipping` for faster execution
4. **Import errors**: Check that you're running tests from the project root directory
### Debugging Tests
Enable verbose logging:
```bash
poetry run pytest --log-cli-level=DEBUG -s
```
Run with coverage:
```bash
poetry run pytest --cov=workflows --cov=activities
```
## Continuous Integration
For CI environments, use:
```bash
poetry run pytest --workflow-environment=time-skipping --tb=short
```
## Additional Resources
- See `tests/README.md` for detailed testing documentation
- Review `tests/conftest.py` for available test fixtures
- Check individual test files for specific test scenarios
## Test Architecture
The tests use:
- **Temporal Testing Framework**: For workflow and activity testing
- **pytest-asyncio**: For async test support
- **unittest.mock**: For mocking external dependencies
- **Test Fixtures**: For consistent test data and setup
All external dependencies (LLM calls, file I/O) are mocked to ensure fast, reliable tests.
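
As a minimal sketch of the mocking approach, the test below replaces an LLM client with a `MagicMock` so no network call is made. The function `ask_llm` and the client's `complete` method are hypothetical stand-ins, not names from this test suite.

```python
import json
from unittest.mock import MagicMock


def ask_llm(client, prompt: str) -> dict:
    """Hypothetical code under test: call the client and parse its JSON reply."""
    return json.loads(client.complete(prompt))


def test_ask_llm_parses_reply():
    client = MagicMock()
    # The mock returns a canned reply instead of calling a real LLM.
    client.complete.return_value = '{"next": "done"}'

    assert ask_llm(client, "hello") == {"next": "done"}
    client.complete.assert_called_once_with("hello")


test_ask_llm_parses_reply()  # pytest would normally discover and run this
```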

37
docs/todo.md Normal file

@@ -0,0 +1,37 @@
# todo list
## General Agent Enhancements
[ ] Google's A2A is emerging as the standard way to hand off agents to other agents. We should examine implementing this soon.
[ ] Custom metrics/tracing is important for AI specific aspects such as number of LLM calls, number of bad LLM responses that require retrying, number of bad chat outcomes. We should add this.
[ ] Evals are very important in agents. We want to be able to 'judge' the agent's performance both in dev and production (AIOps). This will help us improve our agent's performance over time in a targeted fashion.
[ ] Dynamically switch LLMs on persistent failures: <br />
- detect failure in the activity using failurecount <br />
- activity switches to secondary LLM defined in .env
- activity reports switch to workflow
[ ] Collapse history/summarize chat after goal finished <br />
[ ] Write tests<br />
[ ] non-retry the api key error - "Invalid API Key provided: sk_test_**J..." and "AuthenticationError" <br />
[ ] add visual feedback when workflow starting <br />
[ ] enable user to list agents at any time - like end conversation - probably with a next step<br />
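
One possible shape for the "dynamically switch LLMs on persistent failures" item above, as a minimal sketch only. The function name, the `max_failures` parameter, and the idea of passing providers as plain callables are assumptions made for illustration; the real activity would read its secondary LLM from `.env` and report the switch to the workflow.

```python
from typing import Callable, Sequence, Tuple


def call_with_fallback(
    providers: Sequence[Callable[[str], str]],
    prompt: str,
    max_failures: int = 3,
) -> Tuple[int, str]:
    """Try each provider in order, moving on after max_failures errors.

    Returns the index of the provider that answered, so the caller can
    tell whether a switch to a secondary LLM happened.
    """
    for index, provider in enumerate(providers):
        failure_count = 0
        while failure_count < max_failures:
            try:
                return index, provider(prompt)
            except Exception:
                # Count the failure and retry the same provider until the limit.
                failure_count += 1
    raise RuntimeError("all LLM providers failed")
```

A non-zero index in the result signals that the primary provider was abandoned.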
## Ideas for more goals and tools
[ ] Add fintech goals <br />
- Fraud Detection and Prevention - The AI monitors transactions across accounts, flagging suspicious activities (e.g., unusual spending patterns or login attempts) and autonomously freezing accounts or notifying customers and compliance teams.<br />
- Personalized Financial Advice - An AI agent analyzes a customer's financial data (e.g., income, spending habits, savings, investments) and provides tailored advice, such as budgeting tips, investment options, or debt repayment strategies.<br />
- Portfolio Management and Rebalancing - The AI monitors a customer's investment portfolio, rebalancing it automatically based on market trends, risk tolerance, and financial goals (e.g., shifting assets between stocks, bonds, or crypto).<br />
[ ] new loan/fraud check/update with start <br />
[ ] financial advice - args being freeform customer input about their financial situation, goals
[ ] tool is maybe a new tool asking the LLM to advise
[ ] for demo simulate failure - add utilities/simulated failures from pipeline demo <br />


@@ -2,6 +2,7 @@ using System.Net.Http.Json;
using System.Text.Json;
using Temporalio.Activities;
using TrainSearchWorker.Models;
using Microsoft.Extensions.Logging;
namespace TrainSearchWorker.Activities;
@@ -23,6 +24,7 @@ public class TrainActivities
[Activity]
public async Task<JourneyResponse> SearchTrains(SearchTrainsRequest request)
{
ActivityExecutionContext.Current.Logger.LogInformation($"SearchTrains from {request.From} to {request.To}");
var response = await _client.GetAsync(
$"api/search?from={Uri.EscapeDataString(request.From)}" +
$"&to={Uri.EscapeDataString(request.To)}" +
@@ -30,17 +32,21 @@ public class TrainActivities
$"&return_time={Uri.EscapeDataString(request.ReturnTime)}");
response.EnsureSuccessStatusCode();
// Deserialize into JourneyResponse rather than List<Journey>
var journeyResponse = await response.Content.ReadFromJsonAsync<JourneyResponse>(_jsonOptions)
?? throw new InvalidOperationException("Received null response from API");
ActivityExecutionContext.Current.Logger.LogInformation("SearchTrains completed");
return journeyResponse;
}
[Activity]
public async Task<BookTrainsResponse> BookTrains(BookTrainsRequest request)
{
ActivityExecutionContext.Current.Logger.LogInformation($"Booking trains with IDs: {request.TrainIds}");
// Build the URL using the train IDs from the request
var url = $"api/book/{Uri.EscapeDataString(request.TrainIds)}";
@@ -52,6 +58,8 @@ public class TrainActivities
var bookingResponse = await response.Content.ReadFromJsonAsync<BookTrainsResponse>(_jsonOptions)
?? throw new InvalidOperationException("Received null response from API");
ActivityExecutionContext.Current.Logger.LogInformation("BookTrains completed");
return bookingResponse;
}


@@ -2,10 +2,19 @@ using Microsoft.Extensions.DependencyInjection;
using Temporalio.Client;
using Temporalio.Worker;
using TrainSearchWorker.Activities;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Console;
// Set up dependency injection
var services = new ServiceCollection();
var loggerFactory = LoggerFactory.Create(builder =>
{
builder
.AddSimpleConsole(options => options.TimestampFormat = "[HH:mm:ss] ")
.SetMinimumLevel(LogLevel.Information);
});
// Add HTTP client
services.AddHttpClient("TrainApi", client =>
{
@@ -31,7 +40,10 @@ Console.WriteLine($"Connecting to Temporal at address: {address}");
Console.WriteLine($"Using namespace: {ns}");
// Create worker options
var options = new TemporalWorkerOptions("agent-task-queue-legacy");
var options = new TemporalWorkerOptions("agent-task-queue-legacy")
{
LoggerFactory = loggerFactory
};
// Register activities
var activities = serviceProvider.GetRequiredService<TrainActivities>();


@@ -7,6 +7,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Logging.Console" Version="9.0.4" />
<PackageReference Include="Temporalio" Version="1.0.0" />
<PackageReference Include="Microsoft.Extensions.Http" Version="8.0.0" />
</ItemGroup>


@@ -825,247 +825,228 @@
}
},
"node_modules/@rollup/rollup-android-arm-eabi": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.29.1.tgz",
"integrity": "sha512-ssKhA8RNltTZLpG6/QNkCSge+7mBQGUqJRisZ2MDQcEGaK93QESEgWK2iOpIDZ7k9zPVkG5AS3ksvD5ZWxmItw==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.34.7.tgz",
"integrity": "sha512-l6CtzHYo8D2TQ3J7qJNpp3Q1Iye56ssIAtqbM2H8axxCEEwvN7o8Ze9PuIapbxFL3OHrJU2JBX6FIIVnP/rYyw==",
"cpu": [
"arm"
],
"license": "MIT",
"optional": true,
"os": [
"android"
]
},
"node_modules/@rollup/rollup-android-arm64": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.29.1.tgz",
"integrity": "sha512-CaRfrV0cd+NIIcVVN/jx+hVLN+VRqnuzLRmfmlzpOzB87ajixsN/+9L5xNmkaUUvEbI5BmIKS+XTwXsHEb65Ew==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.34.7.tgz",
"integrity": "sha512-KvyJpFUueUnSp53zhAa293QBYqwm94TgYTIfXyOTtidhm5V0LbLCJQRGkQClYiX3FXDQGSvPxOTD/6rPStMMDg==",
"cpu": [
"arm64"
],
"license": "MIT",
"optional": true,
"os": [
"android"
]
},
"node_modules/@rollup/rollup-darwin-arm64": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.29.1.tgz",
"integrity": "sha512-2ORr7T31Y0Mnk6qNuwtyNmy14MunTAMx06VAPI6/Ju52W10zk1i7i5U3vlDRWjhOI5quBcrvhkCHyF76bI7kEw==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.34.7.tgz",
"integrity": "sha512-jq87CjmgL9YIKvs8ybtIC98s/M3HdbqXhllcy9EdLV0yMg1DpxES2gr65nNy7ObNo/vZ/MrOTxt0bE5LinL6mA==",
"cpu": [
"arm64"
],
"license": "MIT",
"optional": true,
"os": [
"darwin"
]
},
"node_modules/@rollup/rollup-darwin-x64": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.29.1.tgz",
"integrity": "sha512-j/Ej1oanzPjmN0tirRd5K2/nncAhS9W6ICzgxV+9Y5ZsP0hiGhHJXZ2JQ53iSSjj8m6cRY6oB1GMzNn2EUt6Ng==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.34.7.tgz",
"integrity": "sha512-rSI/m8OxBjsdnMMg0WEetu/w+LhLAcCDEiL66lmMX4R3oaml3eXz3Dxfvrxs1FbzPbJMaItQiksyMfv1hoIxnA==",
"cpu": [
"x64"
],
"license": "MIT",
"optional": true,
"os": [
"darwin"
]
},
"node_modules/@rollup/rollup-freebsd-arm64": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.29.1.tgz",
"integrity": "sha512-91C//G6Dm/cv724tpt7nTyP+JdN12iqeXGFM1SqnljCmi5yTXriH7B1r8AD9dAZByHpKAumqP1Qy2vVNIdLZqw==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.34.7.tgz",
"integrity": "sha512-oIoJRy3ZrdsXpFuWDtzsOOa/E/RbRWXVokpVrNnkS7npz8GEG++E1gYbzhYxhxHbO2om1T26BZjVmdIoyN2WtA==",
"cpu": [
"arm64"
],
"license": "MIT",
"optional": true,
"os": [
"freebsd"
]
},
"node_modules/@rollup/rollup-freebsd-x64": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.29.1.tgz",
"integrity": "sha512-hEioiEQ9Dec2nIRoeHUP6hr1PSkXzQaCUyqBDQ9I9ik4gCXQZjJMIVzoNLBRGet+hIUb3CISMh9KXuCcWVW/8w==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.34.7.tgz",
"integrity": "sha512-X++QSLm4NZfZ3VXGVwyHdRf58IBbCu9ammgJxuWZYLX0du6kZvdNqPwrjvDfwmi6wFdvfZ/s6K7ia0E5kI7m8Q==",
"cpu": [
"x64"
],
"license": "MIT",
"optional": true,
"os": [
"freebsd"
]
},
"node_modules/@rollup/rollup-linux-arm-gnueabihf": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.29.1.tgz",
"integrity": "sha512-Py5vFd5HWYN9zxBv3WMrLAXY3yYJ6Q/aVERoeUFwiDGiMOWsMs7FokXihSOaT/PMWUty/Pj60XDQndK3eAfE6A==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.34.7.tgz",
"integrity": "sha512-Z0TzhrsNqukTz3ISzrvyshQpFnFRfLunYiXxlCRvcrb3nvC5rVKI+ZXPFG/Aa4jhQa1gHgH3A0exHaRRN4VmdQ==",
"cpu": [
"arm"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm-musleabihf": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.29.1.tgz",
"integrity": "sha512-RiWpGgbayf7LUcuSNIbahr0ys2YnEERD4gYdISA06wa0i8RALrnzflh9Wxii7zQJEB2/Eh74dX4y/sHKLWp5uQ==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.34.7.tgz",
"integrity": "sha512-nkznpyXekFAbvFBKBy4nNppSgneB1wwG1yx/hujN3wRnhnkrYVugMTCBXED4+Ni6thoWfQuHNYbFjgGH0MBXtw==",
"cpu": [
"arm"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm64-gnu": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.29.1.tgz",
"integrity": "sha512-Z80O+taYxTQITWMjm/YqNoe9d10OX6kDh8X5/rFCMuPqsKsSyDilvfg+vd3iXIqtfmp+cnfL1UrYirkaF8SBZA==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.34.7.tgz",
"integrity": "sha512-KCjlUkcKs6PjOcxolqrXglBDcfCuUCTVlX5BgzgoJHw+1rWH1MCkETLkLe5iLLS9dP5gKC7mp3y6x8c1oGBUtA==",
"cpu": [
"arm64"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-arm64-musl": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.29.1.tgz",
"integrity": "sha512-fOHRtF9gahwJk3QVp01a/GqS4hBEZCV1oKglVVq13kcK3NeVlS4BwIFzOHDbmKzt3i0OuHG4zfRP0YoG5OF/rA==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.34.7.tgz",
"integrity": "sha512-uFLJFz6+utmpbR313TTx+NpPuAXbPz4BhTQzgaP0tozlLnGnQ6rCo6tLwaSa6b7l6gRErjLicXQ1iPiXzYotjw==",
"cpu": [
"arm64"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-loongarch64-gnu": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loongarch64-gnu/-/rollup-linux-loongarch64-gnu-4.29.1.tgz",
"integrity": "sha512-5a7q3tnlbcg0OodyxcAdrrCxFi0DgXJSoOuidFUzHZ2GixZXQs6Tc3CHmlvqKAmOs5eRde+JJxeIf9DonkmYkw==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loongarch64-gnu/-/rollup-linux-loongarch64-gnu-4.34.7.tgz",
"integrity": "sha512-ws8pc68UcJJqCpneDFepnwlsMUFoWvPbWXT/XUrJ7rWUL9vLoIN3GAasgG+nCvq8xrE3pIrd+qLX/jotcLy0Qw==",
"cpu": [
"loong64"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-powerpc64le-gnu": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.29.1.tgz",
"integrity": "sha512-9b4Mg5Yfz6mRnlSPIdROcfw1BU22FQxmfjlp/CShWwO3LilKQuMISMTtAu/bxmmrE6A902W2cZJuzx8+gJ8e9w==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.34.7.tgz",
"integrity": "sha512-vrDk9JDa/BFkxcS2PbWpr0C/LiiSLxFbNOBgfbW6P8TBe9PPHx9Wqbvx2xgNi1TOAyQHQJ7RZFqBiEohm79r0w==",
"cpu": [
"ppc64"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-riscv64-gnu": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.29.1.tgz",
"integrity": "sha512-G5pn0NChlbRM8OJWpJFMX4/i8OEU538uiSv0P6roZcbpe/WfhEO+AT8SHVKfp8qhDQzaz7Q+1/ixMy7hBRidnQ==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.34.7.tgz",
"integrity": "sha512-rB+ejFyjtmSo+g/a4eovDD1lHWHVqizN8P0Hm0RElkINpS0XOdpaXloqM4FBkF9ZWEzg6bezymbpLmeMldfLTw==",
"cpu": [
"riscv64"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-s390x-gnu": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.29.1.tgz",
"integrity": "sha512-WM9lIkNdkhVwiArmLxFXpWndFGuOka4oJOZh8EP3Vb8q5lzdSCBuhjavJsw68Q9AKDGeOOIHYzYm4ZFvmWez5g==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.34.7.tgz",
"integrity": "sha512-nNXNjo4As6dNqRn7OrsnHzwTgtypfRA3u3AKr0B3sOOo+HkedIbn8ZtFnB+4XyKJojIfqDKmbIzO1QydQ8c+Pw==",
"cpu": [
"s390x"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-x64-gnu": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.29.1.tgz",
"integrity": "sha512-87xYCwb0cPGZFoGiErT1eDcssByaLX4fc0z2nRM6eMtV9njAfEE6OW3UniAoDhX4Iq5xQVpE6qO9aJbCFumKYQ==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.34.7.tgz",
"integrity": "sha512-9kPVf9ahnpOMSGlCxXGv980wXD0zRR3wyk8+33/MXQIpQEOpaNe7dEHm5LMfyRZRNt9lMEQuH0jUKj15MkM7QA==",
"cpu": [
"x64"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-linux-x64-musl": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.29.1.tgz",
"integrity": "sha512-xufkSNppNOdVRCEC4WKvlR1FBDyqCSCpQeMMgv9ZyXqqtKBfkw1yfGMTUTs9Qsl6WQbJnsGboWCp7pJGkeMhKA==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.34.7.tgz",
"integrity": "sha512-7wJPXRWTTPtTFDFezA8sle/1sdgxDjuMoRXEKtx97ViRxGGkVQYovem+Q8Pr/2HxiHp74SSRG+o6R0Yq0shPwQ==",
"cpu": [
"x64"
],
"license": "MIT",
"optional": true,
"os": [
"linux"
]
},
"node_modules/@rollup/rollup-win32-arm64-msvc": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.29.1.tgz",
"integrity": "sha512-F2OiJ42m77lSkizZQLuC+jiZ2cgueWQL5YC9tjo3AgaEw+KJmVxHGSyQfDUoYR9cci0lAywv2Clmckzulcq6ig==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.34.7.tgz",
"integrity": "sha512-MN7aaBC7mAjsiMEZcsJvwNsQVNZShgES/9SzWp1HC9Yjqb5OpexYnRjF7RmE4itbeesHMYYQiAtUAQaSKs2Rfw==",
"cpu": [
"arm64"
],
"license": "MIT",
"optional": true,
"os": [
"win32"
]
},
"node_modules/@rollup/rollup-win32-ia32-msvc": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.29.1.tgz",
"integrity": "sha512-rYRe5S0FcjlOBZQHgbTKNrqxCBUmgDJem/VQTCcTnA2KCabYSWQDrytOzX7avb79cAAweNmMUb/Zw18RNd4mng==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.34.7.tgz",
"integrity": "sha512-aeawEKYswsFu1LhDM9RIgToobquzdtSc4jSVqHV8uApz4FVvhFl/mKh92wc8WpFc6aYCothV/03UjY6y7yLgbg==",
"cpu": [
"ia32"
],
"license": "MIT",
"optional": true,
"os": [
"win32"
]
},
"node_modules/@rollup/rollup-win32-x64-msvc": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.29.1.tgz",
"integrity": "sha512-+10CMg9vt1MoHj6x1pxyjPSMjHTIlqs8/tBztXvPAx24SKs9jwVnKqHJumlH/IzhaPUaj3T6T6wfZr8okdXaIg==",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.34.7.tgz",
"integrity": "sha512-4ZedScpxxIrVO7otcZ8kCX1mZArtH2Wfj3uFCxRJ9NO80gg1XV0U/b2f/MKaGwj2X3QopHfoWiDQ917FRpwY3w==",
"cpu": [
"x64"
],
"license": "MIT",
"optional": true,
"os": [
"win32"
@@ -1115,8 +1096,7 @@
"node_modules/@types/estree": {
"version": "1.0.6",
"resolved": "https://registry.npmjs.org/@types/estree/-/estree-1.0.6.tgz",
"integrity": "sha512-AYnb1nQyY49te+VRAVgmzfcgjYS91mY5P0TKUDCLEM+gNnA+3T6rWITXRLYCpahpqSQbN5cE+gHpnPyXjHWxcw==",
"license": "MIT"
"integrity": "sha512-AYnb1nQyY49te+VRAVgmzfcgjYS91mY5P0TKUDCLEM+gNnA+3T6rWITXRLYCpahpqSQbN5cE+gHpnPyXjHWxcw=="
},
"node_modules/@vitejs/plugin-react": {
"version": "4.3.4",
@@ -2084,9 +2064,9 @@
}
},
"node_modules/postcss": {
"version": "8.4.49",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.4.49.tgz",
"integrity": "sha512-OCVPnIObs4N29kxTjzLfUryOkvZEq+pf8jTF0lg8E7uETuWHA+v7j3c/xJmiqpX450191LlmZfUKkXxkTry7nA==",
"version": "8.5.2",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.2.tgz",
"integrity": "sha512-MjOadfU3Ys9KYoX0AdkBlFEF1Vx37uCCeN4ZHnmwm9FfpbsGWMZeBLMmmpY+6Ocqod7mkdZ0DT31OlbsFrLlkA==",
"funding": [
{
"type": "opencollective",
@@ -2101,9 +2081,8 @@
"url": "https://github.com/sponsors/ai"
}
],
"license": "MIT",
"dependencies": {
"nanoid": "^3.3.7",
"nanoid": "^3.3.8",
"picocolors": "^1.1.1",
"source-map-js": "^1.2.1"
},
@@ -2339,10 +2318,9 @@
}
},
"node_modules/rollup": {
"version": "4.29.1",
"resolved": "https://registry.npmjs.org/rollup/-/rollup-4.29.1.tgz",
"integrity": "sha512-RaJ45M/kmJUzSWDs1Nnd5DdV4eerC98idtUOVr6FfKcgxqvjwHmxc5upLF9qZU9EpsVzzhleFahrT3shLuJzIw==",
"license": "MIT",
"version": "4.34.7",
"resolved": "https://registry.npmjs.org/rollup/-/rollup-4.34.7.tgz",
"integrity": "sha512-8qhyN0oZ4x0H6wmBgfKxJtxM7qS98YJ0k0kNh5ECVtuchIJ7z9IVVvzpmtQyT10PXKMtBxYr1wQ5Apg8RS8kXQ==",
"dependencies": {
"@types/estree": "1.0.6"
},
@@ -2354,25 +2332,25 @@
"npm": ">=8.0.0"
},
"optionalDependencies": {
"@rollup/rollup-android-arm-eabi": "4.29.1",
"@rollup/rollup-android-arm64": "4.29.1",
"@rollup/rollup-darwin-arm64": "4.29.1",
"@rollup/rollup-darwin-x64": "4.29.1",
"@rollup/rollup-freebsd-arm64": "4.29.1",
"@rollup/rollup-freebsd-x64": "4.29.1",
"@rollup/rollup-linux-arm-gnueabihf": "4.29.1",
"@rollup/rollup-linux-arm-musleabihf": "4.29.1",
"@rollup/rollup-linux-arm64-gnu": "4.29.1",
"@rollup/rollup-linux-arm64-musl": "4.29.1",
"@rollup/rollup-linux-loongarch64-gnu": "4.29.1",
"@rollup/rollup-linux-powerpc64le-gnu": "4.29.1",
"@rollup/rollup-linux-riscv64-gnu": "4.29.1",
"@rollup/rollup-linux-s390x-gnu": "4.29.1",
"@rollup/rollup-linux-x64-gnu": "4.29.1",
"@rollup/rollup-linux-x64-musl": "4.29.1",
"@rollup/rollup-win32-arm64-msvc": "4.29.1",
"@rollup/rollup-win32-ia32-msvc": "4.29.1",
"@rollup/rollup-win32-x64-msvc": "4.29.1",
"@rollup/rollup-android-arm-eabi": "4.34.7",
"@rollup/rollup-android-arm64": "4.34.7",
"@rollup/rollup-darwin-arm64": "4.34.7",
"@rollup/rollup-darwin-x64": "4.34.7",
"@rollup/rollup-freebsd-arm64": "4.34.7",
"@rollup/rollup-freebsd-x64": "4.34.7",
"@rollup/rollup-linux-arm-gnueabihf": "4.34.7",
"@rollup/rollup-linux-arm-musleabihf": "4.34.7",
"@rollup/rollup-linux-arm64-gnu": "4.34.7",
"@rollup/rollup-linux-arm64-musl": "4.34.7",
"@rollup/rollup-linux-loongarch64-gnu": "4.34.7",
"@rollup/rollup-linux-powerpc64le-gnu": "4.34.7",
"@rollup/rollup-linux-riscv64-gnu": "4.34.7",
"@rollup/rollup-linux-s390x-gnu": "4.34.7",
"@rollup/rollup-linux-x64-gnu": "4.34.7",
"@rollup/rollup-linux-x64-musl": "4.34.7",
"@rollup/rollup-win32-arm64-msvc": "4.34.7",
"@rollup/rollup-win32-ia32-msvc": "4.34.7",
"@rollup/rollup-win32-x64-msvc": "4.34.7",
"fsevents": "~2.3.2"
}
},
@@ -2719,14 +2697,13 @@
"license": "MIT"
},
"node_modules/vite": {
"version": "6.0.7",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.0.7.tgz",
"integrity": "sha512-RDt8r/7qx9940f8FcOIAH9PTViRrghKaK2K1jY3RaAURrEUbm9Du1mJ72G+jlhtG3WwodnfzY8ORQZbBavZEAQ==",
"license": "MIT",
"version": "6.1.0",
"resolved": "https://registry.npmjs.org/vite/-/vite-6.1.0.tgz",
"integrity": "sha512-RjjMipCKVoR4hVfPY6GQTgveinjNuyLw+qruksLDvA5ktI1150VmcMBKmQaEWJhg/j6Uaf6dNCNA0AfdzUb/hQ==",
"dependencies": {
"esbuild": "^0.24.2",
"postcss": "^8.4.49",
"rollup": "^4.23.0"
"postcss": "^8.5.1",
"rollup": "^4.30.1"
},
"bin": {
"vite": "bin/vite.js"


@@ -1,65 +1,153 @@
import React, { memo } from "react";
import React, { memo, useState } from "react";
/** Inline SVG icons so we don't need an extra library */
const PlayIcon = ({ className }) => (
<svg
xmlns="http://www.w3.org/2000/svg"
viewBox="0 0 24 24"
fill="currentColor"
className={className}
aria-hidden="true"
>
<path d="M5 3.868v16.264c0 1.04 1.12 1.675 2.025 1.16l13.11-8.132a1.33 1.33 0 000-2.256L7.025 2.773C6.12 2.259 5 2.894 5 3.934z" />
</svg>
);
const SpinnerIcon = ({ className }) => (
<svg
xmlns="http://www.w3.org/2000/svg"
viewBox="0 0 24 24"
fill="none"
stroke="currentColor"
strokeWidth="2"
strokeLinecap="round"
strokeLinejoin="round"
className={`animate-spin ${className}`}
aria-hidden="true"
>
<circle cx="12" cy="12" r="10" strokeOpacity="0.25" />
<path d="M22 12a10 10 0 00-10-10" />
</svg>
);
/**
* User-friendly confirmation card that surfaces tool invocation details
* without developer jargon. Tweaks include:
* • Left green accent-border + compact heading (visual hierarchy)
* • Collapsible arg list & array support (argument-list UX)
* • Mobile-first, pulsing confirm button (button affordance)
*/
const ConfirmInline = memo(({ data, confirmed, onConfirm }) => {
const { args, tool } = data || {};
const { args = {}, tool } = data || {};
const renderArgs = () => {
if (!args) return null;
return (
<div className="mt-1">
<strong>Args:</strong>
<pre className="bg-gray-100 dark:bg-gray-700 p-1 rounded text-sm whitespace-pre-wrap overflow-x-auto">
{JSON.stringify(args, null, 2)}
</pre>
</div>
);
};
// Collapsible argument list if we have more than 4 root keys
const [showAll, setShowAll] = useState(false);
const argEntries = Object.entries(args);
const shouldCollapse = argEntries.length > 4 && !showAll;
if (confirmed) {
return (
<div className="mt-2 p-2 border border-gray-400 dark:border-gray-600 rounded
bg-gray-50 dark:bg-gray-800 transition-colors duration-200">
<div className="text-sm text-gray-600 dark:text-gray-300">
<div>
<strong>Tool:</strong> {tool ?? "Unknown"}
</div>
{renderArgs()}
</div>
<div className="mt-2 text-green-600 dark:text-green-400 font-medium">
Running {tool}...
</div>
</div>
);
/** Recursively pretty-print argument values (objects & arrays). */
const RenderValue = ({ value }) => {
if (value === null || value === undefined) return <span className="italic"></span>;
if (Array.isArray(value)) {
return (
<ol className="pl-4 list-decimal space-y-0.5">
{value.map((v, i) => (
<li key={i} className="flex gap-1">
<RenderValue value={v} />
</li>
))}
</ol>
);
}
if (typeof value === "object") {
return (
<ul className="pl-4 space-y-0.5 list-disc marker:text-green-500 dark:marker:text-green-400">
{Object.entries(value).map(([k, v]) => (
<li key={k} className="flex gap-1">
<span className="capitalize text-gray-600 dark:text-gray-300">{k}:&nbsp;</span>
<RenderValue value={v} />
</li>
))}
</ul>
);
}
return <span className="font-medium text-gray-800 dark:text-gray-100">{String(value)}</span>;
};
const cardBase =
"mt-2 p-3 rounded-lg border-l-4 border-green-500 bg-gray-100/60 dark:bg-gray-800/60 shadow-sm";
// ===== Running state =====
if (confirmed) {
return (
<div className="mt-2 p-2 border border-gray-400 dark:border-gray-600 rounded
bg-gray-50 dark:bg-gray-800 transition-colors duration-200">
<div className="text-gray-600 dark:text-gray-300">
<div>
Agent is ready to run the tool: <strong>{tool ?? "Unknown"}</strong>
</div>
{renderArgs()}
<div className="mt-2 text-sm text-gray-500 dark:text-gray-400">
Please confirm to proceed.
</div>
</div>
<div className="text-right mt-2">
<button
onClick={onConfirm}
className="bg-green-600 hover:bg-green-700 text-white px-3 py-1 rounded
transition-colors duration-200 focus:outline-none focus:ring-2
focus:ring-green-500 focus:ring-opacity-50"
aria-label={`Confirm running ${tool}`}
>
Confirm
</button>
</div>
</div>
<div className={`${cardBase} flex items-center gap-3`} role="status">
<SpinnerIcon className="text-green-600 dark:text-green-400 w-4 h-4" />
<span className="text-sm text-gray-700 dark:text-gray-200">
Running <strong className="font-semibold">{tool ?? "Unknown"}</strong>
</span>
</div>
);
}
// ===== Confirmation state =====
return (
<div className={`${cardBase} space-y-2`} role="group">
{/* Heading */}
<div className="flex items-center gap-2">
<PlayIcon className="text-green-600 dark:text-green-400 w-5 h-5 shrink-0" />
<p className="text-sm font-medium text-gray-700 dark:text-gray-200">
Ready to run <strong>{tool ?? "Unknown"}</strong>
</p>
</div>
{/* Dynamic argument list */}
{argEntries.length > 0 && (
<div className="text-sm text-gray-700 dark:text-gray-300">
{argEntries
.slice(0, shouldCollapse ? 4 : argEntries.length)
.map(([k, v]) => (
<div key={k} className="flex gap-1">
<span className="capitalize">{k}:&nbsp;</span>
<RenderValue value={v} />
</div>
))}
{shouldCollapse && (
<button
onClick={() => setShowAll(true)}
className="mt-1 text-green-600 dark:text-green-400 text-xs underline hover:no-underline"
>
show all
</button>
)}
{showAll && argEntries.length > 4 && (
<button
onClick={() => setShowAll(false)}
className="mt-1 block text-green-600 dark:text-green-400 text-xs underline hover:no-underline"
>
show less
</button>
)}
</div>
)}
{/* Confirm button */}
<div className="text-right">
<button
onClick={onConfirm}
onKeyDown={(e) => (e.key === "Enter" || e.key === " ") && onConfirm()}
className="w-full sm:w-auto bg-green-600 hover:bg-green-700 text-white text-sm px-3 py-1.5 rounded-md shadow-sm transition-colors focus:outline-none focus:ring-2 focus:ring-green-500 focus:ring-offset-1 animate-pulse sm:animate-none"
aria-label={`Confirm running ${tool}`}
>
Confirm
</button>
</div>
</div>
);
});
ConfirmInline.displayName = 'ConfirmInline';
ConfirmInline.displayName = "ConfirmInline";
export default ConfirmInline;
export default ConfirmInline;


@@ -27,7 +27,7 @@ const LLMResponse = memo(({ data, onConfirm, isLastMessage, onHeightChange }) =>
: data?.response;
const displayText = (response || '').trim();
const requiresConfirm = data.next === "confirm" && isLastMessage;
const requiresConfirm = data.force_confirm && data.next === "confirm" && isLastMessage;
const defaultText = requiresConfirm
? `Agent is ready to run "${data.tool}". Please confirm.`
: '';


@@ -8,26 +8,54 @@ const MessageBubble = memo(({ message, fallback = "", isUser = false }) => {
}
const renderTextWithLinks = (text) => {
// First handle image markdown: ![alt text](url)
const imageRegex = /!\[([^\]]*)\]\(([^)]+)\)/g;
const urlRegex = /(https?:\/\/[^\s]+)/g;
const parts = text.split(urlRegex);
return parts.map((part, index) => {
if (urlRegex.test(part)) {
// Split by image markdown first
const imageParts = text.split(imageRegex);
return imageParts.map((part, index) => {
// Every third element (starting from index 2) is an image URL
if (index > 0 && (index - 2) % 3 === 0) {
const altText = imageParts[index - 1];
const imageUrl = part;
return (
<a
<img
key={index}
href={part}
target="_blank"
rel="noopener noreferrer"
className="text-blue-500 hover:text-blue-600 underline"
aria-label={`External link to ${part}`}
>
{part}
</a>
src={imageUrl}
alt={altText}
className="max-w-full h-auto rounded mt-2 mb-2 mx-auto block border border-gray-300 dark:border-gray-600"
style={{ maxHeight: '200px' }}
loading="lazy"
/>
);
}
return part;
});
// Skip alt text parts (every second element after first)
if (index > 0 && (index - 1) % 3 === 0) {
return null;
}
// Handle regular text and links
const linkParts = part.split(urlRegex);
return linkParts.map((linkPart, linkIndex) => {
if (urlRegex.test(linkPart)) {
return (
<a
key={`${index}-${linkIndex}`}
href={linkPart}
target="_blank"
rel="noopener noreferrer"
className="text-blue-500 hover:text-blue-600 underline"
aria-label={`External link to ${linkPart}`}
>
{linkPart}
</a>
);
}
return linkPart;
});
}).filter(Boolean);
};
return (
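
The index arithmetic in `renderTextWithLinks` relies on how splitting by a regex with two capture groups interleaves plain text, alt text, and URL. The Python sketch below shows the same pattern with `re.split`; it is an illustration under that assumption, not part of the change, and `split_images` is a hypothetical helper name.

```python
import re

# Same shape as the JavaScript pattern: ![alt text](url)
IMAGE_RE = re.compile(r"!\[([^\]]*)\]\(([^)]+)\)")


def split_images(text: str):
    """Split text into ("text", s) and ("image", alt, url) segments."""
    parts = IMAGE_RE.split(text)
    segments = []
    for index, part in enumerate(parts):
        kind = index % 3  # 0 = plain text, 1 = alt text, 2 = image URL
        if kind == 0:
            if part:  # drop empty text between adjacent images
                segments.append(("text", part))
        elif kind == 2:
            # The alt text is the element just before the URL.
            segments.append(("image", parts[index - 1], part))
        # kind == 1 (alt text) is consumed together with its URL above.
    return segments
```

With two capture groups, every match contributes exactly two extra elements to the split result, which is why the component treats every third element starting from index 2 as an image URL.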


@@ -3,7 +3,7 @@ import NavBar from "../components/NavBar";
import ChatWindow from "../components/ChatWindow";
import { apiService } from "../services/api";
const POLL_INTERVAL = 500; // 0.5 seconds
const POLL_INTERVAL = 600; // 0.6 seconds
const INITIAL_ERROR_STATE = { visible: false, message: '' };
const DEBOUNCE_DELAY = 300; // 300ms debounce for user input

44
goals/__init__.py Normal file

@@ -0,0 +1,44 @@
import os
from typing import List
import tools.tool_registry as tool_registry
from goals.agent_selection import agent_selection_goals
from goals.ecommerce import ecommerce_goals
from goals.finance import finance_goals
from goals.food import food_goals
from goals.hr import hr_goals
from goals.stripe_mcp import mcp_goals
from goals.travel import travel_goals
from models.tool_definitions import AgentGoal
goal_list: List[AgentGoal] = []
goal_list.extend(agent_selection_goals)
goal_list.extend(travel_goals)
goal_list.extend(hr_goals)
goal_list.extend(finance_goals)
goal_list.extend(ecommerce_goals)
goal_list.extend(mcp_goals)
goal_list.extend(food_goals)
# for multi-goal, just set list agents as the last tool
first_goal_value = os.getenv("AGENT_GOAL")
if first_goal_value is None:
    multi_goal_mode = False  # default to single agent mode if unset
elif (
    first_goal_value is not None
    and first_goal_value.lower() == "goal_choose_agent_type"
):
    multi_goal_mode = True
else:
    multi_goal_mode = False
if multi_goal_mode:
    for goal in goal_list:
        list_agents_found: bool = False
        for tool in goal.tools:
            if tool.name == "ListAgents":
                list_agents_found = True
                continue
        if list_agents_found is False:
            goal.tools.append(tool_registry.list_agents_tool)
            continue
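
The multi-goal branch above appends the `ListAgents` tool to any goal that lacks it. A self-contained sketch of the same idea follows, using stand-in dataclasses so it runs without the project's modules. `Tool`, `Goal`, `is_multi_goal`, and `ensure_list_agents` are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tool:  # hypothetical stand-in for the project's tool definition
    name: str


@dataclass
class Goal:  # hypothetical stand-in for AgentGoal
    id: str
    tools: List[Tool] = field(default_factory=list)


def is_multi_goal(agent_goal: Optional[str]) -> bool:
    """Multi-goal mode is on only when AGENT_GOAL names the chooser goal."""
    return agent_goal is not None and agent_goal.lower() == "goal_choose_agent_type"


def ensure_list_agents(goals: List[Goal], list_agents_tool: Tool) -> None:
    """Append the ListAgents tool as the last tool of any goal missing it."""
    for goal in goals:
        if not any(tool.name == "ListAgents" for tool in goal.tools):
            goal.tools.append(list_agents_tool)
```

Using `any(...)` expresses the "is the tool already present" check without a flag variable, while keeping the behavior of adding the tool at most once per goal.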

106
goals/agent_selection.py Normal file

@@ -0,0 +1,106 @@
from typing import List
import tools.tool_registry as tool_registry
from models.tool_definitions import AgentGoal
# Turn on Silly Mode - this should be a description of the persona you'd like the bot to have and can be a single word or a phrase.
# Example if you want the bot to be a specific person, like Mario or Christopher Walken, or to describe a specific tone:
# SILLY_MODE="Christopher Walken"
# SILLY_MODE="belligerent"
#
# Example if you want it to take on a persona (include 'a'):
# SILLY_MODE="a pirate"
# Note - this only works with certain LLMs. Grok for sure will stay in character, while OpenAI will not.
SILLY_MODE = "off"
if SILLY_MODE is not None and SILLY_MODE != "off":
    silly_prompt = "You are " + SILLY_MODE + ", stay in character at all times. "
    print("Silly mode is on: " + SILLY_MODE)
else:
    silly_prompt = ""
starter_prompt_generic = (
silly_prompt
+ "Welcome me, give me a description of what you can do, then ask me for the details you need to do your job."
)
goal_choose_agent_type = AgentGoal(
id="goal_choose_agent_type",
category_tag="agent_selection",
agent_name="Choose Agent",
agent_friendly_description="Choose the type of agent to assist you today. You can always interrupt an existing agent to pick a new one.",
tools=[
tool_registry.list_agents_tool,
tool_registry.change_goal_tool,
],
description="The user wants to choose which type of agent they will interact with. "
"Help the user select an agent by gathering args for the ChangeGoal tool, in order: "
"1. ListAgents: List agents available to interact with. Do not ask for user confirmation for this tool. "
"2. ChangeGoal: Change goal of agent "
"After these tools are complete, change your goal to the new goal as chosen by the user. ",
starter_prompt=silly_prompt
+ "Welcome me, give me a description of what you can do, then ask me for the details you need to do your job. List all details of all agents as provided by the output of the first tool included in this goal. ",
example_conversation_history="\n ".join(
[
"agent: Here are the currently available agents.",
"tool_result: { agents: 'agent_name': 'Event Flight Finder', 'goal_id': 'goal_event_flight_invoice', 'agent_description': 'Helps users find interesting events and arrange travel to them',"
"'agent_name': 'Schedule PTO', 'goal_id': 'goal_hr_schedule_pto', 'agent_description': 'Schedule PTO based on your available PTO.' }",
"agent: The available agents are: Event Flight Finder and Schedule PTO. \n Which agent would you like to work with? ",
"user: I'd like to find an event and book flights using the Event Flight Finder",
"user_confirmed_tool_run: <user clicks confirm on ChangeGoal tool>",
"tool_result: { 'new_goal': 'goal_event_flight_invoice' }",
]
),
)
# Easter egg - if silly mode = a pirate, include goal_pirate_treasure as a "system" goal so it always shows up.
# Can also make this goal available by setting the GOAL_CATEGORIES in the env file to include 'pirate', but if SILLY_MODE
# is not 'a pirate', the interaction as a whole will be less pirate-y.
pirate_category_tag = "pirate"
if SILLY_MODE == "a pirate":
pirate_category_tag = "system"
goal_pirate_treasure = AgentGoal(
id="goal_pirate_treasure",
category_tag=pirate_category_tag,
agent_name="Arrr, Find Me Treasure!",
agent_friendly_description="Sail the high seas and find me pirate treasure, ye land lubber!",
tools=[
tool_registry.give_hint_tool,
tool_registry.guess_location_tool,
],
description="The user wants to find a pirate treasure. "
"Help the user gather args for these tools, in a loop, until treasure_found is True or the user requests to be done: "
"1. GiveHint: If the user wants a hint regarding the location of the treasure, give them a hint. If they do not want a hint, this tool is optional. "
"2. GuessLocation: The user guesses where the treasure is, by giving an address. ",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to try to find the treasure",
"agent: Sure! Do you want a hint?",
"user: yes",
"agent: Here is hint number 1!",
"user_confirmed_tool_run: <user clicks confirm on GiveHint tool>",
"tool_result: { 'hint_number': 1, 'hint': 'The treasure is in the state of Arizona.' }",
"agent: The treasure is in the state of Arizona. Would you like to guess the address of the treasure? ",
"user: Yes, address is 123 Main St Phoenix, AZ",
"agent: Let's see if you found the treasure...",
"user_confirmed_tool_run: <user clicks confirm on GuessLocation tool>",
"tool_result: {'treasure_found':False}",
"agent: Nope, that's not the right location! Do you want another hint?",
"user: yes",
"agent: Here is hint number 2.",
"user_confirmed_tool_run: <user clicks confirm on GiveHint tool>",
"tool_result: { 'hint_number': 2, 'hint': 'The treasure is in the city of Tucson, AZ.' }",
"agent: The treasure is in the city of Tucson, AZ. Would you like to guess the address of the treasure? ",
"user: Yes, address is 456 Main St Tucson, AZ",
"agent: Let's see if you found the treasure...",
"user_confirmed_tool_run: <user clicks confirm on GuessLocation tool>",
"tool_result: {'treasure_found':True}",
"agent: Congratulations, Land Lubber, you've found the pirate treasure!",
]
),
)
agent_selection_goals: List[AgentGoal] = [
goal_choose_agent_type,
goal_pirate_treasure,
]
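
Each `example_conversation_history` is built by joining a list of turn strings. In Python, two adjacent string literals with no comma between them are concatenated into one element, so a missing comma silently fuses two turns. The sketch below shows the effect and a simple check for it. The `find_fused_turns` helper and the prefix list are assumptions for illustration, not part of the repository.

```python
from typing import List, Sequence

# Turn prefixes used in the example conversations (assumed list for this sketch).
TURN_PREFIXES = ("user: ", "agent: ", "tool_result: ", "user_confirmed_tool_run: ")

# The missing comma after the first literal fuses two turns into one element.
turns_missing_comma = [
    "user_confirmed_tool_run: <user clicks confirm on GuessLocation tool>"
    "tool_result: {'treasure_found':False}",
    "agent: Nope, that's not the right location!",
]

turns_fixed = [
    "user_confirmed_tool_run: <user clicks confirm on GuessLocation tool>",
    "tool_result: {'treasure_found':False}",
    "agent: Nope, that's not the right location!",
]


def find_fused_turns(turns: Sequence[str]) -> List[str]:
    """Return entries where a turn prefix appears again after the first character."""
    return [
        turn
        for turn in turns
        if any(turn.find(prefix, 1) != -1 for prefix in TURN_PREFIXES)
    ]
```

A check like this is only a heuristic, since a turn could legitimately quote one of the prefixes, but it catches the common missing-comma case.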

83
goals/ecommerce.py Normal file

@@ -0,0 +1,83 @@
from typing import List
import tools.tool_registry as tool_registry
from models.tool_definitions import AgentGoal
starter_prompt_generic = "Welcome me, give me a description of what you can do, then ask me for the details you need to do your job."
goal_ecomm_order_status = AgentGoal(
id="goal_ecomm_order_status",
category_tag="ecommerce",
agent_name="Check Order Status",
agent_friendly_description="Check the status of your order.",
tools=[
tool_registry.ecomm_get_order,
tool_registry.ecomm_track_package,
],
description="The user wants to learn the status of a specific order. If the status is 'shipped' or 'delivered', they might want to get the package tracking information. To assist with that goal, help the user gather args for these tools in order: "
"1. GetOrder: get information about an order "
"2. TrackPackage: provide tracking information for the package. This tool is optional and should only be offered if the status is 'shipped' OR 'delivered' - otherwise, skip this tool and do not mention it to the user.",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to know the status of my order",
"agent: Sure! I can help you out with that. May I have your order number?",
"user: 102 ",
"user_confirmed_tool_run: <user clicks confirm on GetOrder tool>",
"tool_result: { 'id': '102', 'summary': 'Red Sunglasses', 'email': 'matt.murdock@nelsonmurdock.com', 'status': 'shipped', 'order_date': '2025-04-01', 'last_order_update': '2025-04-06', 'tracking_id': '039813852990618' }",
"agent: Your order 'Red Sunglasses,' placed April 1, 2025, was shipped on April 6, 2025. Would you like to see the tracking information?",
"user: Yes",
"user_confirmed_tool_run: <user clicks confirm on TrackPackage tool>",
"tool_result: { 'scheduled_delivery_date': 'April 30, 2025', 'carrier': 'USPS', 'status_summary': 'Your item has left our acceptance facility and is in transit to a sorting facility on April 10, 2025 at 7:06 am in IRON RIDGE, WI 53035.', 'tracking_link': 'https://tools.usps.com/go/TrackConfirmAction?qtc_tLabels1=12345','last_tracking_update': '2025-03-22T16:14:48'}",
"agent: Your package is scheduled to be delivered on April 30, 2025 via USPS. Here is the most recent status from them regarding your package, updated as of March 22: \\n"
"Your item has left our acceptance facility and is in transit to a sorting facility on April 10, 2025 at 7:06 am in IRON RIDGE, WI 53035. \\n"
"You can find the full tracking details here: tracking_link !",
]
),
)
goal_ecomm_list_orders = AgentGoal(
id="goal_ecomm_list_orders",
category_tag="ecommerce",
agent_name="List All Orders",
agent_friendly_description="List all orders for a user.",
tools=[
tool_registry.ecomm_list_orders,
tool_registry.ecomm_get_order,
tool_registry.ecomm_track_package,
],
description="The user wants to see all of their orders. They may want more details about specific orders, and if the status of an order is 'shipped' or 'delivered', they might want to get the package tracking information. To assist with that goal, help the user gather args for this tool: "
"1. ListOrders: list orders for a user"
" and then offer the following tools, in a loop, until the user indicates they are done: "
"2. GetOrder: get information about an order. This tool is optional. "
"3. TrackPackage: provide tracking information for the package. This tool is optional and should only be offered if the status is 'shipped' OR 'delivered' - otherwise, skip this tool and do not mention it to the user.",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to see all of my orders.",
"agent: Sure! I can help you out with that. May I have your email address?",
"user: email is bob.johnson@emailzzz.com ",
"user_confirmed_tool_run: <user clicks confirm on ListOrders tool>",
"tool_result: a list of orders including [{'id': '102', 'summary': 'Red Sunglasses', 'email': 'matt.murdock@nelsonmurdock.com', 'status': 'shipped', 'order_date': '2025-04-01', 'last_order_update': '2025-04-06', 'tracking_id': '039813852990618' }, { 'id': '103', 'summary': 'Blue Sunglasses', 'email': 'matt.murdock@nelsonmurdock.com', 'status': 'paid', 'order_date': '2025-04-03', 'last_order_update': '2025-04-07' }]",
"agent: Your orders are as follows: \\n",
"1. Red Sunglasses, ordered 4/1/2025 \\n",
"2. Blue Sunglasses, ordered 4/3/2025 \\n",
"Would you like more information about any of your orders?",
"user: Yes, the Red Sunglasses",
"agent: Your order 'Red Sunglasses,' placed April 1, 2025, was shipped on April 6, 2025. Would you like to see the tracking information?",
"user: Yes",
"user_confirmed_tool_run: <user clicks confirm on TrackPackage tool>",
"tool_result: { 'scheduled_delivery_date': 'April 30, 2025', 'carrier': 'USPS', 'status_summary': 'Your item has left our acceptance facility and is in transit to a sorting facility on April 10, 2025 at 7:06 am in IRON RIDGE, WI 53035.', 'tracking_link': 'https://tools.usps.com/go/TrackConfirmAction?qtc_tLabels1=12345','last_tracking_update': '2025-03-22T16:14:48'}",
"agent: Your package is scheduled to be delivered on April 30, 2025 via USPS. Here is the most recent status from them regarding your package, updated as of March 22: \\n"
"Your item has left our acceptance facility and is in transit to a sorting facility on April 10, 2025 at 7:06 am in IRON RIDGE, WI 53035. \\n"
"You can find the full tracking details here: tracking_link ! \\n"
"Would you like more information about any of your other orders?",
"user: No",
"agent: Thanks, and have a great day!",
]
),
)
ecommerce_goals: List[AgentGoal] = [
goal_ecomm_order_status,
goal_ecomm_list_orders,
]
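
Both ecommerce goals tell the model to offer TrackPackage only when the order status is 'shipped' or 'delivered'. The same rule can be written as a small predicate, for instance to guard the tool call on the application side. This is a sketch under that assumption; `should_offer_tracking` and `TRACKABLE_STATUSES` are hypothetical names, not part of the repository.

```python
from typing import Any, Dict

# Statuses for which tracking information is meaningful, per the goal descriptions.
TRACKABLE_STATUSES = {"shipped", "delivered"}


def should_offer_tracking(order: Dict[str, Any]) -> bool:
    """Return True only when the order has a status that allows package tracking."""
    status = str(order.get("status", "")).strip().lower()
    return status in TRACKABLE_STATUSES


shipped_order = {"id": "102", "summary": "Red Sunglasses", "status": "shipped"}
paid_order = {"id": "103", "summary": "Blue Sunglasses", "status": "paid"}
```

With the sample orders from the example conversations, the shipped order qualifies for tracking and the paid one does not.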

111
goals/finance.py Normal file

@@ -0,0 +1,111 @@
from typing import List
import tools.tool_registry as tool_registry
from models.tool_definitions import AgentGoal
starter_prompt_generic = "Welcome me, give me a description of what you can do, then ask me for the details you need to do your job."
goal_fin_check_account_balances = AgentGoal(
id="goal_fin_check_account_balances",
category_tag="fin",
agent_name="Account Balances",
agent_friendly_description="Check your account balances in Checking, Savings, etc.",
tools=[
tool_registry.financial_check_account_is_valid,
tool_registry.financial_get_account_balances,
],
description="The user wants to check their account balances at the bank or financial institution. To assist with that goal, help the user gather args for these tools in order: "
"1. FinCheckAccountIsValid: validate the user's account is valid "
"2. FinCheckAccountBalance: Tell the user their account balance at the bank or financial institution",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to check my account balances",
"agent: Sure! I can help you out with that. May I have your email address and account number?",
"user: email is bob.johnson@emailzzz.com ",
"user_confirmed_tool_run: <user clicks confirm on FinCheckAccountIsValid tool>",
"tool_result: { 'status': account valid }",
"agent: Great! I can tell you what your account balances are.",
"user_confirmed_tool_run: <user clicks confirm on FinCheckAccountBalance tool>",
"tool_result: { 'name': Matt Murdock, 'email': matt.murdock@nelsonmurdock.com, 'account_id': 11235, 'checking_balance': 875.40, 'savings_balance': 3200.15, 'bitcoin_balance': 0.1378, 'account_creation_date': 2014-03-10 }",
"agent: Your account balances are as follows: \\n "
"Checking: $875.40. \\n "
"Savings: $3200.15. \\n "
"Bitcoin: 0.1378 \\n "
"Thanks for being a customer since 2014!",
]
),
)
goal_fin_move_money = AgentGoal(
id="goal_fin_move_money",
category_tag="fin",
agent_name="Money Movement",
agent_friendly_description="Initiate money movement.",
tools=[
tool_registry.financial_check_account_is_valid,
tool_registry.financial_get_account_balances,
tool_registry.financial_move_money,
],
description="The user wants to transfer money in their account at the bank or financial institution. To assist with that goal, help the user gather args for these tools in order: "
"1. FinCheckAccountIsValid: validate the user's account is valid "
"2. FinCheckAccountBalance: Tell the user their account balance at the bank or financial institution "
"3. FinMoveMoney: Initiate money movement (transfer)",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to transfer some money",
"agent: Sure! I can help you out with that. May I have your account number and email address?",
"user: my account number is 11235 and my email address is matt.murdock@nelsonmurdock.com",
"user_confirmed_tool_run: <user clicks confirm on FinCheckAccountIsValid tool>",
"tool_result: { 'status': account valid }",
"agent: Great! Here are your account balances:",
"user_confirmed_tool_run: <user clicks confirm on FinCheckAccountBalance tool>",
"tool_result: { 'name': Matt Murdock, 'email': matt.murdock@nelsonmurdock.com, 'account_id': 11235, 'checking_balance': 875.40, 'savings_balance': 3200.15, 'bitcoin_balance': 0.1378, 'account_creation_date': 2014-03-10 }",
"agent: Your account balances are as follows: \\n "
"Checking: $875.40. \\n "
"Savings: $3200.15. \\n "
"Bitcoin: 0.1378 \\n ",
"agent: how much would you like to move, from which account type, and to which account number?",
"user: I'd like to move $500 from savings to account number #56789",
"user_confirmed_tool_run: <user clicks confirm on FinMoveMoney tool>",
"tool_result: { 'status': money movement complete, 'confirmation id': 333421, 'new_balance': $2700.15 }",
"agent: Money movement completed! New account balance: $2700.15. Your confirmation id is 333421. ",
]
),
)
goal_fin_loan_application = AgentGoal(
id="goal_fin_loan_application",
category_tag="fin",
agent_name="Easy Loan",
agent_friendly_description="Initiate a simple loan application.",
tools=[
tool_registry.financial_check_account_is_valid,
tool_registry.financial_submit_loan_approval,
],
description="The user wants to apply for a loan at the financial institution. To assist with that goal, help the user gather args for these tools in order: "
"1. FinCheckAccountIsValid: validate the user's account is valid "
"2. FinCheckAccountSubmitLoanApproval: submit the loan for approval",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to apply for a loan",
"agent: Sure! I can help you out with that. May I have your account number and email address to validate your account?",
"user: account number is 11235813",
"user_confirmed_tool_run: <user clicks confirm on FinCheckAccountIsValid tool>",
"tool_result: { 'status': account valid }",
"agent: Great! We've validated your account. What will the loan amount be?",
"user: I'd like a loan for $500",
"user_confirmed_tool_run: <user clicks confirm on FinCheckAccountSubmitLoanApproval tool>",
"tool_result: { 'status': submitted, 'detailed_status': loan application is submitted and initial validation is complete, 'confirmation id': 333421, 'next_step': You'll receive a confirmation for final approval in three business days }",
"agent: I have submitted your loan application, and the initial validation was successful. Your application ID is 333421. You'll receive a notification for final approval from us in three business days. ",
]
),
)
finance_goals: List[AgentGoal] = [
goal_fin_check_account_balances,
goal_fin_move_money,
goal_fin_loan_application,
]

84
goals/food.py Normal file

@@ -0,0 +1,84 @@
from typing import List
import tools.tool_registry as tool_registry
from models.tool_definitions import AgentGoal
from shared.mcp_config import get_stripe_mcp_server_definition
starter_prompt_generic = "Welcome me, give me a description of what you can do, then ask me for the details you need to do your job."
goal_food_ordering = AgentGoal(
id="goal_food_ordering",
category_tag="food",
agent_name="Food Ordering Assistant",
agent_friendly_description="Order food from Tony's Pizza Palace using Stripe for payment processing. Browse menu, add items to your order, and check out securely. Please ensure context carries over between tool runs.",
tools=[tool_registry.food_add_to_cart_tool],
mcp_server_definition=get_stripe_mcp_server_definition(
included_tools=[
"list_products",
"list_prices",
"create_customer",
"create_invoice",
"create_invoice_item",
"finalize_invoice",
]
),
description="The user wants to order food from Tony's Pizza Palace. "
"First, help the user browse the menu by calling list_products. You can refer to this data throughout the conversation to answer questions (without re-calling this tool). If menu items contain links to images, you can use markdown to display them e.g. ![Pepperoni Pizza](https://...) "
"When they express interest in items, get pricing using list_prices. "
"Add items to their cart using AddToCart as they decide - the order doesn't matter, multiple items can be added. "
"After they're done selecting items, get their customer details and create a Stripe customer. "
"For checkout: 1) create_invoice, 2) create_invoice_item for each individual item (IMPORTANT: create_invoice_item does NOT accept quantity parameter - call it once per item, so if user wants 2 pizzas, call create_invoice_item twice with the same price), "
"3) finalize_invoice. The finalized invoice will contain a hosted_invoice_url for payment.",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I want to order food.",
"agent: Welcome to Tony's Pizza Palace! I can help you browse our menu, add items to your cart, and process payment. Let me show you what we have available.",
"user_confirmed_tool_run: <user clicks confirm on list_products tool with limit=100 and product.metadata.use_case == 'food_ordering_demo'>",
'tool_result: {"products": [{"id": "prod_pepperoni", "name": "Pepperoni Pizza", "description": "A timeless classic featuring pepperoni", "images": ["https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfcGRHc0c4cEZYWmR2bm0zOHBOa0FWMk5t008QmCJoWr"]}, {"id": "prod_margherita", "name": "Margherita Pizza", "description": "Fresh mozzarella and basil", "images": ["https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfamdmTXBFbzY0TW9rS2N0c2g0Tml2SERL00Evl60Ttq"]}, {"id": "prod_garlic_bread", "name": "Garlic Bread", "description": "Warm crusty bread with garlic butter", "images": ["https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfWTdIZTBkUjNZNFQ1ZEhSVG9nRnduY1pS00XVgLRRZD"]}, {"id": "prod_soda", "name": "Soda", "description": "A large bottle of cola", "images": ["https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfbDJxckJKMDRnT1dDc243OHlZNWNkZkY5006Xg07kHT"]}]}',
"agent: Welcome to our menu! Here's what we have available:\\n\\n**Pepperoni Pizza** - A timeless classic featuring pepperoni\\n![Pepperoni Pizza](https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfcGRHc0c4cEZYWmR2bm0zOHBOa0FWMk5t008QmCJoWr)\\n\\n**Margherita Pizza** - Fresh mozzarella and basil\\n![Margherita Pizza](https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfamdmTXBFbzY0TW9rS2N0c2g0Tml2SERL00Evl60Ttq)\\n\\n**Garlic Bread** - Warm crusty bread with garlic butter\\n![Garlic Bread](https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfWTdIZTBkUjNZNFQ1ZEhSVG9nRnduY1pS00XVgLRRZD)\\n\\n**Soda** - A large bottle of cola\\n![Soda](https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfbDJxckJKMDRnT1dDc243OHlZNWNkZkY5006Xg07kHT)\\n\\nWhat would you like to order?",
"user: I'd like 2 pepperoni pizzas and a soda.",
"agent: Excellent! Let me get the pricing for the pepperoni pizzas first so you know the cost.",
"user_confirmed_tool_run: <user clicks confirm on list_prices tool with product='prod_pepperoni'>",
'tool_result: {"prices": [{"id": "price_pepperoni_large", "product": "prod_pepperoni", "unit_amount": 2299, "currency": "usd"}]}',
"agent: Pepperoni Pizza is $22.99 each. Now let me check the soda pricing:",
"user_confirmed_tool_run: <user clicks confirm on list_prices tool with product='prod_soda'>",
'tool_result: {"prices": [{"id": "price_soda", "product": "prod_soda", "unit_amount": 250, "currency": "usd"}]}',
"agent: And Soda is $2.50. So 2 Pepperoni Pizzas ($22.99 each) plus 1 Soda ($2.50) = $48.48 total. What's your email address to add these items to your cart?",
"user: jane.smith@example.com",
"agent: Perfect! Adding 2 Pepperoni Pizzas to your cart for jane.smith@example.com. Note: I'm using quantity=2 since you want 2 pizzas.",
"user_confirmed_tool_run: <user clicks confirm on AddToCart tool with customer_email='jane.smith@example.com', item_name='Pepperoni Pizza', item_price=22.99, quantity=2, stripe_product_id='prod_pepperoni'>",
'tool_result: {"status": "success", "message": "Added 2 x Pepperoni Pizza ($22.99) to cart for jane.smith@example.com", "item_added": {"quantity": 2, "name": "Pepperoni Pizza"}}',
"agent: Great! Added 2 Pepperoni Pizzas to your cart. Now adding the Soda (quantity=1):",
"user_confirmed_tool_run: <user clicks confirm on AddToCart tool with customer_email='jane.smith@example.com', item_name='Soda', item_price=2.50, quantity=1, stripe_product_id='prod_soda'>",
'tool_result: {"status": "success", "message": "Added 1 x Soda ($2.50) to cart for jane.smith@example.com", "item_added": {"quantity": 1, "name": "Soda"}}',
"agent: Perfect! Added 1 Soda to your cart. Your cart now contains: 2 Pepperoni Pizzas ($22.99 each) and 1 Soda ($2.50). What else would you like to order?",
"user: Nothing, that's all.",
"agent: Great! Your order is complete with 2 Pepperoni Pizzas and 1 Soda for a total of $48.48. Now let's proceed to checkout. I'll need your full name to create your customer profile.",
"user: My name is Jane Smith.",
"agent: Thank you Jane! Creating your Stripe customer profile with name and email.",
"user_confirmed_tool_run: <user clicks confirm on create_customer tool with name='Jane Smith' and email='jane.smith@example.com'>",
'tool_result: {"customer": {"id": "cus_ExAmPlE12345", "name": "Jane Smith", "email": "jane.smith@example.com"}}',
"agent: Customer profile created! Now I'll create a draft invoice for your order.",
"user_confirmed_tool_run: <user clicks confirm on create_invoice tool with customer='cus_ExAmPlE12345'>",
'tool_result: {"invoice": {"id": "in_InvOicE54321", "status": "draft", "customer": "cus_ExAmPlE12345"}}',
"agent: Invoice created! Now adding your first Pepperoni Pizza to the invoice.",
"user_confirmed_tool_run: <user clicks confirm on create_invoice_item tool with customer='cus_ExAmPlE12345', invoice='in_InvOicE54321', price='price_pepperoni_large'>",
'tool_result: {"status": "success", "invoice_item": {"id": "ii_pizza1"}}',
"agent: First pizza added! Now adding your second Pepperoni Pizza to the invoice.",
"user_confirmed_tool_run: <user clicks confirm on create_invoice_item tool with customer='cus_ExAmPlE12345', invoice='in_InvOicE54321', price='price_pepperoni_large'>",
'tool_result: {"status": "success", "invoice_item": {"id": "ii_pizza2"}}',
"agent: Both pizzas added! Now adding your Soda to the invoice.",
"user_confirmed_tool_run: <user clicks confirm on create_invoice_item tool with customer='cus_ExAmPlE12345', invoice='in_InvOicE54321', price='price_soda'>",
'tool_result: {"status": "success", "invoice_item": {"id": "ii_soda"}}',
"agent: All items added to invoice! Now finalizing to generate the payment link.",
"user_confirmed_tool_run: <user clicks confirm on finalize_invoice tool with invoice='in_InvOicE54321'>",
'tool_result: {"invoice": {"id": "in_InvOicE54321", "status": "open", "hosted_invoice_url": "https://invoice.stripe.com/i/acct_1NBOLuKVZbzw7QA5/test_YWNjdF8xTkJPTHVLVlpienc3UUE1LF9SbGJlZ2xLMjhlc3lxNzFkVVBXbkswOUlFa2Rna1RxLDEzMDAwMTY4Nw0200RKeJRMlw?s=ap", "amount_due": 4848}}',
"agent: Perfect! Your order is complete with 2 Pepperoni Pizzas and 1 Soda. Total: $48.48. Pay securely here: https://invoice.stripe.com/i/acct_1NBOLuKVZbzw7QA5/test_YWNjdF8xTkJPTHVLVlpienc3UUE1LF9SbGJlZ2xLMjhlc3lxNzFkVVBXbkswOUlFa2Rna1RxLDEzMDAwMTY4Nw0200RKeJRMlw?s=ap\\n\\nThank you for ordering from Tony's Pizza Palace!",
]
),
)
food_goals: List[AgentGoal] = [
goal_food_ordering,
]
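
The food ordering goal states that create_invoice_item takes no quantity parameter, so an item ordered twice needs two calls with the same price. The sketch below shows that expansion and the matching total in cents, using the prices from the example conversation. It makes no Stripe calls. The cart layout and both helper names are assumptions for illustration only.

```python
from typing import Any, Dict, List


def expand_cart_to_price_ids(cart: List[Dict[str, Any]]) -> List[str]:
    """Return one price id per unit, i.e. one entry per create_invoice_item call."""
    price_ids: List[str] = []
    for item in cart:
        # Repeat the price id once per unit because quantity cannot be passed.
        price_ids.extend([item["price_id"]] * item["quantity"])
    return price_ids


def cart_total_cents(cart: List[Dict[str, Any]]) -> int:
    """Sum the cart in the smallest currency unit to avoid float rounding."""
    return sum(item["unit_amount"] * item["quantity"] for item in cart)


# Hypothetical cart layout mirroring the example conversation.
cart = [
    {"price_id": "price_pepperoni_large", "unit_amount": 2299, "quantity": 2},
    {"price_id": "price_soda", "unit_amount": 250, "quantity": 1},
]

calls = expand_cart_to_price_ids(cart)
total = cart_total_cents(cart)
```

Here `calls` holds three price ids (two pizzas, one soda) and `total` is 4848 cents, which agrees with the `amount_due` shown in the example invoice.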

97
goals/hr.py Normal file

@@ -0,0 +1,97 @@
from typing import List
import tools.tool_registry as tool_registry
from models.tool_definitions import AgentGoal
starter_prompt_generic = "Welcome me, give me a description of what you can do, then ask me for the details you need to do your job."
goal_hr_schedule_pto = AgentGoal(
id="goal_hr_schedule_pto",
category_tag="hr",
agent_name="Schedule PTO",
agent_friendly_description="Schedule PTO based on your available PTO.",
tools=[
tool_registry.current_pto_tool,
tool_registry.future_pto_calc_tool,
tool_registry.book_pto_tool,
],
description="The user wants to schedule paid time off (PTO) after today's date. To assist with that goal, help the user gather args for these tools in order: "
"1. CurrentPTO: Tell the user how much PTO they currently have "
"2. FuturePTOCalc: Tell the user how much PTO they will have as of the prospective future date "
"3. BookPTO: Book PTO after user types 'yes'",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to schedule some time off",
"agent: Sure! Let's start by determining how much PTO you currently have. May I have your email address?",
"user: bob.johnson@emailzzz.com",
"agent: Great! I can tell you how much PTO you currently have accrued.",
"user_confirmed_tool_run: <user clicks confirm on CurrentPTO tool>",
"tool_result: { 'num_hours': 400, 'num_days': 50 }",
"agent: You have 400 hours, or 50 days, of PTO available. What dates would you like to take your time off? ",
"user: Dec 1 through Dec 5",
"agent: Let's check if you'll have enough PTO accrued by Dec 1 of this year to accommodate that.",
"user_confirmed_tool_run: <user clicks confirm on FuturePTOCalc tool>",
'tool_result: {"enough_pto": True, "pto_hrs_remaining_after": 410}',
"agent: You do in fact have enough PTO to accommodate that, and will have 410 hours remaining after you come back. Do you want to book the PTO? ",
"user: yes ",
"user_confirmed_tool_run: <user clicks confirm on BookPTO tool>",
'tool_result: { "status": "success" }',
"agent: PTO successfully booked! ",
]
),
)
goal_hr_check_pto = AgentGoal(
id="goal_hr_check_pto",
category_tag="hr",
agent_name="Check PTO Amount",
agent_friendly_description="Check your available PTO.",
tools=[
tool_registry.current_pto_tool,
],
description="The user wants to check their paid time off (PTO) after today's date. To assist with that goal, help the user gather args for these tools in order: "
"1. CurrentPTO: Tell the user how much PTO they currently have ",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to check my time off amounts at the current time",
"agent: Sure! I can help you out with that. May I have your email address?",
"user: bob.johnson@emailzzz.com",
"agent: Great! I can tell you how much PTO you currently have accrued.",
"user_confirmed_tool_run: <user clicks confirm on CurrentPTO tool>",
"tool_result: { 'num_hours': 400, 'num_days': 50 }",
"agent: You have 400 hours, or 50 days, of PTO available.",
]
),
)
goal_hr_check_paycheck_bank_integration_status = AgentGoal(
id="goal_hr_check_paycheck_bank_integration_status",
category_tag="hr",
agent_name="Check paycheck deposit status",
agent_friendly_description="Check your integration between your employer and your financial institution.",
tools=[
tool_registry.paycheck_bank_integration_status_check,
],
description="The user wants to check their bank integration used to deposit their paycheck. To assist with that goal, help the user gather args for these tools in order: "
"1. CheckPayBankStatus: Tell the user the status of their paycheck bank integration ",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to check paycheck bank integration",
"agent: Sure! I can help you out with that. May I have your email address?",
"user: bob.johnson@emailzzz.com",
"agent: Great! I can tell you what the status is for your paycheck bank integration.",
"user_confirmed_tool_run: <user clicks confirm on CheckPayBankStatus tool>",
"tool_result: { 'status': connected }",
"agent: Your paycheck bank deposit integration is properly connected.",
]
),
)
hr_goals: List[AgentGoal] = [
goal_hr_schedule_pto,
goal_hr_check_pto,
goal_hr_check_paycheck_bank_integration_status,
]

37
goals/stripe_mcp.py Normal file

@@ -0,0 +1,37 @@
from typing import List
from models.tool_definitions import AgentGoal
from shared.mcp_config import get_stripe_mcp_server_definition
starter_prompt_generic = "Welcome me, give me a description of what you can do, then ask me for the details you need to do your job."
goal_mcp_stripe = AgentGoal(
id="goal_mcp_stripe",
category_tag="mcp-integrations",
agent_name="Stripe MCP Agent",
agent_friendly_description="Manage Stripe operations via MCP",
tools=[], # Will be populated dynamically
mcp_server_definition=get_stripe_mcp_server_definition(included_tools=[]),
description="Help manage Stripe operations for customer and product data by using the customers.read and products.read tools.",
starter_prompt="Welcome! I can help you read Stripe customer and product information.",
example_conversation_history="\n ".join(
[
"agent: Welcome! I can help you read Stripe customer and product information. What would you like to do first?",
"user: what customers are there?",
"agent: I'll check for customers now.",
"user_confirmed_tool_run: <user clicks confirm on customers.read tool>",
'tool_result: { "customers": [{"id": "cus_abc", "name": "Customer A"}, {"id": "cus_xyz", "name": "Customer B"}] }',
"agent: I found two customers: Customer A and Customer B. Can I help with anything else?",
"user: what products exist?",
"agent: Let me get the list of products for you.",
"user_confirmed_tool_run: <user clicks confirm on products.read tool>",
'tool_result: { "products": [{"id": "prod_123", "name": "Gold Plan"}, {"id": "prod_456", "name": "Silver Plan"}] }',
"agent: I found two products: Gold Plan and Silver Plan.",
]
),
)
mcp_goals: List[AgentGoal] = [
goal_mcp_stripe,
]


@@ -1,29 +1,30 @@
from typing import List
import tools.tool_registry as tool_registry
from models.tool_definitions import AgentGoal
from tools.tool_registry import (
search_fixtures_tool,
search_flights_tool,
search_trains_tool,
book_trains_tool,
create_invoice_tool,
find_events_tool,
)
starter_prompt_generic = "Welcome me, give me a description of what you can do, then ask me for the details you need to do your job."
goal_match_train_invoice = AgentGoal(
id="goal_match_train_invoice",
category_tag="travel-trains",
agent_name="UK Premier League Match Trip Booking",
agent_friendly_description="Book a trip to a city in the UK around the dates of a premier league match.",
tools=[
search_fixtures_tool,
search_trains_tool,
book_trains_tool,
create_invoice_tool,
tool_registry.search_fixtures_tool,
tool_registry.search_trains_tool,
tool_registry.book_trains_tool,
tool_registry.create_invoice_tool,
],
description="The user wants to book a trip to a city in the UK around the dates of a premier league match. "
"Help the user find a premier league match to attend, search and book trains for that match and offers to invoice them for the cost of train tickets. "
"The user lives in London. "
"The user lives in London. Premier league fixtures may be mocked data, so don't worry about valid season dates and teams. "
"Gather args for these tools in order, ensuring you move the user from one tool to the next: "
"1. SearchFixtures: Search for fixtures for a team within a specified date range. The user might ask questions about the matches dates and locations to decide on where to go. "
"2. SearchTrains: Search for trains to the city of the match and list them for the customer to choose from "
"2. SearchTrains: Search for trains to the city of the match. Ensure you list them for the customer to choose from "
"3. BookTrains: Book the train tickets, used to invoice the user for the cost of the train tickets "
"4. CreateInvoice: Invoices the user for the cost of train tickets, with total and details inferred from the conversation history ",
starter_prompt="Welcome me, give me a description of what you can do, then ask me for the details you need to begin your job as an agent ",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to travel to a premier league match",
@@ -31,7 +32,7 @@ goal_match_train_invoice = AgentGoal(
"user: Wolves between mid-April and the first week of May, please.",
"agent: Great! Let's find fixtures for Wolverhampton Wanderers FC between 2025-04-15 and 2025-05-05.",
"user_confirmed_tool_run: <user clicks confirm on SearchFixtures tool, passing the full team name as an input>",
'tool_result: SearchFixtures responds in this JSON format and the results are parsed by the agent below {"fixtures": [{"awayTeam": "West Ham United FC","date": "2025-04-01","homeTeam": "Wolverhampton Wanderers FC"},...,'
'tool_result: SearchFixtures responds in this JSON format and the results are parsed by the agent below {"fixtures": [{"awayTeam": "West Ham United FC","date": "2025-04-01","homeTeam": "Wolverhampton Wanderers FC"},...,',
"agent: I found 4 Wolves matches in May 2025: 1. Man City vs Wolves (Away) - May 3 2. Wolves vs Brighton (Home) - May 10 3. Crystal Palace vs Wolves (Away) - May 18 4. Wolves vs Brentford (Home) - May 25 Which match would you like to attend? I'll help you find trains from London.",
"user: How about the Man City match",
"agent: Great! Let's find trains to Manchester and back for the match.",
@@ -51,18 +52,21 @@ goal_match_train_invoice = AgentGoal(
),
)
# unused
goal_event_flight_invoice = AgentGoal(
id="goal_event_flight_invoice",
category_tag="travel-flights",
agent_name="Australia and New Zealand Event Flight Booking",
agent_friendly_description="Book a trip to a city in Australia or New Zealand around the dates of events in that city.",
tools=[
find_events_tool,
search_flights_tool,
create_invoice_tool,
tool_registry.find_events_tool,
tool_registry.search_flights_tool,
tool_registry.create_invoice_tool,
],
description="Help the user gather args for these tools in order: "
"1. FindEvents: Find an event to travel to "
"1. FindEvents: Find an event in Australia or New Zealand to travel to "
"2. SearchFlights: search for a flight around the event dates "
"3. CreateInvoice: Create a simple invoice for the cost of that flight ",
starter_prompt="Welcome me, give me a description of what you can do, then ask me for the details you need to do your job",
starter_prompt=starter_prompt_generic,
example_conversation_history="\n ".join(
[
"user: I'd like to travel to an event",
@@ -85,3 +89,8 @@ goal_event_flight_invoice = AgentGoal(
]
),
)
travel_goals: List[AgentGoal] = [
goal_match_train_invoice,
goal_event_flight_invoice,
]

@@ -1,5 +1,6 @@
from dataclasses import dataclass
from typing import Optional, Deque, Dict, Any, List, Union, Literal
from typing import Any, Deque, Dict, List, Literal, Optional, Union
from models.tool_definitions import AgentGoal
@@ -17,7 +18,7 @@ class CombinedInput:
Message = Dict[str, Union[str, Dict[str, Any]]]
ConversationHistory = Dict[str, List[Message]]
NextStep = Literal["confirm", "question", "done"]
NextStep = Literal["confirm", "question", "pick-new-goal", "done"]
@dataclass
@@ -42,3 +43,15 @@ class ValidationResult:
# Initialize empty dict if None
if self.validationFailedReason is None:
self.validationFailedReason = {}
@dataclass
class EnvLookupInput:
show_confirm_env_var_name: str
show_confirm_default: bool
@dataclass
class EnvLookupOutput:
show_confirm: bool
multi_goal_mode: bool

@@ -1,5 +1,17 @@
from dataclasses import dataclass
from typing import List
from typing import Dict, List, Optional
@dataclass
class MCPServerDefinition:
"""Definition for an MCP (Model Context Protocol) server connection"""
name: str
command: str
args: List[str]
env: Optional[Dict[str, str]] = None
connection_type: str = "stdio"
included_tools: Optional[List[str]] = None
@dataclass
@@ -18,9 +30,12 @@ class ToolDefinition:
@dataclass
class AgentGoal:
id: str
category_tag: str
agent_name: str
agent_friendly_description: str
tools: List[ToolDefinition]
description: str = "Description of the tools purpose and overall goal"
starter_prompt: str = "Initial prompt to start the conversation"
example_conversation_history: str = (
"Example conversation history to help the AI agent understand the context of the conversation"
)
example_conversation_history: str = "Example conversation history to help the AI agent understand the context of the conversation"
mcp_server_definition: Optional[MCPServerDefinition] = None

1992
poetry.lock generated

File diff suppressed because it is too large

@@ -1,21 +1,30 @@
from models.tool_definitions import AgentGoal
from typing import Optional
import json
from typing import Optional
from models.tool_definitions import AgentGoal
MULTI_GOAL_MODE: Optional[bool] = None
def generate_genai_prompt(
agent_goal: AgentGoal, conversation_history: str, raw_json: Optional[str] = None
agent_goal: AgentGoal,
conversation_history: str,
multi_goal_mode: bool,
raw_json: Optional[str] = None,
mcp_tools_info: Optional[dict] = None,
) -> str:
"""
Generates a concise prompt for producing or validating JSON instructions
with the provided tools and conversation history.
"""
prompt_lines = []
set_multi_goal_mode_if_unset(multi_goal_mode)
# Intro / Role
prompt_lines.append(
"You are an AI agent that helps fill required arguments for the tools described below. "
"You must respond with valid JSON ONLY, using the schema provided in the instructions."
"CRITICAL: You must respond with ONLY valid JSON using the exact schema provided. "
"DO NOT include any text before or after the JSON. Your entire response must be parseable JSON."
)
# Main Conversation History
@@ -23,10 +32,12 @@ def generate_genai_prompt(
prompt_lines.append(
"This is the ongoing history to determine which tool and arguments to gather:"
)
prompt_lines.append("BEGIN CONVERSATION HISTORY")
prompt_lines.append("*BEGIN CONVERSATION HISTORY*")
prompt_lines.append(json.dumps(conversation_history, indent=2))
prompt_lines.append("END CONVERSATION HISTORY")
prompt_lines.append("")
prompt_lines.append("*END CONVERSATION HISTORY*")
prompt_lines.append(
"REMINDER: You can use the conversation history to infer arguments for the tools."
)
# Example Conversation History (from agent_goal)
if agent_goal.example_conversation_history:
@@ -39,11 +50,35 @@ def generate_genai_prompt(
prompt_lines.append("END EXAMPLE")
prompt_lines.append("")
# Add MCP server context if present
if agent_goal.mcp_server_definition:
prompt_lines.append("=== MCP Server Information ===")
prompt_lines.append(
f"Connected to MCP Server: {agent_goal.mcp_server_definition.name}"
)
if mcp_tools_info and mcp_tools_info.get("success", False):
tools = mcp_tools_info.get("tools", {})
server_name = mcp_tools_info.get("server_name", "Unknown")
prompt_lines.append(
f"MCP Tools loaded from {server_name} ({len(tools)} tools):"
)
for tool_name, tool_info in tools.items():
prompt_lines.append(
f" - {tool_name}: {tool_info.get('description', 'No description')}"
)
else:
prompt_lines.append("Additional tools available via MCP integration:")
prompt_lines.append("")
# Tools Definitions
prompt_lines.append("=== Tools Definitions ===")
prompt_lines.append(f"There are {len(agent_goal.tools)} available tools:")
prompt_lines.append(", ".join([t.name for t in agent_goal.tools]))
prompt_lines.append(f"Goal: {agent_goal.description}")
prompt_lines.append(
"CRITICAL: You MUST follow the complete sequence described in the Goal above. "
"Do NOT skip steps or assume the goal is complete until ALL steps are done."
)
prompt_lines.append(
"Gather the necessary information for each tool in the sequence described above."
)
@@ -63,27 +98,58 @@ def generate_genai_prompt(
)
# JSON Format Instructions
prompt_lines.append("=== Instructions for JSON Generation ===")
prompt_lines.append("=== CRITICAL: JSON-ONLY RESPONSE FORMAT ===")
prompt_lines.append(
"Your JSON format must be:\n"
"MANDATORY: Your response must be ONLY valid JSON with NO additional text.\n"
"NO explanations, NO comments, NO text before or after the JSON.\n"
"Your entire response must start with '{' and end with '}'.\n\n"
"Required JSON format:\n"
"{\n"
' "response": "<plain text>",\n'
' "next": "<question|confirm|done>",\n'
' "next": "<question|confirm|pick-new-goal|done>",\n'
' "tool": "<tool_name or null>",\n'
' "args": {\n'
' "<arg1>": "<value1 or null>",\n'
' "<arg2>": "<value2 or null>",\n'
" ...\n"
" }\n"
"}"
"}\n\n"
"INVALID EXAMPLE: 'Thank you for providing... {\"response\": ...}'\n"
'VALID EXAMPLE: \'{"response": "Thank you for providing...", "next": ...}\''
)
prompt_lines.append(
"1) If any required argument is missing, set next='question' and ask the user.\n"
"2) If all required arguments are known, set next='confirm' and specify the tool.\n"
" The user will confirm before the tool is run.\n"
"3) If no more tools are needed (user_confirmed_tool_run has been run for all), set next='done' and tool=null.\n"
"4) response should be short and user-friendly.\n"
"5) Don't set next='done' until the final tool has returned user_confirmed_tool_run.\n"
"DECISION LOGIC (follow this exact order):\n"
"1) Do I need to run a tool next?\n"
" - If your response says 'let's get/proceed/check/add/create/finalize...' -> YES, you need a tool\n"
" - If you're announcing what you're about to do -> YES, you need a tool\n"
" - If no more steps needed for current goal -> NO, go to step 3\n\n"
"2) If YES to step 1: Do I have all required arguments?\n"
" - Check tool definition for required args\n"
" - Can I fill missing args from conversation history?\n"
" - Can I use sensible defaults (limit=100, etc.)?\n"
" - If ALL args available/inferrable -> set next='confirm', specify tool and args\n"
" - If missing required args -> set next='question', ask for missing args, tool=null\n\n"
"3) If NO to step 1: Is the entire goal complete?\n"
" - Check Goal description in system prompt - are ALL steps done?\n"
" - Check recent conversation for completion indicators ('finalized', 'complete', etc.)\n"
f" - If complete -> {generate_toolchain_complete_guidance()}\n"
" - If not complete -> identify next needed tool, go to step 2\n\n"
"CRITICAL RULES:\n"
"• RESPOND WITH JSON ONLY - NO TEXT BEFORE OR AFTER THE JSON OBJECT\n"
"• Your response must start with '{' and end with '}' - nothing else\n"
"• NEVER set next='question' without asking an actual question in your response\n"
"• NEVER set tool=null when you're announcing you'll run a specific tool\n"
"• If response contains 'let's proceed to get pricing' -> next='confirm', tool='list_prices'\n"
"• If response contains 'Now adding X' -> next='confirm', tool='create_invoice_item'\n"
"• Use conversation history to infer arguments (customer IDs, product IDs, etc.)\n"
"• Use sensible defaults rather than asking users for technical parameters\n"
"• Carry forward arguments between tools (same customer, same invoice, etc.)\n"
"• If force_confirm='False' in history, be declarative, don't ask permission\n\n"
"EXAMPLES:\n"
"WRONG: response='let\\'s get pricing', next='question', tool=null\n"
"RIGHT: response='let\\'s get pricing', next='confirm', tool='list_prices'\n"
"WRONG: response='adding pizza', next='question', tool='create_invoice_item'\n"
"RIGHT: response='adding pizza', next='confirm', tool='create_invoice_item'\n"
)
# Validation Task (If raw_json is provided)
@@ -100,46 +166,55 @@ def generate_genai_prompt(
# Prompt Start
prompt_lines.append("")
prompt_lines.append("=== FINAL REMINDER ===")
prompt_lines.append("RESPOND WITH VALID JSON ONLY. NO ADDITIONAL TEXT.")
prompt_lines.append("")
if raw_json is not None:
prompt_lines.append("Begin by validating the provided JSON if necessary.")
prompt_lines.append(
"Validate the provided JSON and return ONLY corrected JSON."
)
else:
prompt_lines.append(
"Begin by producing a valid JSON response for the next tool or question."
"Return ONLY a valid JSON response. Start with '{' and end with '}'."
)
return "\n".join(prompt_lines)
def generate_tool_completion_prompt(current_tool: str, dynamic_result: dict) -> str:
"""
Generates a prompt for handling tool completion and determining next steps.
Args:
current_tool: The name of the tool that just completed
dynamic_result: The result data from the tool execution
Returns:
str: A formatted prompt string for the agent to process the tool completion
"""
return (
f"### The '{current_tool}' tool completed successfully with {dynamic_result}. "
"INSTRUCTIONS: Parse this tool result as plain text, and use the system prompt containing the list of tools in sequence and the conversation history (and previous tool_results) to figure out next steps, if any. "
"You will need to use the tool_results to auto-fill arguments for subsequent tools and also to figure out if all tools have been run."
'{"next": "<question|confirm|done>", "tool": "<tool_name or null>", "args": {"<arg1>": "<value1 or null>", "<arg2>": "<value2 or null>}, "response": "<plain text (can include \\n line breaks)>"}'
"ONLY return those json keys (next, tool, args, response), nothing else."
'Next should only be "done" if all tools have been run (use the system prompt to figure that out).'
'Next should be "question" if the tool is not the last one in the sequence.'
'Next should NOT be "confirm" at this point.'
"You will need to use the tool_results to auto-fill arguments for subsequent tools and also to figure out if all tools have been run. "
'{"next": "<question|confirm|pick-new-goal|done>", "tool": "<tool_name or null>", "args": {"<arg1>": "<value1 or null>", "<arg2>": "<value2 or null>"}, "response": "<plain text (can include \\n line breaks)>"}'
"ONLY return those json keys (next, tool, args, response), nothing else. "
'Next should be "question" if the tool is not the last one in the sequence. '
'Next should be "done" if the user is asking to be done with the chat. '
f"{generate_pick_new_goal_guidance()}"
)
def generate_missing_args_prompt(current_tool: str, tool_data: dict, missing_args: list[str]) -> str:
def generate_missing_args_prompt(
current_tool: str, tool_data: dict, missing_args: list[str]
) -> str:
"""
Generates a prompt for handling missing arguments for a tool.
Args:
current_tool: The name of the tool that needs arguments
tool_data: The current tool data containing the response
missing_args: List of argument names that are missing
Returns:
str: A formatted prompt string for requesting missing arguments
"""
@@ -148,3 +223,63 @@ def generate_missing_args_prompt(current_tool: str, tool_data: dict, missing_arg
f"and following missing arguments for tool {current_tool}: {missing_args}. "
"Only provide a valid JSON response without any comments or metadata."
)
def set_multi_goal_mode_if_unset(mode: bool) -> None:
"""
Set multi-goal mode once; later calls leave the stored value unchanged (used to pass the value in from the workflow).
Args:
mode: True for multi-goal mode, False for single-goal mode
Returns:
None
"""
global MULTI_GOAL_MODE
if MULTI_GOAL_MODE is None:
MULTI_GOAL_MODE = mode
def is_multi_goal_mode() -> bool:
"""
Centralized check for whether we're in multi-goal mode.
Args:
None
Returns:
bool: True if in multi-goal mode, false if not
"""
return MULTI_GOAL_MODE
def generate_pick_new_goal_guidance() -> str:
"""
Generates a prompt for guiding the LLM to pick a new goal or be done depending on multi-goal mode.
Args:
None
Returns:
str: A prompt string telling the LLM when to set next to pick-new-goal
"""
if is_multi_goal_mode():
return 'Next should only be "pick-new-goal" if EVERY SINGLE STEP in the Goal description has been completed (check the system prompt Goal section carefully), or the user explicitly requested to pick a new goal. If any step is missing (like customer creation, invoice creation, or payment processing), continue with the next required tool.'
else:
return 'Next should never be "pick-new-goal".'
def generate_toolchain_complete_guidance() -> str:
"""
Generates a prompt for guiding the LLM to handle the end of the toolchain.
Args:
None
Returns:
str: A prompt string prompting the LLM to prompt for a new goal, or be done
"""
if is_multi_goal_mode():
return "If no more tools are needed for the current goal (EVERY step in the Goal description has been completed AND user_confirmed_tool_run has been run for all required tools), set next='pick-new-goal' and tool=null to allow the user to choose their next action."
else:
return "If no more tools are needed (EVERY step in the Goal description has been completed AND user_confirmed_tool_run has been run for all), set next='done' and tool=null."
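
The prompt text above asks the model for a JSON object with the keys `response`, `next`, `tool`, and `args`. As a minimal sketch of how a caller could check a reply against that contract, assuming nothing about how the project itself validates replies (`parse_agent_reply`, `ALLOWED_NEXT`, and `REQUIRED_KEYS` are hypothetical names introduced for illustration):

```python
import json

# Values of "next" named in the prompt template; these constant and function
# names are illustrative and do not come from the diff.
ALLOWED_NEXT = {"question", "confirm", "pick-new-goal", "done"}
REQUIRED_KEYS = {"response", "next", "tool", "args"}


def parse_agent_reply(raw: str, multi_goal_mode: bool) -> dict:
    """Parse a model reply and enforce the schema the prompt asks for."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on non-JSON text
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if data["next"] not in ALLOWED_NEXT:
        raise ValueError(f"unexpected next value: {data['next']!r}")
    if data["next"] == "pick-new-goal" and not multi_goal_mode:
        raise ValueError("pick-new-goal is only valid in multi-goal mode")
    if data["next"] == "confirm" and not data["tool"]:
        raise ValueError("next='confirm' requires a tool name")
    return data
```

A reply such as `Thank you... {"response": ...}` fails at `json.loads`, which matches the INVALID EXAMPLE in the prompt; a reply that announces a tool run but sets `tool` to null fails the last check.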

@@ -1,19 +1,29 @@
[tool.poetry]
name = "temporal-AI-agent"
version = "0.1.0"
name = "temporal_AI_agent"
version = "0.2.0"
description = "Temporal AI Agent"
license = "MIT"
authors = ["Steve Androulakis <steve.androulakis@temporal.io>"]
authors = [
"Steve Androulakis <steve.androulakis@temporal.io>",
"Laine Smith <lainecaseysmith@gmail.com>",
"Joshua Smith <josh.smith@temporal.io>"
]
readme = "README.md"
# By default, Poetry will find packages automatically,
# By default, Poetry will find packages automatically,
# but explicitly including them is fine:
packages = [
{ include = "**/*.py", from = "." }
]
[tool.poetry.urls]
"Bug Tracker" = "https://github.com/temporalio/samples-python/issues"
"Bug Tracker" = "https://github.com/temporal-community/temporal-ai-agent/issues"
[tool.poe.tasks]
format = [{cmd = "black ."}, {cmd = "isort ."}]
lint = [{cmd = "black --check ."}, {cmd = "isort --check-only ."}, {ref = "lint-types" }]
lint-types = "mypy --check-untyped-defs --namespace-packages ."
test = "pytest"
[tool.poetry.dependencies]
python = ">=3.10,<4.0"
@@ -21,25 +31,43 @@ temporalio = "^1.8.0"
# Standard library modules (e.g. asyncio, collections) don't need to be added
# since they're built-in for Python 3.8+.
ollama = "^0.4.5"
litellm = "^1.70.0"
pyyaml = "^6.0.2"
fastapi = "^0.115.6"
uvicorn = "^0.34.0"
python-dotenv = "^1.0.1"
openai = "^1.59.2"
stripe = "^11.4.1"
google-generativeai = "^0.8.4"
anthropic = "0.47.0"
deepseek = "^1.0.0"
requests = "^2.32.3"
pandas = "^2.2.3"
stripe = "^11.4.1"
gtfs-kit = "^10.1.1"
fastmcp = "^2.7.0"
[tool.poetry.group.dev.dependencies]
pytest = "^7.3"
pytest = ">=8.2"
pytest-asyncio = "^0.26.0"
black = "^23.7"
isort = "^5.12"
mypy = "^1.16.0"
[build-system]
requires = ["poetry-core>=1.4.0"]
build-backend = "poetry.core.masonry.api"
build-backend = "poetry.core.masonry.api"
[tool.pytest.ini_options]
asyncio_mode = "auto"
log_cli = true
log_cli_level = "INFO"
log_cli_format = "%(asctime)s [%(levelname)8s] %(message)s (%(filename)s:%(lineno)s)"
asyncio_default_fixture_loop_scope = "function"
norecursedirs = ["vibe"]
[tool.mypy]
python_version = "3.10"
ignore_missing_imports = true
check_untyped_defs = true
namespace_packages = true
explicit_package_bases = true
ignore_errors = true
[tool.isort]
profile = "black"

@@ -1,12 +1,12 @@
import asyncio
from shared.config import get_temporal_client
from workflows.agent_goal_workflow import AgentGoalWorkflow
async def main():
# Create client connected to server at the given address
client = await Client.connect("localhost:7233")
client = await get_temporal_client()
workflow_id = "agent-workflow"

@@ -1,8 +1,9 @@
from tools.search_events import find_events
import json
from tools.search_flights import search_flights
# Example usage
if __name__ == "__main__":
search_args = {"city": "Sydney", "month": "July"}
results = find_events(search_args)
results = search_flights(search_args)
print(json.dumps(results, indent=2))

@@ -1,6 +1,7 @@
from tools.search_flights import search_flights
import json
from tools.search_flights import search_flights
if __name__ == "__main__":
# Suppose user typed "new" for New York, "lon" for London
flights = search_flights("London", "JFK", "2025-01-15", "2025-01-23")

@@ -1,12 +1,10 @@
import asyncio
import concurrent.futures
from temporalio.worker import Worker
from activities.tool_activities import dynamic_tool_activity
from shared.config import get_temporal_client, TEMPORAL_LEGACY_TASK_QUEUE
from shared.config import TEMPORAL_LEGACY_TASK_QUEUE, get_temporal_client
async def main():
@@ -24,7 +22,9 @@ async def main():
activity_executor=activity_executor,
)
print(f"Starting legacy worker, connecting to task queue: {TEMPORAL_LEGACY_TASK_QUEUE}")
print(
f"Starting legacy worker, connecting to task queue: {TEMPORAL_LEGACY_TASK_QUEUE}"
)
await worker.run()

@@ -1,23 +0,0 @@
from ollama import chat, ChatResponse
def main():
model_name = "mistral"
# The messages to pass to the model
messages = [
{
"role": "user",
"content": "Why is the sky blue?",
}
]
# Call ollama's chat function
response: ChatResponse = chat(model=model_name, messages=messages)
# Print the full message content
print(response.message.content)
if __name__ == "__main__":
main()

@@ -1,33 +1,41 @@
import asyncio
import concurrent.futures
import logging
import os
from dotenv import load_dotenv
from dotenv import load_dotenv
from temporalio.worker import Worker
from activities.tool_activities import ToolActivities, dynamic_tool_activity
from activities.tool_activities import (
ToolActivities,
dynamic_tool_activity,
mcp_list_tools,
)
from shared.config import TEMPORAL_TASK_QUEUE, get_temporal_client
from shared.mcp_client_manager import MCPClientManager
from workflows.agent_goal_workflow import AgentGoalWorkflow
from shared.config import get_temporal_client, TEMPORAL_TASK_QUEUE
async def main():
# Load environment variables
load_dotenv(override=True)
# Print LLM configuration info
llm_provider = os.environ.get("LLM_PROVIDER", "openai").lower()
print(f"Worker will use LLM provider: {llm_provider}")
llm_model = os.environ.get("LLM_MODEL", "openai/gpt-4")
print(f"Worker will use LLM model: {llm_model}")
# Create shared MCP client manager
mcp_client_manager = MCPClientManager()
# Create the client
client = await get_temporal_client()
# Initialize the activities class once with the specified LLM provider
activities = ToolActivities()
print(f"ToolActivities initialized with LLM provider: {llm_provider}")
# Initialize the activities class with injected manager
activities = ToolActivities(mcp_client_manager)
print(f"ToolActivities initialized with LLM model: {llm_model}")
# If using Ollama, pre-load the model to avoid cold start latency
if llm_provider == "ollama":
if llm_model.startswith("ollama"):
print("\n======== OLLAMA MODEL INITIALIZATION ========")
print("Ollama models need to be loaded into memory on first use.")
print("This may take 30+ seconds depending on your hardware and model size.")
@@ -48,23 +56,33 @@ async def main():
print("===========================================================\n")
print("Worker ready to process tasks!")
logging.basicConfig(level=logging.INFO)
# Run the worker
with concurrent.futures.ThreadPoolExecutor(max_workers=100) as activity_executor:
worker = Worker(
client,
task_queue=TEMPORAL_TASK_QUEUE,
workflows=[AgentGoalWorkflow],
activities=[
activities.agent_validatePrompt,
activities.agent_toolPlanner,
dynamic_tool_activity,
],
activity_executor=activity_executor,
)
# Run the worker with proper cleanup
try:
with concurrent.futures.ThreadPoolExecutor(
max_workers=100
) as activity_executor:
worker = Worker(
client,
task_queue=TEMPORAL_TASK_QUEUE,
workflows=[AgentGoalWorkflow],
activities=[
activities.agent_validatePrompt,
activities.agent_toolPlanner,
activities.get_wf_env_vars,
activities.mcp_tool_activity,
dynamic_tool_activity,
mcp_list_tools,
],
activity_executor=activity_executor,
)
print(f"Starting worker, connecting to task queue: {TEMPORAL_TASK_QUEUE}")
await worker.run()
print(f"Starting worker, connecting to task queue: {TEMPORAL_TASK_QUEUE}")
await worker.run()
finally:
# Cleanup MCP connections when worker shuts down
await mcp_client_manager.cleanup()
if __name__ == "__main__":

@@ -5,7 +5,6 @@ from shared.config import get_temporal_client
async def main():
# Connect to Temporal and signal the workflow
client = await get_temporal_client()

@@ -1,4 +1,5 @@
import os
from dotenv import load_dotenv
from temporalio.client import Client
from temporalio.service import TLSConfig
@@ -9,7 +10,9 @@ load_dotenv(override=True)
TEMPORAL_ADDRESS = os.getenv("TEMPORAL_ADDRESS", "localhost:7233")
TEMPORAL_NAMESPACE = os.getenv("TEMPORAL_NAMESPACE", "default")
TEMPORAL_TASK_QUEUE = os.getenv("TEMPORAL_TASK_QUEUE", "agent-task-queue")
TEMPORAL_LEGACY_TASK_QUEUE = os.getenv("TEMPORAL_LEGACY_TASK_QUEUE", "agent-task-queue-legacy")
TEMPORAL_LEGACY_TASK_QUEUE = os.getenv(
"TEMPORAL_LEGACY_TASK_QUEUE", "agent-task-queue-legacy"
)
# Authentication settings
TEMPORAL_TLS_CERT = os.getenv("TEMPORAL_TLS_CERT", "")

@@ -0,0 +1,167 @@
import asyncio
from contextlib import asynccontextmanager
from typing import TYPE_CHECKING, Any, Dict, Tuple
from temporalio import activity
from models.tool_definitions import MCPServerDefinition
# Import MCP client libraries
if TYPE_CHECKING:
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
else:
try:
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
except ImportError:
# Fallback if MCP not installed
ClientSession = None
StdioServerParameters = None
stdio_client = None
class MCPClientManager:
"""Manages pooled MCP client connections for reuse across tool calls"""
def __init__(self):
self._clients: Dict[str, Any] = {}
self._connections: Dict[str, Tuple[Any, Any, Any]] = {}
self._lock = asyncio.Lock()
async def get_client(
self, server_def: MCPServerDefinition | Dict[str, Any] | None
) -> Any:
"""Return existing client or create new one, keyed by server name, command, and args"""
async with self._lock:
key = self._get_server_key(server_def)
if key not in self._clients:
await self._create_client(server_def, key)
activity.logger.info(
f"Created new MCP client for {self._get_server_name(server_def)}"
)
else:
activity.logger.info(
f"Reusing existing MCP client for {self._get_server_name(server_def)}"
)
return self._clients[key]
def _get_server_key(
self, server_def: MCPServerDefinition | Dict[str, Any] | None
) -> str:
"""Generate unique key for server definition"""
if server_def is None:
return "default:python:server.py"
# Handle both MCPServerDefinition objects and dicts (from Temporal serialization)
if isinstance(server_def, dict):
name = server_def.get("name", "default")
command = server_def.get("command", "python")
args = server_def.get("args", ["server.py"])
else:
name = server_def.name
command = server_def.command
args = server_def.args
return f"{name}:{command}:{':'.join(args)}"
def _get_server_name(
self, server_def: MCPServerDefinition | Dict[str, Any] | None
) -> str:
"""Get server name for logging"""
if server_def is None:
return "default"
if isinstance(server_def, dict):
return server_def.get("name", "default")
else:
return server_def.name
def _build_connection(
self, server_def: MCPServerDefinition | Dict[str, Any] | None
) -> Dict[str, Any]:
"""Build connection parameters from MCPServerDefinition or dict"""
if server_def is None:
# Default to stdio connection with the main server
return {
"type": "stdio",
"command": "python",
"args": ["server.py"],
"env": {},
}
# Handle both MCPServerDefinition objects and dicts (from Temporal serialization)
if isinstance(server_def, dict):
return {
"type": server_def.get("connection_type", "stdio"),
"command": server_def.get("command", "python"),
"args": server_def.get("args", ["server.py"]),
"env": server_def.get("env", {}) or {},
}
return {
"type": server_def.connection_type,
"command": server_def.command,
"args": server_def.args,
"env": server_def.env or {},
}
@asynccontextmanager
async def _stdio_connection(self, command: str, args: list, env: dict):
"""Create stdio connection to MCP server"""
if stdio_client is None:
raise Exception("MCP client libraries not available")
# Create server parameters
server_params = StdioServerParameters(command=command, args=args, env=env)
async with stdio_client(server_params) as (read, write):
yield read, write
async def _create_client(
self, server_def: MCPServerDefinition | Dict[str, Any] | None, key: str
):
"""Create and store new client connection"""
connection = self._build_connection(server_def)
if connection["type"] == "stdio":
# Create stdio connection
connection_manager = self._stdio_connection(
command=connection.get("command", "python"),
args=connection.get("args", ["server.py"]),
env=connection.get("env", {}),
)
# Enter the connection context
read, write = await connection_manager.__aenter__()
# Create and initialize client session
session = ClientSession(read, write)
await session.initialize()
# Store both the session and connection manager for cleanup
self._clients[key] = session
self._connections[key] = (connection_manager, read, write)
else:
raise Exception(f"Unsupported connection type: {connection['type']}")
async def cleanup(self):
"""Close all connections gracefully"""
async with self._lock:
# Close all client sessions
for session in self._clients.values():
try:
await session.close()
except Exception as e:
activity.logger.warning(f"Error closing MCP session: {e}")
# Exit all connection contexts
for connection_manager, read, write in self._connections.values():
try:
await connection_manager.__aexit__(None, None, None)
except Exception as e:
activity.logger.warning(f"Error closing MCP connection: {e}")
self._clients.clear()
self._connections.clear()
activity.logger.info("All MCP connections closed")
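
The manager above follows a get-or-create pattern: one client per server key, created under a lock and closed together at shutdown. The same idea can be sketched with only the standard library, using `contextlib.AsyncExitStack` to hold the open contexts instead of calling `__aenter__` and `__aexit__` by hand. This is an illustration under stated assumptions, not the project's code: `KeyedPool`, `fake_connection`, and `demo` are hypothetical names, and whether an exit stack suits the real MCP stdio client is not verified here.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager
from typing import Any, Callable, Dict, List

# Records of open/close events so the behaviour can be observed.
opened: List[str] = []
closed: List[str] = []


@asynccontextmanager
async def fake_connection(key: str):
    """Stand-in for a real server connection (hypothetical)."""
    opened.append(key)
    try:
        yield {"server": key}
    finally:
        closed.append(key)


class KeyedPool:
    """Create at most one resource per key and close them all together."""

    def __init__(self, open_resource: Callable[[str], Any]) -> None:
        self._open = open_resource  # callable(key) -> async context manager
        self._items: Dict[str, Any] = {}
        self._stack = AsyncExitStack()
        self._lock = asyncio.Lock()

    async def get(self, key: str) -> Any:
        async with self._lock:
            if key not in self._items:
                # The exit stack remembers how to close this resource later.
                self._items[key] = await self._stack.enter_async_context(
                    self._open(key)
                )
            return self._items[key]

    async def cleanup(self) -> None:
        async with self._lock:
            await self._stack.aclose()  # exits contexts in reverse order
            self._items.clear()


async def demo() -> bool:
    pool = KeyedPool(fake_connection)
    first = await pool.get("stripe-mcp")
    second = await pool.get("stripe-mcp")  # reused, not reopened
    await pool.cleanup()
    return first is second
```

Running `asyncio.run(demo())` opens the fake connection once, returns the same object for both lookups, and closes it during cleanup.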

27
shared/mcp_config.py Normal file
@@ -0,0 +1,27 @@
import os
from models.tool_definitions import MCPServerDefinition
def get_stripe_mcp_server_definition(included_tools: list[str]) -> MCPServerDefinition:
"""
Returns a Stripe MCP server definition with customizable included tools.
Args:
included_tools: List of tool names to include from the Stripe MCP server
Returns:
MCPServerDefinition configured for Stripe
"""
return MCPServerDefinition(
name="stripe-mcp",
command="npx",
args=[
"-y",
"@stripe/mcp",
"--tools=all",
f"--api-key={os.getenv('STRIPE_API_KEY')}",
],
env=None,
included_tools=included_tools,
)
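
One detail worth noting in the definition above: `os.getenv('STRIPE_API_KEY')` returns `None` when the variable is unset, so the f-string would produce `--api-key=None`. A minimal sketch of a fail-fast alternative follows; `build_stripe_mcp_args` is a hypothetical helper, not part of this commit, and it only reuses the argument values already shown in the diff.

```python
import os
from typing import List, Mapping, Optional


def build_stripe_mcp_args(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Build the npx argument list, failing fast when the key is absent."""
    source = os.environ if env is None else env
    api_key = source.get("STRIPE_API_KEY")
    if not api_key:
        # Surface the misconfiguration here rather than passing "None" to the server.
        raise RuntimeError("STRIPE_API_KEY is not set")
    return ["-y", "@stripe/mcp", "--tools=all", f"--api-key={api_key}"]
```

Passing a mapping instead of reading `os.environ` directly keeps the helper easy to exercise in tests without touching the real environment.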

350
tests/README.md Normal file
@@ -0,0 +1,350 @@
# Temporal AI Agent - Testing Guide
This directory contains the tests for the Temporal AI Agent project. They cover workflows, activities, and integration scenarios using Temporal's testing framework.
## Test Structure
```
tests/
├── README.md # This file - testing documentation
├── conftest.py # Test configuration and fixtures
├── test_agent_goal_workflow.py # Workflow tests
├── test_tool_activities.py # Activity tests
└── workflowtests/ # Legacy workflow tests
└── agent_goal_workflow_test.py
```
## Test Types
### 1. Workflow Tests (`test_agent_goal_workflow.py`)
Tests the main `AgentGoalWorkflow` class covering:
- **Workflow Initialization**: Basic workflow startup and state management
- **Signal Handling**: Testing user_prompt, confirm, end_chat signals
- **Query Methods**: Testing all workflow query endpoints
- **State Management**: Conversation history, goal changes, tool data
- **Validation Flow**: Prompt validation and error handling
- **Tool Execution Flow**: Confirmation and tool execution cycles
### 2. Activity Tests (`test_tool_activities.py`)
Tests the `ToolActivities` class and `dynamic_tool_activity` function:
- **LLM Integration**: Testing agent_toolPlanner with mocked LLM responses
- **Validation Logic**: Testing agent_validatePrompt with various scenarios
- **Environment Configuration**: Testing get_wf_env_vars with different env setups
- **JSON Processing**: Testing response parsing and sanitization
- **Dynamic Tool Execution**: Testing the dynamic activity dispatcher
- **Integration**: End-to-end activity execution in Temporal workers
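
To illustrate what response sanitization can involve, here is a minimal sketch that pulls a JSON object out of a reply with extra text around it. It is not the project's implementation; `extract_json_object` is a hypothetical helper, and it assumes the surrounding text contains no braces of its own.

```python
import json


def extract_json_object(text: str) -> dict:
    """Return the JSON object embedded in text, ignoring anything around it."""
    start = text.find("{")   # first opening brace
    end = text.rfind("}")    # last closing brace
    if start == -1 or end < start:
        raise ValueError("no JSON object found")
    # json.loads raises json.JSONDecodeError (a ValueError) if the slice is not valid JSON.
    return json.loads(text[start : end + 1])
```

A test for this kind of helper would typically feed it replies with leading chatter, trailing chatter, and no JSON at all, and assert on the parsed result or the raised error.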
### 3. Test Configuration (`conftest.py`)
Provides shared test fixtures and configuration:
- **Temporal Environment**: Local and time-skipping test environments
- **Sample Data**: Pre-configured agent goals, conversation history, inputs
- **Test Client**: Configured Temporal client for testing
## Running Tests
### Prerequisites
Ensure you have the required dependencies installed:
```bash
poetry install --with dev
```
### Basic Test Execution
Run all tests:
```bash
poetry run pytest
```
Run specific test files:
```bash
# Workflow tests only
poetry run pytest tests/test_agent_goal_workflow.py
# Activity tests only
poetry run pytest tests/test_tool_activities.py
# Legacy tests
poetry run pytest tests/workflowtests/
```
Run with verbose output:
```bash
poetry run pytest -v
```
### Test Environment Options
The tests support different Temporal environments via the `--workflow-environment` flag:
#### Local Environment (Default)
Starts a local Temporal dev server for the test session:
```bash
poetry run pytest --workflow-environment=local
```
#### Time-Skipping Environment
Uses Temporal's time-skipping test environment for faster execution:
```bash
poetry run pytest --workflow-environment=time-skipping
```
#### External Server
Connect to an existing Temporal server:
```bash
poetry run pytest --workflow-environment=localhost:7233
```
#### Setup Script for AI Agent environments such as OpenAI Codex
```bash
export SHELL=/bin/bash
curl -sSL https://install.python-poetry.org | python3 -
export PATH="$HOME/.local/bin:$PATH"
ls
poetry install --with dev
cd frontend
npm install
cd ..
# Pre-download the temporal test server binary
poetry run python3 -c "
import asyncio
import sys
from temporalio.testing import WorkflowEnvironment

async def predownload():
    try:
        print('Starting test server download...')
        env = await WorkflowEnvironment.start_time_skipping()
        print('Test server downloaded and started successfully')
        await env.shutdown()
        print('Test server shut down successfully')
    except Exception as e:
        print(f'Error during download: {e}')
        sys.exit(1)

asyncio.run(predownload())
"
```
### Filtering Tests
Run tests by pattern:
```bash
# Run only validation tests
poetry run pytest -k "validation"
# Run only workflow tests
poetry run pytest -k "workflow"
# Run only activity tests
poetry run pytest -k "activity"
```
Run tests by marker (if you add custom markers):
```bash
# Run only integration tests
poetry run pytest -m integration
# Skip slow tests
poetry run pytest -m "not slow"
```
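The `integration` and `slow` markers used above are examples; nothing in this guide shows them being registered, and pytest emits a warning for marks it does not know about. A minimal sketch of registering them, assuming you keep those two names and add the hook to `tests/conftest.py` (the descriptions are illustrative):

```python
# Hypothetical addition to tests/conftest.py: register the custom markers
# used in the commands above so pytest does not warn about unknown marks.
CUSTOM_MARKERS = {
    "integration": "tests that run activities or workflows inside a Temporal worker",
    "slow": "tests that take noticeably longer than the rest of the suite",
}


def pytest_configure(config):
    """pytest hook: called once at startup with the active Config object."""
    for name, description in CUSTOM_MARKERS.items():
        # Each line has the form "<marker>: <description>".
        config.addinivalue_line("markers", f"{name}: {description}")
```

Tests can then be tagged with `@pytest.mark.integration` or `@pytest.mark.slow` and selected with the `-m` commands shown above.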
## Test Configuration
### Test Discovery
The `vibe/` directory is excluded from test collection to avoid conflicts with sample tests. This is configured in `pyproject.toml`:
```toml
[tool.pytest.ini_options]
norecursedirs = ["vibe"]
```
### Environment Variables
Tests respect the following environment variables:
- `LLM_MODEL`: Model to use for LLM testing (defaults to "openai/gpt-4")
- `LLM_KEY`: API key for LLM service
- `LLM_BASE_URL`: Custom base URL for LLM service
- `SHOW_CONFIRM`: Whether to show confirmation dialogs
- `AGENT_GOAL`: Default agent goal setting
### Mocking Strategy
The tests use extensive mocking to avoid external dependencies:
- **LLM Calls**: Mocked using `unittest.mock` to avoid actual API calls
- **Tool Handlers**: Mocked to test workflow logic without tool execution
- **Environment Variables**: Patched for consistent test environments
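As an illustration of the environment-variable patching mentioned above, here is a small sketch using `unittest.mock.patch.dict`. The `read_llm_settings` helper is hypothetical and stands in for whatever project code reads `LLM_MODEL` and `LLM_KEY`; only the patching pattern is the point.

```python
import os
from unittest.mock import patch


def read_llm_settings() -> dict:
    """Hypothetical helper that reads the variables listed above."""
    return {
        "model": os.environ.get("LLM_MODEL", "openai/gpt-4"),
        "key": os.environ.get("LLM_KEY"),
    }


def test_llm_settings_are_patched():
    fake_env = {"LLM_MODEL": "fake/model", "LLM_KEY": "test-key"}
    # clear=True hides variables from the developer's shell for the duration
    # of the block; the real environment is restored on exit.
    with patch.dict(os.environ, fake_env, clear=True):
        settings = read_llm_settings()
    assert settings == {"model": "fake/model", "key": "test-key"}
```

Patching `os.environ` this way keeps the test independent of whatever happens to be exported in the shell or `.env` file of the machine running it.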
## Writing New Tests
### Test Naming Convention
- Test files: `test_<module_name>.py`
- Test classes: `Test<ClassName>`
- Test methods: `test_<functionality>_<scenario>`
Example:
```python
class TestAgentGoalWorkflow:
    async def test_user_prompt_signal_valid_input(self, client, sample_combined_input):
        # Test implementation
        pass
```
### Using Fixtures
Leverage the provided fixtures for consistent test data:
```python
async def test_my_workflow(self, client, sample_agent_goal, sample_conversation_history):
    # client: Temporal test client
    # sample_agent_goal: Pre-configured AgentGoal
    # sample_conversation_history: Sample conversation data
    pass
```
### Mocking External Dependencies
Always mock external services:
```python
from unittest.mock import patch


@patch('activities.tool_activities.completion')
async def test_llm_integration(self, mock_completion):
    # Shape the mock like an LLM response: choices[0].message.content
    mock_completion.return_value.choices[0].message.content = '{"test": "response"}'
    # Test implementation
```
### Testing Workflow Signals and Queries
```python
async def test_workflow_signal(self, client, sample_combined_input):
    # Assumes a Worker running AgentGoalWorkflow is already polling
    # task_queue_name (see "Test Isolation" below).
    # Start workflow
    handle = await client.start_workflow(
        AgentGoalWorkflow.run,
        sample_combined_input,
        id=str(uuid.uuid4()),
        task_queue=task_queue_name,
    )
    # Send signal
    await handle.signal(AgentGoalWorkflow.user_prompt, "test message")
    # Query state
    conversation = await handle.query(AgentGoalWorkflow.get_conversation_history)
    # End workflow
    await handle.signal(AgentGoalWorkflow.end_chat)
    result = await handle.result()
```
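A signal call returns once the server has accepted the signal, not once the workflow has acted on it, so a query issued immediately afterwards can still see the old state. One way to avoid fixed `asyncio.sleep` delays is to poll the query until it shows the expected state. The `poll_until` helper below is a hypothetical sketch, not part of the project or the Temporal SDK:

```python
import asyncio
import time
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def poll_until(
    fetch: Callable[[], Awaitable[T]],
    predicate: Callable[[T], bool],
    timeout: float = 5.0,
    interval: float = 0.1,
) -> T:
    """Call `fetch` repeatedly until `predicate(value)` is true or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        value = await fetch()
        if predicate(value):
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"condition not met within {timeout}s; last value: {value!r}"
            )
        await asyncio.sleep(interval)
```

In a test this might be used as `tool_data = await poll_until(lambda: handle.query(AgentGoalWorkflow.get_latest_tool_data), lambda d: d is not None)`, which fails with a clear message if the workflow never reaches the expected state.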
## Test Data and Fixtures
### Sample Agent Goal
The `sample_agent_goal` fixture provides a basic agent goal with:
- Goal ID: "test_goal"
- One test tool with a required string argument
- Suitable for most workflow testing scenarios
### Sample Conversation History
The `sample_conversation_history` fixture provides:
- Basic user and agent message exchange
- Proper message format for testing
### Sample Combined Input
The `sample_combined_input` fixture provides:
- Complete workflow input with agent goal and tool params
- Conversation summary and prompt queue
- Ready for workflow execution
## Debugging Tests
### Verbose Logging
Enable detailed logging:
```bash
poetry run pytest --log-cli-level=DEBUG -s
```
### Temporal Web UI
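When tests run against a Temporal dev server you started yourself (for example with `temporal server start-dev`, targeted via `--workflow-environment=localhost:7233`), the Temporal Web UI is available at http://localhost:8233 by default, where you can inspect workflow executions during tests. The server that the `local` option starts for the test session chooses its own port and may not expose the UI.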
### Test Isolation
Each test uses unique task queue names to prevent interference:
```python
task_queue_name = str(uuid.uuid4())
```
## Continuous Integration
### GitHub Actions Example
```yaml
name: Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - run: pip install poetry
      - run: poetry install --with dev
      - run: poetry run pytest --workflow-environment=time-skipping
```
### Test Coverage
Generate coverage reports:
```bash
poetry add --group dev pytest-cov
poetry run pytest --cov=workflows --cov=activities --cov-report=html
```
## Best Practices
1. **Mock External Dependencies**: Always mock LLM calls, file I/O, and network requests
2. **Use Time-Skipping**: For CI/CD, prefer time-skipping environment for speed
3. **Unique Identifiers**: Use UUIDs for workflow IDs and task queues
4. **Clean Shutdown**: Always end workflows properly in tests
5. **Descriptive Names**: Use clear, descriptive test names
6. **Test Edge Cases**: Include error scenarios and validation failures
7. **Keep Tests Fast**: Use mocks to avoid slow external calls
8. **Isolate Tests**: Ensure tests don't depend on each other
## Troubleshooting
### Common Issues
1. **Workflow Timeout**: Increase timeouts or use time-skipping environment
2. **Mock Not Working**: Check patch decorators and import paths
3. **Test Hanging**: Ensure workflows are properly ended with signals
4. **Environment Issues**: Check environment variable settings
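For the hanging-test case in particular, it can help to bound the wait on the workflow result so that a missing `end_chat` signal shows up as a failure rather than a stalled run. The `result_within` helper below is a hypothetical sketch built on `asyncio.wait_for`; the name and default timeout are assumptions.

```python
import asyncio
from typing import Awaitable, TypeVar

T = TypeVar("T")


async def result_within(awaitable: Awaitable[T], seconds: float = 10.0) -> T:
    """Await `awaitable`, failing with a clear message instead of hanging."""
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError:
        raise AssertionError(
            f"no result after {seconds}s; was the workflow sent end_chat?"
        ) from None
```

In a test, `result = await result_within(handle.result())` would replace a bare `await handle.result()`.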
### Getting Help
- Check Temporal Python SDK documentation
- Review existing test patterns in the codebase
- Use `poetry run pytest --collect-only` to verify test discovery
- Run with `-v` flag for detailed output
## Legacy Tests
The `workflowtests/` directory contains legacy tests. New tests should be added to the main `tests/` directory following the patterns established in this guide.

0
tests/__init__.py Normal file

111
tests/conftest.py Normal file

@@ -0,0 +1,111 @@
import asyncio
import multiprocessing
import sys
from typing import AsyncGenerator

import pytest
import pytest_asyncio
from temporalio.client import Client
from temporalio.testing import WorkflowEnvironment

# Due to https://github.com/python/cpython/issues/77906, multiprocessing on
# macOS starting with Python 3.8 has changed from "fork" to "spawn". For
# pre-3.8, we are changing it for them.
if sys.version_info < (3, 8) and sys.platform.startswith("darwin"):
    multiprocessing.set_start_method("spawn", True)


def pytest_addoption(parser):
    parser.addoption(
        "--workflow-environment",
        default="local",
        help="Which workflow environment to use ('local', 'time-skipping', or target to existing server)",
    )


@pytest.fixture(scope="session")
def event_loop():
    # See https://github.com/pytest-dev/pytest-asyncio/issues/68
    # See https://github.com/pytest-dev/pytest-asyncio/issues/257
    # Also need ProactorEventLoop on older versions of Python with Windows so
    # that asyncio subprocess works properly
    if sys.version_info < (3, 8) and sys.platform == "win32":
        loop = asyncio.ProactorEventLoop()
    else:
        loop = asyncio.get_event_loop_policy().new_event_loop()
    yield loop
    loop.close()


@pytest_asyncio.fixture(scope="session")
async def env(request) -> AsyncGenerator[WorkflowEnvironment, None]:
    env_type = request.config.getoption("--workflow-environment")
    if env_type == "local":
        env = await WorkflowEnvironment.start_local(
            dev_server_extra_args=[
                "--dynamic-config-value",
                "frontend.enableExecuteMultiOperation=true",
            ]
        )
    elif env_type == "time-skipping":
        env = await WorkflowEnvironment.start_time_skipping()
    else:
        env = WorkflowEnvironment.from_client(await Client.connect(env_type))
    yield env
    await env.shutdown()


@pytest_asyncio.fixture
async def client(env: WorkflowEnvironment) -> Client:
    return env.client


@pytest.fixture
def sample_agent_goal():
    """Sample agent goal for testing."""
    from models.tool_definitions import AgentGoal, ToolArgument, ToolDefinition

    return AgentGoal(
        id="test_goal",
        category_tag="test",
        agent_name="TestAgent",
        agent_friendly_description="A test agent for testing purposes",
        description="Test goal for agent testing",
        tools=[
            ToolDefinition(
                name="TestTool",
                description="A test tool for testing purposes",
                arguments=[
                    ToolArgument(
                        name="test_arg", type="string", description="A test argument"
                    )
                ],
            )
        ],
    )


@pytest.fixture
def sample_conversation_history():
    """Sample conversation history for testing."""
    return {
        "messages": [
            {"actor": "user", "response": "Hello, I need help with testing"},
            {"actor": "agent", "response": "I can help you with that"},
        ]
    }


@pytest.fixture
def sample_combined_input(sample_agent_goal):
    """Sample combined input for workflow testing."""
    from collections import deque

    from models.data_types import AgentGoalWorkflowParams, CombinedInput

    tool_params = AgentGoalWorkflowParams(
        conversation_summary="Test conversation summary",
        prompt_queue=deque(),  # Start with empty queue for most tests
    )
    return CombinedInput(agent_goal=sample_agent_goal, tool_params=tool_params)


@@ -0,0 +1,543 @@
import uuid

from temporalio import activity
from temporalio.client import Client
from temporalio.worker import Worker

from models.data_types import (
    AgentGoalWorkflowParams,
    CombinedInput,
    EnvLookupInput,
    EnvLookupOutput,
    ToolPromptInput,
    ValidationInput,
    ValidationResult,
)
from workflows.agent_goal_workflow import AgentGoalWorkflow


class TestAgentGoalWorkflow:
    """Test cases for AgentGoalWorkflow."""
    async def test_workflow_initialization(
        self, client: Client, sample_combined_input: CombinedInput
    ):
        """Test workflow can be initialized and started."""
        task_queue_name = str(uuid.uuid4())

        # Create mock activity functions with proper signatures
        @activity.defn(name="get_wf_env_vars")
        async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
            return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

        async with Worker(
            client,
            task_queue=task_queue_name,
            workflows=[AgentGoalWorkflow],
            activities=[mock_get_wf_env_vars],
        ):
            # Start workflow but don't wait for completion since it runs indefinitely
            handle = await client.start_workflow(
                AgentGoalWorkflow.run,
                sample_combined_input,
                id=str(uuid.uuid4()),
                task_queue=task_queue_name,
            )

            # Verify workflow is running
            assert handle is not None

            # Query the workflow to check initial state
            conversation_history = await handle.query(
                AgentGoalWorkflow.get_conversation_history
            )
            assert isinstance(conversation_history, dict)
            assert "messages" in conversation_history

            # Test goal query
            agent_goal = await handle.query(AgentGoalWorkflow.get_agent_goal)
            assert agent_goal == sample_combined_input.agent_goal

            # End the workflow
            await handle.signal(AgentGoalWorkflow.end_chat)
            result = await handle.result()
            assert isinstance(result, str)
    async def test_user_prompt_signal(
        self, client: Client, sample_combined_input: CombinedInput
    ):
        """Test user_prompt signal handling."""
        task_queue_name = str(uuid.uuid4())

        # Create mock activity functions with proper signatures
        @activity.defn(name="get_wf_env_vars")
        async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
            return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

        @activity.defn(name="agent_validatePrompt")
        async def mock_agent_validatePrompt(
            validation_input: ValidationInput,
        ) -> ValidationResult:
            return ValidationResult(validationResult=True, validationFailedReason={})

        @activity.defn(name="agent_toolPlanner")
        async def mock_agent_toolPlanner(input: ToolPromptInput) -> dict:
            return {"next": "done", "response": "Test response from LLM"}

        async with Worker(
            client,
            task_queue=task_queue_name,
            workflows=[AgentGoalWorkflow],
            activities=[
                mock_get_wf_env_vars,
                mock_agent_validatePrompt,
                mock_agent_toolPlanner,
            ],
        ):
            handle = await client.start_workflow(
                AgentGoalWorkflow.run,
                sample_combined_input,
                id=str(uuid.uuid4()),
                task_queue=task_queue_name,
            )

            # Send user prompt
            await handle.signal(
                AgentGoalWorkflow.user_prompt, "Hello, this is a test message"
            )

            # Wait for workflow to complete (it should end due to "done" next step)
            result = await handle.result()
            assert isinstance(result, str)

            # Verify the conversation includes our message
            import json

            try:
                conversation_history = json.loads(result.replace("'", '"'))
            except Exception:
                # Fallback to eval if json fails
                conversation_history = eval(result)

            messages = conversation_history["messages"]

            # Should have our user message and agent response
            user_messages = [msg for msg in messages if msg["actor"] == "user"]
            assert len(user_messages) > 0
            assert any(
                "Hello, this is a test message" in str(msg["response"])
                for msg in user_messages
            )
    async def test_confirm_signal(
        self, client: Client, sample_combined_input: CombinedInput
    ):
        """Test confirm signal handling for tool execution."""
        task_queue_name = str(uuid.uuid4())

        # Create mock activity functions with proper signatures
        @activity.defn(name="get_wf_env_vars")
        async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
            return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

        @activity.defn(name="agent_validatePrompt")
        async def mock_agent_validatePrompt(
            validation_input: ValidationInput,
        ) -> ValidationResult:
            return ValidationResult(validationResult=True, validationFailedReason={})

        @activity.defn(name="agent_toolPlanner")
        async def mock_agent_toolPlanner(input: ToolPromptInput) -> dict:
            return {
                "next": "confirm",
                "tool": "TestTool",
                "args": {"test_arg": "test_value"},
                "response": "Ready to execute tool",
            }

        @activity.defn(name="TestTool")
        async def mock_test_tool(args: dict) -> dict:
            return {"result": "Test tool executed successfully"}

        async with Worker(
            client,
            task_queue=task_queue_name,
            workflows=[AgentGoalWorkflow],
            activities=[
                mock_get_wf_env_vars,
                mock_agent_validatePrompt,
                mock_agent_toolPlanner,
                mock_test_tool,
            ],
        ):
            handle = await client.start_workflow(
                AgentGoalWorkflow.run,
                sample_combined_input,
                id=str(uuid.uuid4()),
                task_queue=task_queue_name,
            )

            # Send user prompt that will require confirmation
            await handle.signal(AgentGoalWorkflow.user_prompt, "Execute the test tool")

            # Query to check tool data is set
            import asyncio

            await asyncio.sleep(0.1)  # Give workflow time to process
            tool_data = await handle.query(AgentGoalWorkflow.get_latest_tool_data)
            if tool_data:
                assert tool_data.get("tool") == "TestTool"
                assert tool_data.get("next") == "confirm"

            # Send confirmation and end chat
            await handle.signal(AgentGoalWorkflow.confirm)
            await handle.signal(AgentGoalWorkflow.end_chat)
            result = await handle.result()
            assert isinstance(result, str)
    async def test_validation_failure(
        self, client: Client, sample_combined_input: CombinedInput
    ):
        """Test workflow handles validation failures correctly."""
        task_queue_name = str(uuid.uuid4())

        # Create mock activity functions with proper signatures
        @activity.defn(name="get_wf_env_vars")
        async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
            return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

        @activity.defn(name="agent_validatePrompt")
        async def mock_agent_validatePrompt(
            validation_input: ValidationInput,
        ) -> ValidationResult:
            return ValidationResult(
                validationResult=False,
                validationFailedReason={
                    "next": "question",
                    "response": "Your request doesn't make sense in this context",
                },
            )

        async with Worker(
            client,
            task_queue=task_queue_name,
            workflows=[AgentGoalWorkflow],
            activities=[mock_get_wf_env_vars, mock_agent_validatePrompt],
        ):
            handle = await client.start_workflow(
                AgentGoalWorkflow.run,
                sample_combined_input,
                id=str(uuid.uuid4()),
                task_queue=task_queue_name,
            )

            # Send invalid prompt
            await handle.signal(
                AgentGoalWorkflow.user_prompt, "Invalid nonsensical prompt"
            )

            # Give workflow time to process the prompt
            import asyncio

            await asyncio.sleep(0.2)

            # End workflow to check conversation
            await handle.signal(AgentGoalWorkflow.end_chat)
            result = await handle.result()

            # Verify validation failure message was added
            import json

            try:
                conversation_history = json.loads(result.replace("'", '"'))
            except Exception:
                # Fallback to eval if json fails
                conversation_history = eval(result)

            messages = conversation_history["messages"]

            # Should have validation failure response
            agent_messages = [msg for msg in messages if msg["actor"] == "agent"]
            assert len(agent_messages) > 0
            assert any(
                "doesn't make sense" in str(msg["response"]) for msg in agent_messages
            )
    async def test_conversation_summary_initialization(
        self, client: Client, sample_agent_goal
    ):
        """Test workflow initializes with conversation summary."""
        task_queue_name = str(uuid.uuid4())

        # Create input with conversation summary
        from collections import deque

        tool_params = AgentGoalWorkflowParams(
            conversation_summary="Previous conversation summary", prompt_queue=deque()
        )
        combined_input = CombinedInput(
            agent_goal=sample_agent_goal, tool_params=tool_params
        )

        # Create mock activity functions with proper signatures
        @activity.defn(name="get_wf_env_vars")
        async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
            return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

        async with Worker(
            client,
            task_queue=task_queue_name,
            workflows=[AgentGoalWorkflow],
            activities=[mock_get_wf_env_vars],
        ):
            handle = await client.start_workflow(
                AgentGoalWorkflow.run,
                combined_input,
                id=str(uuid.uuid4()),
                task_queue=task_queue_name,
            )

            # Give workflow time to initialize
            import asyncio

            await asyncio.sleep(0.1)

            # Query conversation summary
            summary = await handle.query(AgentGoalWorkflow.get_summary_from_history)
            assert summary == "Previous conversation summary"

            # Query conversation history - should include summary message
            conversation_history = await handle.query(
                AgentGoalWorkflow.get_conversation_history
            )
            messages = conversation_history["messages"]

            # Should have conversation_summary message
            summary_messages = [
                msg for msg in messages if msg["actor"] == "conversation_summary"
            ]
            assert len(summary_messages) == 1
            assert summary_messages[0]["response"] == "Previous conversation summary"

            # End workflow
            await handle.signal(AgentGoalWorkflow.end_chat)
            await handle.result()
    async def test_workflow_queries(
        self, client: Client, sample_combined_input: CombinedInput
    ):
        """Test all workflow query methods."""
        task_queue_name = str(uuid.uuid4())

        # Create mock activity functions with proper signatures
        @activity.defn(name="get_wf_env_vars")
        async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
            return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

        async with Worker(
            client,
            task_queue=task_queue_name,
            workflows=[AgentGoalWorkflow],
            activities=[mock_get_wf_env_vars],
        ):
            handle = await client.start_workflow(
                AgentGoalWorkflow.run,
                sample_combined_input,
                id=str(uuid.uuid4()),
                task_queue=task_queue_name,
            )

            # Give workflow time to initialize
            import asyncio

            await asyncio.sleep(0.1)

            # Test get_conversation_history query
            conversation_history = await handle.query(
                AgentGoalWorkflow.get_conversation_history
            )
            assert isinstance(conversation_history, dict)
            assert "messages" in conversation_history

            # Test get_agent_goal query
            agent_goal = await handle.query(AgentGoalWorkflow.get_agent_goal)
            assert agent_goal.id == sample_combined_input.agent_goal.id

            # Test get_summary_from_history query
            summary = await handle.query(AgentGoalWorkflow.get_summary_from_history)
            # Summary might be None if not set, so check for that
            if sample_combined_input.tool_params.conversation_summary:
                assert summary == sample_combined_input.tool_params.conversation_summary
            else:
                assert summary is None

            # Test get_latest_tool_data query (should be None initially)
            tool_data = await handle.query(AgentGoalWorkflow.get_latest_tool_data)
            assert tool_data is None

            # End workflow
            await handle.signal(AgentGoalWorkflow.end_chat)
            await handle.result()
    async def test_enable_disable_debugging_confirm_signals(
        self, client: Client, sample_combined_input: CombinedInput
    ):
        """Test debugging confirm enable/disable signals."""
        task_queue_name = str(uuid.uuid4())

        # Create mock activity functions with proper signatures
        @activity.defn(name="get_wf_env_vars")
        async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
            return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

        async with Worker(
            client,
            task_queue=task_queue_name,
            workflows=[AgentGoalWorkflow],
            activities=[mock_get_wf_env_vars],
        ):
            handle = await client.start_workflow(
                AgentGoalWorkflow.run,
                sample_combined_input,
                id=str(uuid.uuid4()),
                task_queue=task_queue_name,
            )

            # Test enable debugging confirm signal
            await handle.signal(AgentGoalWorkflow.enable_debugging_confirm)

            # Test disable debugging confirm signal
            await handle.signal(AgentGoalWorkflow.disable_debugging_confirm)

            # End workflow
            await handle.signal(AgentGoalWorkflow.end_chat)
            result = await handle.result()
            assert isinstance(result, str)
    async def test_workflow_with_empty_prompt_queue(
        self, client: Client, sample_agent_goal
    ):
        """Test workflow behavior with empty prompt queue."""
        task_queue_name = str(uuid.uuid4())

        # Create input with empty prompt queue
        from collections import deque

        tool_params = AgentGoalWorkflowParams(
            conversation_summary=None, prompt_queue=deque()
        )
        combined_input = CombinedInput(
            agent_goal=sample_agent_goal, tool_params=tool_params
        )

        # Create mock activity functions with proper signatures
        @activity.defn(name="get_wf_env_vars")
        async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
            return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

        async with Worker(
            client,
            task_queue=task_queue_name,
            workflows=[AgentGoalWorkflow],
            activities=[mock_get_wf_env_vars],
        ):
            handle = await client.start_workflow(
                AgentGoalWorkflow.run,
                combined_input,
                id=str(uuid.uuid4()),
                task_queue=task_queue_name,
            )

            # Give workflow time to initialize
            import asyncio

            await asyncio.sleep(0.1)

            # Query initial state
            conversation_history = await handle.query(
                AgentGoalWorkflow.get_conversation_history
            )
            assert isinstance(conversation_history, dict)
            assert "messages" in conversation_history

            # Should have no messages initially (empty prompt queue, no summary)
            messages = conversation_history["messages"]
            assert len(messages) == 0

            # End workflow
            await handle.signal(AgentGoalWorkflow.end_chat)
            result = await handle.result()
            assert isinstance(result, str)
    async def test_multiple_user_prompts(
        self, client: Client, sample_combined_input: CombinedInput
    ):
        """Test workflow handling multiple user prompts in sequence."""
        task_queue_name = str(uuid.uuid4())

        # Create mock activity functions with proper signatures
        @activity.defn(name="get_wf_env_vars")
        async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
            return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

        @activity.defn(name="agent_validatePrompt")
        async def mock_agent_validatePrompt(
            validation_input: ValidationInput,
        ) -> ValidationResult:
            return ValidationResult(validationResult=True, validationFailedReason={})

        @activity.defn(name="agent_toolPlanner")
        async def mock_agent_toolPlanner(input: ToolPromptInput) -> dict:
            # Keep workflow running for multiple prompts
            return {"next": "question", "response": f"Processed: {input.prompt}"}

        async with Worker(
            client,
            task_queue=task_queue_name,
            workflows=[AgentGoalWorkflow],
            activities=[
                mock_get_wf_env_vars,
                mock_agent_validatePrompt,
                mock_agent_toolPlanner,
            ],
        ):
            handle = await client.start_workflow(
                AgentGoalWorkflow.run,
                sample_combined_input,
                id=str(uuid.uuid4()),
                task_queue=task_queue_name,
            )

            # Send multiple prompts
            await handle.signal(AgentGoalWorkflow.user_prompt, "First message")
            import asyncio

            await asyncio.sleep(0.1)
            await handle.signal(AgentGoalWorkflow.user_prompt, "Second message")
            await asyncio.sleep(0.1)
            await handle.signal(AgentGoalWorkflow.user_prompt, "Third message")
            await asyncio.sleep(0.1)

            # End workflow
            await handle.signal(AgentGoalWorkflow.end_chat)
            result = await handle.result()
            assert isinstance(result, str)

            # Parse result and verify multiple messages
            import json

            try:
                conversation_history = json.loads(result.replace("'", '"'))
            except Exception:
                conversation_history = eval(result)

            messages = conversation_history["messages"]

            # Should have at least one user message (timing dependent)
            user_messages = [msg for msg in messages if msg["actor"] == "user"]
            assert len(user_messages) >= 1

            # Verify at least the first message was processed
            message_texts = [str(msg["response"]) for msg in user_messages]
            assert any("First message" in text for text in message_texts)


@@ -0,0 +1,418 @@
import asyncio
import uuid
from collections import deque
from typing import Sequence
from unittest.mock import patch

import pytest
from temporalio import activity
from temporalio.client import Client
from temporalio.common import RawValue
from temporalio.testing import ActivityEnvironment
from temporalio.worker import Worker

from activities.tool_activities import _convert_args_types, mcp_list_tools
from models.data_types import (
    AgentGoalWorkflowParams,
    CombinedInput,
    EnvLookupInput,
    EnvLookupOutput,
    ToolPromptInput,
    ValidationInput,
    ValidationResult,
)
from models.tool_definitions import AgentGoal, MCPServerDefinition, ToolDefinition
from workflows.agent_goal_workflow import AgentGoalWorkflow
from workflows.workflow_helpers import is_mcp_tool


class DummySession:
    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        pass

    async def initialize(self):
        pass

    async def list_tools(self):
        class Tool:
            def __init__(self, name):
                self.name = name
                self.description = f"desc {name}"
                self.inputSchema = {}

        return type(
            "Resp", (), {"tools": [Tool("list_products"), Tool("create_customer")]}
        )()
def test_convert_args_types_basic():
    args = {
        "count": "5",
        "price": "12.5",
        "flag_true": "true",
        "flag_false": "false",
        "name": "pizza",
        "already_int": 2,
    }
    result = _convert_args_types(args)

    assert result["count"] == 5 and isinstance(result["count"], int)
    assert result["price"] == 12.5 and isinstance(result["price"], float)
    assert result["flag_true"] is True
    assert result["flag_false"] is False
    assert result["name"] == "pizza"
    assert result["already_int"] == 2
def test_is_mcp_tool_identification():
    server_def = MCPServerDefinition(name="test", command="python", args=["server.py"])
    goal = AgentGoal(
        id="g",
        category_tag="food",
        agent_name="agent",
        agent_friendly_description="",
        description="",
        tools=[ToolDefinition(name="AddToCart", description="", arguments=[])],
        starter_prompt="",
        example_conversation_history="",
        mcp_server_definition=server_def,
    )

    assert is_mcp_tool("list_products", goal) is True
    assert is_mcp_tool("AddToCart", goal) is False

    no_mcp_goal = AgentGoal(
        id="g2",
        category_tag="food",
        agent_name="agent",
        agent_friendly_description="",
        description="",
        tools=[],
        starter_prompt="",
        example_conversation_history="",
        mcp_server_definition=None,
    )
    assert is_mcp_tool("list_products", no_mcp_goal) is False
@pytest.mark.asyncio
async def test_mcp_list_tools_success():
    server_def = MCPServerDefinition(name="test", command="python", args=["server.py"])

    from contextlib import asynccontextmanager

    @asynccontextmanager
    async def dummy_connection(command, args, env):
        yield None, None

    with patch(
        "activities.tool_activities._build_connection", return_value={"type": "stdio"}
    ), patch("activities.tool_activities._stdio_connection", dummy_connection), patch(
        "activities.tool_activities.ClientSession", lambda r, w: DummySession()
    ):
        activity_env = ActivityEnvironment()
        result = await activity_env.run(mcp_list_tools, server_def, ["list_products"])

    assert result["success"] is True
    assert result["filtered_count"] == 1
    assert "list_products" in result["tools"]
@pytest.mark.asyncio
async def test_mcp_list_tools_failure():
    server_def = MCPServerDefinition(name="test", command="python", args=["server.py"])

    from contextlib import asynccontextmanager

    @asynccontextmanager
    async def failing_connection(*args, **kwargs):
        raise RuntimeError("conn fail")
        yield None, None

    with patch(
        "activities.tool_activities._build_connection", return_value={"type": "stdio"}
    ), patch("activities.tool_activities._stdio_connection", failing_connection):
        activity_env = ActivityEnvironment()
        result = await activity_env.run(mcp_list_tools, server_def)

    assert result["success"] is False
    assert "conn fail" in result["error"]
@pytest.mark.asyncio
async def test_workflow_loads_mcp_tools_dynamically(client: Client):
    """Workflow should load MCP tools and add them to the goal."""
    task_queue_name = str(uuid.uuid4())
    server_def = MCPServerDefinition(name="test", command="python", args=["srv.py"])
    goal = AgentGoal(
        id="g_mcp",
        category_tag="food",
        agent_name="agent",
        agent_friendly_description="",
        description="",
        tools=[],
        starter_prompt="",
        example_conversation_history="",
        mcp_server_definition=server_def,
    )
    combined_input = CombinedInput(
        agent_goal=goal,
        tool_params=AgentGoalWorkflowParams(
            conversation_summary=None, prompt_queue=deque()
        ),
    )

    @activity.defn(name="get_wf_env_vars")
    async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
        return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

    @activity.defn(name="mcp_list_tools")
    async def mock_mcp_list_tools(
        server_definition: MCPServerDefinition, include_tools=None
    ):
        return {
            "server_name": server_definition.name,
            "success": True,
            "tools": {
                "list_products": {
                    "name": "list_products",
                    "description": "",
                    "inputSchema": {},
                },
            },
            "total_available": 1,
            "filtered_count": 1,
        }

    async with Worker(
        client,
        task_queue=task_queue_name,
        workflows=[AgentGoalWorkflow],
        activities=[mock_get_wf_env_vars, mock_mcp_list_tools],
    ):
        handle = await client.start_workflow(
            AgentGoalWorkflow.run,
            combined_input,
            id=str(uuid.uuid4()),
            task_queue=task_queue_name,
        )

        # Wait until the MCP tools have been added
        for _ in range(10):
            updated_goal = await handle.query(AgentGoalWorkflow.get_agent_goal)
            if any(t.name == "list_products" for t in updated_goal.tools):
                break
            await asyncio.sleep(0.1)
        else:
            updated_goal = await handle.query(AgentGoalWorkflow.get_agent_goal)

        assert any(t.name == "list_products" for t in updated_goal.tools)

        await handle.signal(AgentGoalWorkflow.end_chat)
        await handle.result()
@pytest.mark.asyncio
async def test_mcp_tool_execution_flow(client: Client):
    """MCP tool execution should pass server_definition to activity."""
    task_queue_name = str(uuid.uuid4())
    server_def = MCPServerDefinition(name="test", command="python", args=["srv.py"])
    goal = AgentGoal(
        id="g_mcp_exec",
        category_tag="food",
        agent_name="agent",
        agent_friendly_description="",
        description="",
        tools=[],
        starter_prompt="",
        example_conversation_history="",
        mcp_server_definition=server_def,
    )
    combined_input = CombinedInput(
        agent_goal=goal,
        tool_params=AgentGoalWorkflowParams(
            conversation_summary=None, prompt_queue=deque()
        ),
    )
    captured: dict = {}

    @activity.defn(name="get_wf_env_vars")
    async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
        return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)

    @activity.defn(name="agent_validatePrompt")
    async def mock_validate(prompt: ValidationInput) -> ValidationResult:
        return ValidationResult(validationResult=True, validationFailedReason={})

    @activity.defn(name="agent_toolPlanner")
    async def mock_planner(input: ToolPromptInput) -> dict:
        if "planner_called" not in captured:
            captured["planner_called"] = True
            return {
                "next": "confirm",
                "tool": "list_products",
                "args": {"limit": "5"},
                "response": "Listing products",
            }
        return {"next": "done", "response": "done"}

    @activity.defn(name="mcp_list_tools")
    async def mock_mcp_list_tools(
        server_definition: MCPServerDefinition, include_tools=None
    ):
        return {
            "server_name": server_definition.name,
            "success": True,
            "tools": {
                "list_products": {
                    "name": "list_products",
                    "description": "",
                    "inputSchema": {},
                },
            },
            "total_available": 1,
            "filtered_count": 1,
        }

    @activity.defn(name="dynamic_tool_activity", dynamic=True)
    async def mock_dynamic_tool_activity(args: Sequence[RawValue]) -> dict:
        payload = activity.payload_converter().from_payload(args[0].payload, dict)
        captured["dynamic_args"] = payload
        return {"tool": "list_products", "success": True, "content": {"ok": True}}

    async with Worker(
        client,
        task_queue=task_queue_name,
        workflows=[AgentGoalWorkflow],
        activities=[
            mock_get_wf_env_vars,
            mock_validate,
            mock_planner,
            mock_mcp_list_tools,
            mock_dynamic_tool_activity,
        ],
    ):
        handle = await client.start_workflow(
            AgentGoalWorkflow.run,
            combined_input,
            id=str(uuid.uuid4()),
            task_queue=task_queue_name,
        )

        await handle.signal(AgentGoalWorkflow.user_prompt, "show menu")
        await asyncio.sleep(0.5)
        await handle.signal(AgentGoalWorkflow.confirm)
        # Give workflow time to execute the MCP tool and finish
        await asyncio.sleep(0.5)
        result = await handle.result()
        print(result)

    assert "dynamic_args" in captured
    assert "server_definition" in captured["dynamic_args"]
assert captured["dynamic_args"]["server_definition"]["name"] == server_def.name
@pytest.mark.asyncio
async def test_mcp_tool_failure_recorded(client: Client):
"""Failure of an MCP tool should be recorded in conversation history."""
task_queue_name = str(uuid.uuid4())
server_def = MCPServerDefinition(name="test", command="python", args=["srv.py"])
goal = AgentGoal(
id="g_mcp_fail",
category_tag="food",
agent_name="agent",
agent_friendly_description="",
description="",
tools=[],
starter_prompt="",
example_conversation_history="",
mcp_server_definition=server_def,
)
combined_input = CombinedInput(
agent_goal=goal,
tool_params=AgentGoalWorkflowParams(
conversation_summary=None, prompt_queue=deque()
),
)
@activity.defn(name="get_wf_env_vars")
async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)
@activity.defn(name="agent_validatePrompt")
async def mock_validate(prompt: ValidationInput) -> ValidationResult:
return ValidationResult(validationResult=True, validationFailedReason={})
@activity.defn(name="agent_toolPlanner")
async def mock_planner(input: ToolPromptInput) -> dict:
return {
"next": "confirm",
"tool": "list_products",
"args": {},
"response": "Listing products",
}
@activity.defn(name="mcp_list_tools")
async def mock_mcp_list_tools(
server_definition: MCPServerDefinition, include_tools=None
):
return {
"server_name": server_definition.name,
"success": True,
"tools": {
"list_products": {
"name": "list_products",
"description": "",
"inputSchema": {},
},
},
"total_available": 1,
"filtered_count": 1,
}
@activity.defn(name="dynamic_tool_activity", dynamic=True)
async def failing_dynamic_tool(args: Sequence[RawValue]) -> dict:
return {
"tool": "list_products",
"success": False,
"error": "Connection timed out",
}
async with Worker(
client,
task_queue=task_queue_name,
workflows=[AgentGoalWorkflow],
activities=[
mock_get_wf_env_vars,
mock_validate,
mock_planner,
mock_mcp_list_tools,
failing_dynamic_tool,
],
):
handle = await client.start_workflow(
AgentGoalWorkflow.run,
combined_input,
id=str(uuid.uuid4()),
task_queue=task_queue_name,
)
await handle.signal(AgentGoalWorkflow.user_prompt, "show menu")
await asyncio.sleep(0.5)
await handle.signal(AgentGoalWorkflow.confirm)
# Give workflow time to record the failure result
await asyncio.sleep(0.5)
await handle.signal(AgentGoalWorkflow.end_chat)
result = await handle.result()
import ast
import json
try:
history = json.loads(result.replace("'", '"'))
except Exception:
# Fall back to a literal-only parse of the Python repr instead of eval()
history = ast.literal_eval(result)
assert any(
msg["actor"] == "tool_result" and not msg["response"].get("success", True)
for msg in history["messages"]
)
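The "Wait until the MCP tools have been added" loop and the fixed `asyncio.sleep(0.5)` pauses in these tests both wait for workflow state to change. A minimal sketch of a reusable polling helper follows. The name `poll_until` and its parameters are hypothetical and not part of this repository; `fetch` stands in for a call such as `handle.query(AgentGoalWorkflow.get_agent_goal)`.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def poll_until(
    fetch: Callable[[], Awaitable[T]],
    predicate: Callable[[T], bool],
    attempts: int = 10,
    delay: float = 0.1,
) -> T:
    """Call fetch() up to `attempts` times and return the last value seen.

    Stops early as soon as predicate(value) is true. The caller still
    asserts on the returned value, as the original test does.
    """
    value = await fetch()
    for _ in range(attempts - 1):
        if predicate(value):
            break
        await asyncio.sleep(delay)
        value = await fetch()
    return value
```

In a test this might replace the `for ... else` loop with a single awaited call followed by the existing `assert`; whether that reads better is a matter of taste.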


@@ -0,0 +1,578 @@
import json
import os
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from temporalio.client import Client
from temporalio.testing import ActivityEnvironment
from activities.tool_activities import (
MCPServerDefinition,
ToolActivities,
dynamic_tool_activity,
)
from models.data_types import (
EnvLookupInput,
EnvLookupOutput,
ToolPromptInput,
ValidationInput,
ValidationResult,
)
class TestToolActivities:
"""Test cases for ToolActivities."""
def setup_method(self):
"""Set up test environment for each test."""
self.tool_activities = ToolActivities()
@pytest.mark.asyncio
async def test_agent_validatePrompt_valid_prompt(
self, sample_agent_goal, sample_conversation_history
):
"""Test agent_validatePrompt with a valid prompt."""
validation_input = ValidationInput(
prompt="I need help with the test tool",
conversation_history=sample_conversation_history,
agent_goal=sample_agent_goal,
)
# Mock the agent_toolPlanner to return a valid response
mock_response = {"validationResult": True, "validationFailedReason": {}}
with patch.object(
self.tool_activities, "agent_toolPlanner", new_callable=AsyncMock
) as mock_planner:
mock_planner.return_value = mock_response
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.agent_validatePrompt, validation_input
)
assert isinstance(result, ValidationResult)
assert result.validationResult is True
assert result.validationFailedReason == {}
# Verify the mock was called with correct parameters
mock_planner.assert_called_once()
@pytest.mark.asyncio
async def test_agent_validatePrompt_invalid_prompt(
self, sample_agent_goal, sample_conversation_history
):
"""Test agent_validatePrompt with an invalid prompt."""
validation_input = ValidationInput(
prompt="asdfghjkl nonsense",
conversation_history=sample_conversation_history,
agent_goal=sample_agent_goal,
)
# Mock the agent_toolPlanner to return an invalid response
mock_response = {
"validationResult": False,
"validationFailedReason": {
"next": "question",
"response": "Your request doesn't make sense in this context",
},
}
with patch.object(
self.tool_activities, "agent_toolPlanner", new_callable=AsyncMock
) as mock_planner:
mock_planner.return_value = mock_response
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.agent_validatePrompt, validation_input
)
assert isinstance(result, ValidationResult)
assert result.validationResult is False
assert "doesn't make sense" in str(result.validationFailedReason)
@pytest.mark.asyncio
async def test_agent_toolPlanner_success(self):
"""Test agent_toolPlanner with successful LLM response."""
prompt_input = ToolPromptInput(
prompt="Test prompt", context_instructions="Test context instructions"
)
# Mock the completion function
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[
0
].message.content = (
'{"next": "confirm", "tool": "TestTool", "response": "Test response"}'
)
with patch("activities.tool_activities.completion") as mock_completion:
mock_completion.return_value = mock_response
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.agent_toolPlanner, prompt_input
)
assert isinstance(result, dict)
assert result["next"] == "confirm"
assert result["tool"] == "TestTool"
assert result["response"] == "Test response"
# Verify completion was called with correct parameters
mock_completion.assert_called_once()
call_args = mock_completion.call_args[1]
assert call_args["model"] == self.tool_activities.llm_model
assert len(call_args["messages"]) == 2
assert call_args["messages"][0]["role"] == "system"
assert call_args["messages"][1]["role"] == "user"
@pytest.mark.asyncio
async def test_agent_toolPlanner_with_custom_base_url(self):
"""Test agent_toolPlanner with custom base URL configuration."""
# Set up tool activities with custom base URL
with patch.dict(os.environ, {"LLM_BASE_URL": "https://custom.endpoint.com"}):
tool_activities = ToolActivities()
prompt_input = ToolPromptInput(
prompt="Test prompt", context_instructions="Test context instructions"
)
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[
0
].message.content = '{"next": "done", "response": "Test"}'
with patch("activities.tool_activities.completion") as mock_completion:
mock_completion.return_value = mock_response
activity_env = ActivityEnvironment()
await activity_env.run(tool_activities.agent_toolPlanner, prompt_input)
# Verify base_url was included in the call
call_args = mock_completion.call_args[1]
assert "base_url" in call_args
assert call_args["base_url"] == "https://custom.endpoint.com"
@pytest.mark.asyncio
async def test_agent_toolPlanner_json_parsing_error(self):
"""Test agent_toolPlanner handles JSON parsing errors."""
prompt_input = ToolPromptInput(
prompt="Test prompt", context_instructions="Test context instructions"
)
# Mock the completion function to return invalid JSON
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[0].message.content = "Invalid JSON response"
with patch("activities.tool_activities.completion") as mock_completion:
mock_completion.return_value = mock_response
activity_env = ActivityEnvironment()
with pytest.raises(Exception): # Should raise JSON parsing error
await activity_env.run(
self.tool_activities.agent_toolPlanner, prompt_input
)
@pytest.mark.asyncio
async def test_get_wf_env_vars_default_values(self):
"""Test get_wf_env_vars with default values."""
env_input = EnvLookupInput(
show_confirm_env_var_name="SHOW_CONFIRM", show_confirm_default=True
)
# Clear environment variables
with patch.dict(os.environ, {}, clear=True):
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.get_wf_env_vars, env_input
)
assert isinstance(result, EnvLookupOutput)
assert result.show_confirm is True # default value
assert result.multi_goal_mode is False # default value (single agent mode)
@pytest.mark.asyncio
async def test_get_wf_env_vars_custom_values(self):
"""Test get_wf_env_vars with custom environment values."""
env_input = EnvLookupInput(
show_confirm_env_var_name="SHOW_CONFIRM", show_confirm_default=True
)
# Set environment variables
with patch.dict(
os.environ, {"SHOW_CONFIRM": "false", "AGENT_GOAL": "specific_goal"}
):
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.get_wf_env_vars, env_input
)
assert isinstance(result, EnvLookupOutput)
assert result.show_confirm is False # from env var
assert result.multi_goal_mode is False # from env var
def test_sanitize_json_response(self):
"""Test JSON response sanitization."""
# Test with markdown code blocks
response_with_markdown = '```json\n{"test": "value"}\n```'
sanitized = self.tool_activities.sanitize_json_response(response_with_markdown)
assert sanitized == '{"test": "value"}'
# Test with extra whitespace
response_with_whitespace = ' \n{"test": "value"} \n'
sanitized = self.tool_activities.sanitize_json_response(
response_with_whitespace
)
assert sanitized == '{"test": "value"}'
def test_parse_json_response_success(self):
"""Test successful JSON parsing."""
json_string = '{"next": "confirm", "tool": "TestTool"}'
result = self.tool_activities.parse_json_response(json_string)
assert isinstance(result, dict)
assert result["next"] == "confirm"
assert result["tool"] == "TestTool"
def test_parse_json_response_failure(self):
"""Test JSON parsing with invalid JSON."""
invalid_json = "Not valid JSON"
with pytest.raises(Exception): # Should raise JSON parsing error
self.tool_activities.parse_json_response(invalid_json)
class TestDynamicToolActivity:
"""Test cases for dynamic_tool_activity."""
@pytest.mark.asyncio
async def test_dynamic_tool_activity_sync_handler(self):
"""Test dynamic tool activity with synchronous handler."""
# Mock the activity info and payload converter
mock_info = MagicMock()
mock_info.activity_type = "TestTool"
mock_payload_converter = MagicMock()
mock_payload = MagicMock()
mock_payload.payload = b'{"test_arg": "test_value"}'
mock_payload_converter.from_payload.return_value = {"test_arg": "test_value"}
# Mock the handler function
def mock_handler(args):
return {"result": f"Handled {args['test_arg']}"}
with patch("temporalio.activity.info", return_value=mock_info), patch(
"temporalio.activity.payload_converter", return_value=mock_payload_converter
), patch("tools.get_handler", return_value=mock_handler):
activity_env = ActivityEnvironment()
result = await activity_env.run(dynamic_tool_activity, [mock_payload])
assert isinstance(result, dict)
assert result["result"] == "Handled test_value"
@pytest.mark.asyncio
async def test_dynamic_tool_activity_async_handler(self):
"""Test dynamic tool activity with asynchronous handler."""
# Mock the activity info and payload converter
mock_info = MagicMock()
mock_info.activity_type = "AsyncTestTool"
mock_payload_converter = MagicMock()
mock_payload = MagicMock()
mock_payload.payload = b'{"test_arg": "async_test"}'
mock_payload_converter.from_payload.return_value = {"test_arg": "async_test"}
# Mock the async handler function
async def mock_async_handler(args):
return {"async_result": f"Async handled {args['test_arg']}"}
with patch("temporalio.activity.info", return_value=mock_info), patch(
"temporalio.activity.payload_converter", return_value=mock_payload_converter
), patch("tools.get_handler", return_value=mock_async_handler):
activity_env = ActivityEnvironment()
result = await activity_env.run(dynamic_tool_activity, [mock_payload])
assert isinstance(result, dict)
assert result["async_result"] == "Async handled async_test"
class TestToolActivitiesIntegration:
"""Integration tests for ToolActivities in a real Temporal environment."""
@pytest.mark.asyncio
async def test_activities_in_worker(self, client: Client):
"""Test activities can be registered and executed in a worker."""
# task_queue_name = str(uuid.uuid4())
tool_activities = ToolActivities()
# Test get_wf_env_vars activity using ActivityEnvironment
env_input = EnvLookupInput(
show_confirm_env_var_name="TEST_CONFIRM", show_confirm_default=False
)
activity_env = ActivityEnvironment()
result = await activity_env.run(tool_activities.get_wf_env_vars, env_input)
assert isinstance(result, EnvLookupOutput)
assert isinstance(result.show_confirm, bool)
assert isinstance(result.multi_goal_mode, bool)
class TestEdgeCases:
"""Test edge cases and error handling."""
def setup_method(self):
"""Set up test environment for each test."""
self.tool_activities = ToolActivities()
@pytest.mark.asyncio
async def test_agent_validatePrompt_with_empty_conversation_history(
self, sample_agent_goal
):
"""Test validation with empty conversation history."""
validation_input = ValidationInput(
prompt="Test prompt",
conversation_history={"messages": []},
agent_goal=sample_agent_goal,
)
mock_response = {"validationResult": True, "validationFailedReason": {}}
with patch.object(
self.tool_activities, "agent_toolPlanner", new_callable=AsyncMock
) as mock_planner:
mock_planner.return_value = mock_response
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.agent_validatePrompt, validation_input
)
assert isinstance(result, ValidationResult)
assert result.validationResult
assert result.validationFailedReason == {}
@pytest.mark.asyncio
async def test_agent_toolPlanner_with_long_prompt(self):
"""Test toolPlanner with very long prompt."""
long_prompt = "This is a very long prompt " * 100
tool_prompt_input = ToolPromptInput(
prompt=long_prompt, context_instructions="Test context instructions"
)
# Mock the completion response
mock_response = MagicMock()
mock_response.choices = [MagicMock()]
mock_response.choices[
0
].message.content = '{"next": "done", "response": "Processed long prompt"}'
with patch("activities.tool_activities.completion", return_value=mock_response):
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.agent_toolPlanner, tool_prompt_input
)
assert isinstance(result, dict)
assert result["next"] == "done"
assert "Processed long prompt" in result["response"]
@pytest.mark.asyncio
async def test_sanitize_json_with_various_formats(self):
"""Test JSON sanitization with various input formats."""
# Test markdown code blocks
markdown_json = '```json\n{"test": "value"}\n```'
result = self.tool_activities.sanitize_json_response(markdown_json)
assert result == '{"test": "value"}'
# Test with extra whitespace
whitespace_json = ' \n {"test": "value"} \n '
result = self.tool_activities.sanitize_json_response(whitespace_json)
assert result == '{"test": "value"}'
# Test already clean JSON
clean_json = '{"test": "value"}'
result = self.tool_activities.sanitize_json_response(clean_json)
assert result == '{"test": "value"}'
@pytest.mark.asyncio
async def test_parse_json_response_with_invalid_json(self):
"""Test JSON parsing with invalid JSON."""
with pytest.raises(json.JSONDecodeError):
self.tool_activities.parse_json_response("Invalid JSON {test: value")
@pytest.mark.asyncio
async def test_get_wf_env_vars_with_various_env_values(self):
"""Test environment variable parsing with different values."""
# Test with "true" string
with patch.dict(os.environ, {"TEST_CONFIRM": "true"}):
env_input = EnvLookupInput(
show_confirm_env_var_name="TEST_CONFIRM", show_confirm_default=False
)
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.get_wf_env_vars, env_input
)
assert result.show_confirm
# Test with "false" string
with patch.dict(os.environ, {"TEST_CONFIRM": "false"}):
env_input = EnvLookupInput(
show_confirm_env_var_name="TEST_CONFIRM", show_confirm_default=True
)
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.get_wf_env_vars, env_input
)
assert not result.show_confirm
# Test with missing env var (should use default)
with patch.dict(os.environ, {}, clear=True):
env_input = EnvLookupInput(
show_confirm_env_var_name="MISSING_VAR", show_confirm_default=True
)
activity_env = ActivityEnvironment()
result = await activity_env.run(
self.tool_activities.get_wf_env_vars, env_input
)
assert result.show_confirm
class TestMCPIntegration:
@pytest.mark.asyncio
async def test_convert_args_types(self):
from activities.tool_activities import _convert_args_types
args = {
"int_val": "123",
"float_val": "123.45",
"bool_true": "true",
"bool_false": "False",
"string": "text",
"other": 5,
}
converted = _convert_args_types(args)
assert converted["int_val"] == 123
assert converted["float_val"] == 123.45
assert converted["bool_true"] is True
assert converted["bool_false"] is False
assert converted["string"] == "text"
assert converted["other"] == 5
@pytest.mark.asyncio
async def test_dynamic_tool_activity_mcp_call(self):
mcp_def = MCPServerDefinition(
name="stripe", command="python", args=["server.py"]
)
payload = MagicMock()
payload.payload = b'{"server_definition": null, "amount": "10", "flag": "true"}'
mock_info = MagicMock()
mock_info.activity_type = "list_products"
from contextlib import asynccontextmanager
@asynccontextmanager
async def dummy_conn(*args, **kwargs):
yield (None, None)
class DummySession:
async def __aenter__(self):
return self
async def __aexit__(self, exc_type, exc, tb):
pass
async def initialize(self):
pass
async def call_tool(self, tool_name, arguments=None):
self.called_tool = tool_name
self.called_args = arguments
return MagicMock(content="ok")
mock_payload_converter = MagicMock()
mock_payload_converter.from_payload.return_value = {
"server_definition": mcp_def,
"amount": "10",
"flag": "true",
}
with patch("activities.tool_activities._stdio_connection", dummy_conn), patch(
"activities.tool_activities.ClientSession", return_value=DummySession()
), patch(
"activities.tool_activities._build_connection",
return_value={
"type": "stdio",
"command": "python",
"args": ["server.py"],
"env": {},
},
), patch(
"temporalio.activity.info", return_value=mock_info
), patch(
"temporalio.activity.payload_converter", return_value=mock_payload_converter
):
result = await ActivityEnvironment().run(dynamic_tool_activity, [payload])
assert result["success"] is True
assert result["tool"] == "list_products"
@pytest.mark.asyncio
async def test_mcp_tool_activity_failure(self):
tool_activities = ToolActivities()
mcp_def = MCPServerDefinition(
name="stripe", command="python", args=["server.py"]
)
async def dummy_conn(*args, **kwargs):
from contextlib import asynccontextmanager
@asynccontextmanager
async def cm():
yield (None, None)
return cm()
class DummySession:
async def __aenter__(self):
return self
async def __aexit__(self, exc_type, exc, tb):
pass
async def initialize(self):
pass
async def call_tool(self, tool_name, arguments=None):
raise TypeError("boom")
with patch("activities.tool_activities._stdio_connection", dummy_conn), patch(
"activities.tool_activities.ClientSession", return_value=DummySession()
), patch(
"activities.tool_activities._build_connection",
return_value={
"type": "stdio",
"command": "python",
"args": ["server.py"],
"env": {},
},
):
result = await ActivityEnvironment().run(
tool_activities.mcp_tool_activity,
"list_products",
{"server_definition": mcp_def, "amount": "10"},
)
assert result["success"] is False
assert result["error_type"] == "TypeError"
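The `test_convert_args_types` case above pins down the observable behavior of `_convert_args_types`: numeric and boolean strings are coerced, everything else passes through. The real implementation is not shown in this diff, so the following is only a sketch that satisfies the same assertions; the function name `convert_args_types_sketch` is made up for illustration.

```python
from typing import Any, Dict


def convert_args_types_sketch(args: Dict[str, Any]) -> Dict[str, Any]:
    """Coerce string values to bool, int, or float where they parse cleanly."""
    converted: Dict[str, Any] = {}
    for key, value in args.items():
        if not isinstance(value, str):
            converted[key] = value  # non-strings pass through untouched
            continue
        lowered = value.strip().lower()
        if lowered in ("true", "false"):
            converted[key] = lowered == "true"
            continue
        try:
            converted[key] = int(value)
        except ValueError:
            try:
                converted[key] = float(value)
            except ValueError:
                converted[key] = value  # leave ordinary text as-is
    return converted
```

One caveat of this sketch: `float()` also accepts strings such as `"nan"` and `"inf"`, so a production version may want a stricter numeric check.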


@@ -0,0 +1,36 @@
import pytest
from models.tool_definitions import (
AgentGoal,
MCPServerDefinition,
ToolArgument,
ToolDefinition,
)
from workflows.workflow_helpers import is_mcp_tool
def make_goal(with_mcp: bool) -> AgentGoal:
tools = [ToolDefinition(name="AddToCart", description="", arguments=[])]
mcp_def = None
if with_mcp:
mcp_def = MCPServerDefinition(
name="stripe", command="python", args=["server.py"]
)
return AgentGoal(
id="g",
category_tag="test",
agent_name="Test",
agent_friendly_description="",
tools=tools,
mcp_server_definition=mcp_def,
)
def test_is_mcp_tool_recognizes_native():
goal = make_goal(True)
assert not is_mcp_tool("AddToCart", goal)
def test_is_mcp_tool_recognizes_mcp():
goal = make_goal(True)
assert is_mcp_tool("list_products", goal)
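The two tests above only fix the outcome of `is_mcp_tool`: a natively registered tool is not an MCP tool, while an unknown tool on a goal with an MCP server definition is. The body of `workflows.workflow_helpers.is_mcp_tool` does not appear in this diff, so the sketch below is an assumption about one way to get that outcome. The stub classes and the explicit `native_tool_names` parameter are hypothetical stand-ins, not the repository's API.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ServerStub:
    """Stand-in for MCPServerDefinition (hypothetical, for illustration)."""

    name: str


@dataclass
class GoalStub:
    """Stand-in for AgentGoal with only the field this check needs."""

    mcp_server_definition: Optional[ServerStub] = None


def is_mcp_tool_sketch(
    tool_name: str, goal: GoalStub, native_tool_names: Set[str]
) -> bool:
    """Treat a tool as MCP-backed when the goal has a server and the
    tool is not one of the natively registered handlers."""
    if goal.mcp_server_definition is None:
        return False
    return tool_name not in native_tool_names
```

Passing the native names in explicitly keeps the sketch self-contained; the real helper may consult the tool registry or the goal itself instead.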


@@ -0,0 +1,134 @@
import concurrent.futures
import uuid
from contextlib import contextmanager
from typing import Any, Dict, List, Optional, Sequence
from temporalio import activity
from temporalio.client import Client, WorkflowExecutionStatus
from temporalio.common import RawValue
from temporalio.worker import Worker
from api.main import get_initial_agent_goal
from models.data_types import (
AgentGoalWorkflowParams,
CombinedInput,
EnvLookupInput,
EnvLookupOutput,
ToolPromptInput,
ValidationInput,
ValidationResult,
)
from models.tool_definitions import MCPServerDefinition
from workflows.agent_goal_workflow import AgentGoalWorkflow
@contextmanager
def my_context():
print("Setup")
yield "some_value" # Value assigned to 'as' variable
print("Cleanup")
async def test_flight_booking(client: Client):
# load_dotenv("test_flights_single.env")
with my_context() as value:
print(f"Working with {value}")
# Create the test environment
# env = await WorkflowEnvironment.start_local()
# client = env.client
task_queue_name = str(uuid.uuid4())
workflow_id = str(uuid.uuid4())
# Create mock activity functions with proper signatures
@activity.defn(name="get_wf_env_vars")
async def mock_get_wf_env_vars(input: EnvLookupInput) -> EnvLookupOutput:
return EnvLookupOutput(show_confirm=True, multi_goal_mode=True)
@activity.defn(name="agent_validatePrompt")
async def mock_agent_validatePrompt(
validation_input: ValidationInput,
) -> ValidationResult:
return ValidationResult(validationResult=True, validationFailedReason={})
@activity.defn(name="agent_toolPlanner")
async def mock_agent_toolPlanner(input: ToolPromptInput) -> dict:
return {"next": "done", "response": "Test response from LLM"}
@activity.defn(name="mcp_list_tools")
async def mock_mcp_list_tools(
server_definition: MCPServerDefinition,
include_tools: Optional[List[str]] = None,
) -> Dict[str, Any]:
return {"success": True, "tools": {}, "server_name": "test"}
@activity.defn(name="mcp_tool_activity")
async def mock_mcp_tool_activity(
tool_name: str, tool_args: Dict[str, Any]
) -> Dict[str, Any]:
return {"success": True, "result": "Mock MCP tool result"}
@activity.defn(name="dynamic_tool_activity", dynamic=True)
async def mock_dynamic_tool_activity(args: Sequence[RawValue]) -> dict:
return {"success": True, "result": "Mock dynamic tool result"}
with concurrent.futures.ThreadPoolExecutor(
max_workers=100
) as activity_executor:
worker = Worker(
client,
task_queue=task_queue_name,
workflows=[AgentGoalWorkflow],
activities=[
mock_get_wf_env_vars,
mock_agent_validatePrompt,
mock_agent_toolPlanner,
mock_mcp_list_tools,
mock_mcp_tool_activity,
mock_dynamic_tool_activity,
],
activity_executor=activity_executor,
)
async with worker:
initial_agent_goal = get_initial_agent_goal()
# Create combined input
combined_input = CombinedInput(
tool_params=AgentGoalWorkflowParams(None, None),
agent_goal=initial_agent_goal,
)
prompt = "Hello!"
# async with Worker(client, task_queue=task_queue_name, workflows=[AgentGoalWorkflow], activities=[ToolActivities.agent_validatePrompt, ToolActivities.agent_toolPlanner, dynamic_tool_activity]):
# todo set goal categories for scenarios
handle = await client.start_workflow(
AgentGoalWorkflow.run,
combined_input,
id=workflow_id,
task_queue=task_queue_name,
start_signal="user_prompt",
start_signal_args=[prompt],
)
# todo send signals to simulate user input
# await handle.signal(AgentGoalWorkflow.user_prompt, "book flights") # for multi-goal
await handle.signal(
AgentGoalWorkflow.user_prompt, "sydney in september"
)
assert (
WorkflowExecutionStatus.RUNNING == (await handle.describe()).status
)
# assert ["Hello, user1", "Hello, user2"] == await handle.result()
await handle.signal(
AgentGoalWorkflow.user_prompt, "I'm all set, end conversation"
)
# assert WorkflowExecutionStatus.COMPLETED == (await handle.describe()).status
result = await handle.result()
print(f"Workflow result: {result}")
# todo dump workflow history for analysis optional
# todo assert result is good


@@ -1,9 +1,9 @@
-from http.server import HTTPServer, BaseHTTPRequestHandler
-from urllib.parse import parse_qs, urlparse
 import json
-import time
 import random
 import string
+import time
+from http.server import BaseHTTPRequestHandler, HTTPServer
+from urllib.parse import parse_qs, urlparse
 def parse_datetime(datetime_str):
@@ -213,4 +213,4 @@ def run_server():
 if __name__ == "__main__":
-    run_server()
+    run_server()


@@ -1,9 +1,25 @@
+from .change_goal import change_goal
+from .create_invoice import create_invoice
+from .ecommerce.get_order import get_order
+from .ecommerce.list_orders import list_orders
+from .ecommerce.track_package import track_package
+from .fin.check_account_valid import check_account_valid
+from .fin.get_account_balances import get_account_balance
+from .fin.move_money import move_money
+from .fin.submit_loan_application import submit_loan_application
+from .find_events import find_events
+from .food.add_to_cart import add_to_cart
+from .give_hint import give_hint
+from .guess_location import guess_location
+from .hr.book_pto import book_pto
+from .hr.checkpaybankstatus import checkpaybankstatus
+from .hr.current_pto import current_pto
+from .hr.future_pto_calc import future_pto_calc
+from .list_agents import list_agents
 from .search_fixtures import search_fixtures
 from .search_flights import search_flights
-from .search_trains import search_trains
-from .search_trains import book_trains
-from .create_invoice import create_invoice
-from .find_events import find_events
+from .search_trains import book_trains, search_trains
+from .transfer_control import transfer_control
 def get_handler(tool_name: str):
@@ -19,5 +35,39 @@ def get_handler(tool_name: str):
         return create_invoice
     if tool_name == "FindEvents":
         return find_events
+    if tool_name == "ListAgents":
+        return list_agents
+    if tool_name == "ChangeGoal":
+        return change_goal
+    if tool_name == "TransferControl":
+        return transfer_control
+    if tool_name == "CurrentPTO":
+        return current_pto
+    if tool_name == "BookPTO":
+        return book_pto
+    if tool_name == "FuturePTOCalc":
+        return future_pto_calc
+    if tool_name == "CheckPayBankStatus":
+        return checkpaybankstatus
+    if tool_name == "FinCheckAccountIsValid":
+        return check_account_valid
+    if tool_name == "FinCheckAccountBalance":
+        return get_account_balance
+    if tool_name == "FinMoveMoney":
+        return move_money
+    if tool_name == "FinCheckAccountSubmitLoanApproval":
+        return submit_loan_application
+    if tool_name == "GetOrder":
+        return get_order
+    if tool_name == "TrackPackage":
+        return track_package
+    if tool_name == "ListOrders":
+        return list_orders
+    if tool_name == "GiveHint":
+        return give_hint
+    if tool_name == "GuessLocation":
+        return guess_location
+    if tool_name == "AddToCart":
+        return add_to_cart
     raise ValueError(f"Unknown tool: {tool_name}")
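`get_handler` maps a tool name to a callable through a chain of `if` statements and raises `ValueError` for unknown names. As the chain grows, a lookup table is a common alternative. The sketch below is not the repository's code: the stub handlers and the `HANDLERS` and `get_handler_sketch` names are invented for illustration, and only two of the tool names from the diff are included.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


def _list_agents_stub(args: dict) -> dict:
    """Stand-in for the real list_agents handler."""
    return {"tool": "ListAgents"}


def _add_to_cart_stub(args: dict) -> dict:
    """Stand-in for the real add_to_cart handler."""
    return {"tool": "AddToCart"}


# One entry per tool name; adding a tool becomes a one-line change.
HANDLERS: Dict[str, Handler] = {
    "ListAgents": _list_agents_stub,
    "AddToCart": _add_to_cart_stub,
}


def get_handler_sketch(tool_name: str) -> Handler:
    """Look up a handler, keeping the original ValueError for unknown names."""
    try:
        return HANDLERS[tool_name]
    except KeyError:
        raise ValueError(f"Unknown tool: {tool_name}") from None
```

The error message and exception type match the existing function, so callers that catch `ValueError` would not need to change.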

tools/change_goal.py Normal file

@@ -0,0 +1,8 @@
def change_goal(args: dict) -> dict:
new_goal = args.get("goalID")
if new_goal is None:
new_goal = "goal_choose_agent_type"
return {
"new_goal": new_goal,
}
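A short usage check of `change_goal`, with the function body copied from the file above. The goal ID `"goal_example"` is a made-up value for illustration; only `"goal_choose_agent_type"` comes from the source.

```python
def change_goal(args: dict) -> dict:
    new_goal = args.get("goalID")
    if new_goal is None:
        # Fall back to the agent-selection goal when no goalID is supplied
        new_goal = "goal_choose_agent_type"
    return {
        "new_goal": new_goal,
    }


# An explicit goalID is passed straight through.
print(change_goal({"goalID": "goal_example"}))  # {'new_goal': 'goal_example'}

# A missing goalID selects the default goal.
print(change_goal({}))  # {'new_goal': 'goal_choose_agent_type'}
```

Only a missing key or an explicit `None` triggers the default; an empty string would be returned unchanged.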


@@ -1,16 +1,14 @@
import os
import stripe
from dotenv import load_dotenv
load_dotenv(override=True) # Load environment variables from a .env file
stripe.api_key = os.getenv("STRIPE_API_KEY", "YOUR_DEFAULT_KEY")
def ensure_customer_exists(
customer_id: str = None, email: str = "default@example.com"
) -> str:
"""Ensure a Stripe customer exists; create one if not."""
import stripe
if customer_id:
try:
stripe.Customer.retrieve(customer_id)
@@ -26,50 +24,52 @@ def ensure_customer_exists(
def create_invoice(args: dict) -> dict:
"""Create and finalize a Stripe invoice."""
# Find or create customer
customer_id = ensure_customer_exists(
args.get("customer_id"), args.get("email", "default@example.com")
)
import stripe
# Get amount and convert to cents
amount = args.get("amount", 200.00) # Default to $200.00
try:
amount_cents = int(float(amount) * 100)
except (TypeError, ValueError):
return {"error": "Invalid amount provided. Please confirm the amount."}
# Load environment variables and configure stripe
load_dotenv(override=True)
stripe.api_key = os.getenv("STRIPE_API_KEY")
# Create an invoice item
stripe.InvoiceItem.create(
customer=customer_id,
amount=amount_cents,
currency="gbp",
description=args.get("tripDetails", "Service Invoice"),
)
# If an API key exists in the env file, find or create customer
if stripe.api_key is not None and stripe.api_key != "":
customer_id = ensure_customer_exists(
args.get("customer_id"), args.get("email", "default@example.com")
)
# Create and finalize the invoice
invoice = stripe.Invoice.create(
customer=customer_id,
collection_method="send_invoice", # Invoice is sent to the customer
days_until_due=args.get("days_until_due", 7), # Default due date: 7 days
pending_invoice_items_behavior="include", # No pending invoice items
)
finalized_invoice = stripe.Invoice.finalize_invoice(invoice.id)
# Get amount and convert to cents
amount = args.get("amount", 200.00) # Default to $200.00
try:
amount_cents = int(float(amount) * 100)
except (TypeError, ValueError):
return {"error": "Invalid amount provided. Please confirm the amount."}
return {
"invoiceStatus": finalized_invoice.status,
"invoiceURL": finalized_invoice.hosted_invoice_url,
"reference": finalized_invoice.number,
}
# Create an invoice item
stripe.InvoiceItem.create(
customer=customer_id,
amount=amount_cents,
currency="gbp",
description=args.get("tripDetails", "Service Invoice"),
)
# Create and finalize the invoice
invoice = stripe.Invoice.create(
customer=customer_id,
collection_method="send_invoice", # Invoice is sent to the customer
days_until_due=args.get("days_until_due", 7), # Default due date: 7 days
pending_invoice_items_behavior="include", # Include pending invoice items
)
finalized_invoice = stripe.Invoice.finalize_invoice(invoice.id)
def create_invoice_example(args: dict) -> dict:
"""
This is an example implementation of the CreateInvoice tool
Doesn't call any external services, just returns a dummy response
"""
print("[CreateInvoice] Creating invoice with:", args)
return {
"invoiceStatus": "generated",
"invoiceURL": "https://pay.example.com/invoice/12345",
"reference": "INV-12345",
}
return {
"invoiceStatus": finalized_invoice.status,
"invoiceURL": finalized_invoice.hosted_invoice_url,
"reference": finalized_invoice.number,
}
# if no API key is in the env file, return dummy info
else:
print("[CreateInvoice] Creating invoice with:", args)
return {
"invoiceStatus": "generated",
"invoiceURL": "https://pay.example.com/invoice/12345",
"reference": "INV-12345",
}


@@ -0,0 +1,58 @@
{
"accounts": [
{
"name": "Matt Murdock",
"email": "matt.murdock@nelsonmurdock.com",
"account_id": "11235",
"checking_balance": "1275.4",
"savings_balance": "2800.15",
"bitcoin_balance": 0.1378,
"account_creation_date": "2014-03-10"
},
{
"name": "Foggy Nelson",
"email": "foggy.nelson@nelsonmurdock.com",
"account_id": "112358",
"checking_balance": "1523.66",
"savings_balance": "4875.89",
"bitcoin_balance": 0.0923,
"account_creation_date": "2014-03-10"
},
{
"name": "Karen Page",
"email": "karen.page@nelsonmurdock.com",
"account_id": "112",
"checking_balance": 645.25,
"savings_balance": "830.5",
"bitcoin_balance": 0.0456,
"account_creation_date": "2015-01-15"
},
{
"name": "Wilson Fisk",
"email": "wilson.fisk@fiskcorp.com",
"account_id": "11",
"checking_balance": 25000.0,
"savings_balance": 150000.75,
"bitcoin_balance": 5987.6721,
"account_creation_date": "2013-09-20"
},
{
"name": "Frank Castle",
"email": "frank.castle@vigilante.net",
"account_id": "1",
"checking_balance": 320.1,
"savings_balance": 0.3,
"bitcoin_balance": 15.2189,
"account_creation_date": "2016-02-05"
},
{
"name": "Joshua Smith",
"email": "joshmsmith@gmail.com",
"account_id": "11235813",
"checking_balance": 3021.9,
"savings_balance": 500.5,
"bitcoin_balance": 0.001,
"account_creation_date": "2020-03-19"
}
]
}


@@ -0,0 +1,81 @@
{
"orders": [
{
"id": "100",
"summary": "Lawyer Books",
"email": "matt.murdock@nelsonmurdock.com",
"status": "cancelled",
"order_date": "2025-03-30",
"last_update": "2025-04-01"
},
{
"id": "101",
"summary": "Bonking Sticks",
"email": "matt.murdock@nelsonmurdock.com",
"status": "paid",
"order_date": "2025-04-01",
"last_order_update": "2025-04-01"
},
{
"id": "102",
"summary": "Red Sunglasses",
"email": "matt.murdock@nelsonmurdock.com",
"status": "shipped",
"order_date": "2025-04-01",
"last_order_update": "2025-04-01",
"tracking_id": "UPS67890"
},
{
"id": "200",
"summary": "Paper",
"email": "foggy.nelson@nelsonmurdock.com",
"status": "shipped",
"order_date": "2025-04-03",
"last_update": "2025-04-06",
"tracking_id": "USPS12345"
},
{
"id": "300",
"summary": "Chemistry Books",
"email": "heisenberg@blue-meth.com",
"status": "shipped",
"order_date": "2025-03-30",
"last_update": "2025-04-06",
"tracking_id": "USPS12345"
},
{
"id": "301",
"summary": "Book: Being a Cool Bro",
"email": "heisenberg@blue-meth.com",
"status": "cancelled",
"order_date": "2025-04-01",
"last_update": "2025-04-02"
},
{
"id": "302",
"summary": "Black Hat",
"email": "heisenberg@blue-meth.com",
"status": "delivered",
"order_date": "2025-04-01",
"last_update": "2025-04-06",
"tracking_id": "UPS67890"
},
{
"id": "400",
"summary": "Giant Graphic Hoodie",
"email": "jessenotpinkman@blue-meth.com",
"status": "shipped",
"order_date": "2025-04-03",
"last_update": "2025-04-09",
"tracking_id": "UPS67890"
},
{
"id": "401",
"summary": "Giant Pants",
"email": "jessenotpinkman@blue-meth.com",
"status": "processing",
"order_date": "2025-04-03",
"last_update": "2025-04-09"
}
]
}


@@ -0,0 +1,158 @@
{
"packages": [
{
"TrackingNumber": "USPS12345",
"Delivered": false,
"Carrier": "USPS",
"ServiceType": "USPS Ground Advantage<SUP>&#153;</SUP>",
"PickupDate": "",
"ScheduledDeliveryDate": "April 14, 2025",
"ScheduledDeliveryDateInDateTimeFromat": "2025-04-14T00:00:00",
"StatusCode": "In Transit from Origin Processing",
"Status": "Departed Post Office",
"StatusSummary": "Your item has left our acceptance facility and is in transit to a sorting facility on April 10, 2025 at 7:06 am in IRON RIDGE, WI 53035.",
"Message": "",
"DeliveredDateTime": "",
"DeliveredDateTimeInDateTimeFormat": null,
"SignatureName": "",
"DestinationCity": "CITY",
"DestinationState": "ST",
"DestinationZip": "12345",
"DestinationCountry": null,
"EventDate": "2025-04-10T07:06:00",
"TrackingDetails": [
{
"EventDateTime": "April 10, 2025 7:06 am",
"Event": "Departed Post Office",
"EventAddress": "IRON RIDGE WI 53035",
"State": "WI",
"City": "IRON RIDGE",
"Zip": "53035",
"EventDateTimeInDateTimeFormat": "2025-04-10T07:06:00"
},
{
"EventDateTime": "April 9, 2025 11:29 am",
"Event": "USPS picked up item",
"EventAddress": "IRON RIDGE WI 53035",
"State": "WI",
"City": "IRON RIDGE",
"Zip": "53035",
"EventDateTimeInDateTimeFormat": "2025-04-09T11:29:00"
},
{
"EventDateTime": "April 7, 2025 6:29 am",
"Event": "Shipping Label Created, USPS Awaiting Item",
"EventAddress": "IRON RIDGE WI 53035",
"State": "WI",
"City": "IRON RIDGE",
"Zip": "53035",
"EventDateTimeInDateTimeFormat": "2025-04-07T06:29:00"
}
]
},
{
"TrackingNumber": "UPS67890",
"Delivered": true,
"Carrier": "UPS",
"ServiceType": "UPS Ground Saver®",
"PickupDate": "",
"ScheduledDeliveryDate": "",
"ScheduledDeliveryDateInDateTimeFromat": null,
"StatusCode": "D",
"Status": "DELIVERED",
"StatusSummary": "DELIVERED",
"Message": "",
"DeliveredDateTime": "20250415 154315",
"DeliveredDateTimeInDateTimeFormat": "2025-04-15T15:43:15",
"SignatureName": "",
"DestinationCity": "CHICAGO",
"DestinationState": "IL",
"DestinationZip": "",
"DestinationCountry": "US",
"EventDate": "2025-04-15T15:43:15",
"TrackingDetails": [
{
"EventDateTime": "20250415 154315",
"Event": "DELIVERED ",
"EventAddress": "CHICAGO IL US",
"State": "IL",
"City": "CHICAGO",
"Zip": null,
"EventDateTimeInDateTimeFormat": "2025-04-15T15:43:15"
},
{
"EventDateTime": "20250415 090938",
"Event": "Out For Delivery Today",
"EventAddress": "Chicago IL US",
"State": "IL",
"City": "Chicago",
"Zip": null,
"EventDateTimeInDateTimeFormat": "2025-04-15T09:09:38"
},
{
"EventDateTime": "20250415 074141",
"Event": "Loaded on Delivery Vehicle ",
"EventAddress": "Chicago IL US",
"State": "IL",
"City": "Chicago",
"Zip": null,
"EventDateTimeInDateTimeFormat": "2025-04-15T07:41:41"
},
{
"EventDateTime": "20250415 032200",
"Event": "Arrived at Facility",
"EventAddress": "Chicago IL US",
"State": "IL",
"City": "Chicago",
"Zip": null,
"EventDateTimeInDateTimeFormat": "2025-04-15T03:22:00"
},
{
"EventDateTime": "20250414 223000",
"Event": "Departed from Facility",
"EventAddress": "Hodgkins IL US",
"State": "IL",
"City": "Hodgkins",
"Zip": null,
"EventDateTimeInDateTimeFormat": "2025-04-14T22:30:00"
},
{
"EventDateTime": "20250414 002700",
"Event": "Arrived at Facility",
"EventAddress": "Hodgkins IL US",
"State": "IL",
"City": "Hodgkins",
"Zip": null,
"EventDateTimeInDateTimeFormat": "2025-04-14T00:27:00"
},
{
"EventDateTime": "20250410 211700",
"Event": "Departed from Facility",
"EventAddress": "Las Vegas NV US",
"State": "NV",
"City": "Las Vegas",
"Zip": null,
"EventDateTimeInDateTimeFormat": "2025-04-10T21:17:00"
},
{
"EventDateTime": "20250410 132625",
"Event": "Arrived at Facility",
"EventAddress": "Las Vegas NV US",
"State": "NV",
"City": "Las Vegas",
"Zip": null,
"EventDateTimeInDateTimeFormat": "2025-04-10T13:26:25"
},
{
"EventDateTime": "20250409 100659",
"Event": "Shipper created a label, UPS has not received the package yet. ",
"EventAddress": " US",
"State": null,
"City": null,
"Zip": null,
"EventDateTimeInDateTimeFormat": "2025-04-09T10:06:59"
}
]
}
]
}


@@ -0,0 +1,27 @@
{
"theCompany": {
"weLove": "theCompany",
"employees": [
{
"email": "josh.smith@temporal.io",
"currentPTOHrs": 400,
"hrsAddedPerMonth": 8
},
{
"email": "laine@awesome.com",
"currentPTOHrs": 40,
"hrsAddedPerMonth": 12
},
{
"email": "steve.this.is.for.you@gmail.com",
"currentPTOHrs": 4000,
"hrsAddedPerMonth": 20
},
{
"email": "your_email_here@yourcompany.com",
"currentPTOHrs": 150,
"hrsAddedPerMonth": 19
}
]
}
}


@@ -0,0 +1,25 @@
import json
from pathlib import Path


# Demo implementation backed by a local JSON file. In production this could be an
# API call made from a Temporal activity, which would give it automatic retries.
def get_order(args: dict) -> dict:
    order_id = args.get("order_id")

    file_path = (
        Path(__file__).resolve().parent.parent / "data" / "customer_order_data.json"
    )
    if not file_path.exists():
        return {"error": "Data file not found."}

    with open(file_path, "r") as file:
        data = json.load(file)
    order_list = data["orders"]

    for order in order_list:
        if order["id"] == order_id:
            return order

    # f-string avoids a TypeError when order_id is missing (None)
    return {"error": f"Order {order_id} not found."}
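
The lookup above mixes file loading with the search. A minimal sketch of one way to separate the two, so the search can be exercised without the real data file. `load_records` and `find_order` are hypothetical names, not part of this commit, and the temporary directory stands in for the repository's `data` folder:

```python
import json
import tempfile
from pathlib import Path


def load_records(data_dir: Path, filename: str, key: str) -> list:
    """Hypothetical helper: return the list stored under `key`, or [] if the file is missing."""
    file_path = data_dir / filename
    if not file_path.exists():
        return []
    with open(file_path, "r") as file:
        return json.load(file).get(key, [])


def find_order(orders: list, order_id) -> dict:
    """Hypothetical helper: return the matching order or an error dict."""
    for order in orders:
        if order["id"] == order_id:
            return order
    return {"error": f"Order {order_id} not found."}


# Demo against a throwaway copy of the data layout used in this commit
with tempfile.TemporaryDirectory() as tmp:
    sample = {"orders": [{"id": "100", "summary": "Lawyer Books", "status": "cancelled"}]}
    (Path(tmp) / "customer_order_data.json").write_text(json.dumps(sample))
    demo_orders = load_records(Path(tmp), "customer_order_data.json", "orders")
    print(find_order(demo_orders, "100"))
    print(find_order(demo_orders, "999"))
```

Keeping the search free of file access is a design choice for testability; the committed tool does both in one function.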


@@ -0,0 +1,32 @@
import json
from pathlib import Path


def sorting(e):
    # Sort key: ISO-formatted dates sort correctly as strings
    return e["order_date"]


def list_orders(args: dict) -> dict:
    email_address = args.get("email_address")

    file_path = (
        Path(__file__).resolve().parent.parent / "data" / "customer_order_data.json"
    )
    if not file_path.exists():
        return {"error": "Data file not found."}

    with open(file_path, "r") as file:
        data = json.load(file)
    order_list = data["orders"]

    rtn_order_list = []
    for order in order_list:
        if order["email"] == email_address:
            rtn_order_list.append(order)

    if len(rtn_order_list) > 0:
        rtn_order_list.sort(key=sorting)
        return {"orders": rtn_order_list}
    else:
        # f-string avoids a TypeError when email_address is missing (None)
        return {"error": f"No orders for customer {email_address} found."}
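
The filter-then-sort step above can also be written with a list comprehension and `sorted()`. This is only a sketch: `orders_for_customer` is a hypothetical name, and the sample records are a trimmed subset of the order data added in this commit.

```python
from operator import itemgetter

# Trimmed sample of customer_order_data.json (only the fields used here)
SAMPLE_ORDERS = [
    {"id": "102", "email": "matt.murdock@nelsonmurdock.com", "order_date": "2025-04-01"},
    {"id": "100", "email": "matt.murdock@nelsonmurdock.com", "order_date": "2025-03-30"},
    {"id": "200", "email": "foggy.nelson@nelsonmurdock.com", "order_date": "2025-04-03"},
]


def orders_for_customer(orders: list, email_address: str) -> list:
    """Hypothetical helper: a customer's orders, oldest first."""
    matches = [order for order in orders if order["email"] == email_address]
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings
    return sorted(matches, key=itemgetter("order_date"))


for order in orders_for_customer(SAMPLE_ORDERS, "matt.murdock@nelsonmurdock.com"):
    print(order["id"], order["order_date"])
```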


@@ -0,0 +1,161 @@
import http.client
import json
import os
from pathlib import Path
# Returns dummy data in the real API's response format. To use the real API: 1) rename this function to track_package_fake, and 2) rename track_package_real below to track_package.
def track_package(args: dict) -> dict:
tracking_id = args.get("tracking_id")
file_path = (
Path(__file__).resolve().parent.parent / "data" / "dummy_tracking_data.json"
)
if not file_path.exists():
return {"error": "Data file not found."}
with open(file_path, "r") as file:
data = json.load(file)
package_list = data["packages"]
for package in package_list:
if package["TrackingNumber"] == tracking_id:
scheduled_delivery_date = package["ScheduledDeliveryDate"]
carrier = package["Carrier"]
status_summary = package["StatusSummary"]
tracking_details = package.get("TrackingDetails", [])
last_tracking_update = ""
if (
tracking_details
and tracking_details is not None
and tracking_details[0] is not None
):
last_tracking_update = tracking_details[0][
"EventDateTimeInDateTimeFormat"
]
tracking_link = ""
if carrier == "USPS":
tracking_link = f"https://tools.usps.com/go/TrackConfirmAction?qtc_tLabels1={tracking_id}"
elif carrier == "UPS":
tracking_link = (
f"https://www.ups.com/track?track=yes&trackNums={tracking_id}"
)
return {
"scheduled_delivery_date": scheduled_delivery_date,
"carrier": carrier,
"status_summary": status_summary,
"tracking_link": tracking_link,
"last_tracking_update": last_tracking_update,
}
return_msg = f"Package not found with tracking info {tracking_id}"
return {"error": return_msg}
"""Format of response:
{
"TrackingNumber": "",
"Delivered": false,
"Carrier": "USPS",
"ServiceType": "USPS Ground Advantage<SUP>&#153;</SUP>",
"PickupDate": "",
"ScheduledDeliveryDate": "April 14, 2025",
"ScheduledDeliveryDateInDateTimeFromat": "2025-04-14T00:00:00",
"StatusCode": "In Transit from Origin Processing",
"Status": "Departed Post Office",
"StatusSummary": "Your item has left our acceptance facility and is in transit to a sorting facility on April 10, 2025 at 7:06 am in IRON RIDGE, WI 53035.",
"Message": "",
"DeliveredDateTime": "",
"DeliveredDateTimeInDateTimeFormat": null,
"SignatureName": "",
"DestinationCity": "CITY",
"DestinationState": "ST",
"DestinationZip": "12345",
"DestinationCountry": null,
"EventDate": "2025-04-10T07:06:00",
"TrackingDetails": [
{
"EventDateTime": "April 10, 2025 7:06 am",
"Event": "Departed Post Office",
"EventAddress": "IRON RIDGE WI 53035",
"State": "WI",
"City": "IRON RIDGE",
"Zip": "53035",
"EventDateTimeInDateTimeFormat": "2025-04-10T07:06:00"
},
{
"EventDateTime": "April 9, 2025 11:29 am",
"Event": "USPS picked up item",
"EventAddress": "IRON RIDGE WI 53035",
"State": "WI",
"City": "IRON RIDGE",
"Zip": "53035",
"EventDateTimeInDateTimeFormat": "2025-04-09T11:29:00"
},
{
"EventDateTime": "April 7, 2025 6:29 am",
"Event": "Shipping Label Created, USPS Awaiting Item",
"EventAddress": "IRON RIDGE WI 53035",
"State": "WI",
"City": "IRON RIDGE",
"Zip": "53035",
"EventDateTimeInDateTimeFormat": "2025-04-07T06:29:00"
}
]
}
"""
def track_package_real(args: dict) -> dict:
tracking_id = args.get("tracking_id")
api_key = os.getenv("RAPIDAPI_KEY")
api_host = os.getenv("RAPIDAPI_HOST_PACKAGE", "trackingpackage.p.rapidapi.com")
conn = http.client.HTTPSConnection(api_host)
headers = {
"x-rapidapi-key": api_key,
"x-rapidapi-host": api_host,
"Authorization": "Basic Ym9sZGNoYXQ6TGZYfm0zY2d1QzkuKz9SLw==",
}
path = f"/TrackingPackage?trackingNumber={tracking_id}"
conn.request("GET", path, headers=headers)
res = conn.getresponse()
data = res.read()
data_decoded = data.decode("utf-8")
conn.close()
try:
json_data = json.loads(data_decoded)
except json.JSONDecodeError:
return {"error": "Invalid JSON response"}
scheduled_delivery_date = json_data["ScheduledDeliveryDate"]
carrier = json_data["Carrier"]
status_summary = json_data["StatusSummary"]
tracking_details = json_data.get("TrackingDetails", [])
last_tracking_update = ""
if (
tracking_details
and tracking_details is not None
and tracking_details[0] is not None
):
last_tracking_update = tracking_details[0]["EventDateTimeInDateTimeFormat"]
tracking_link = ""
if carrier == "USPS":
tracking_link = (
f"https://tools.usps.com/go/TrackConfirmAction?qtc_tLabels1={tracking_id}"
)
elif carrier == "UPS":
tracking_link = f"https://www.ups.com/track?track=yes&trackNums={tracking_id}"
return {
"scheduled_delivery_date": scheduled_delivery_date,
"carrier": carrier,
"status_summary": status_summary,
"tracking_link": tracking_link,
"last_tracking_update": last_tracking_update,
}
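
Both functions above take `tracking_details[0]` as the latest update, which works for the sample data because its events are listed newest first. If that ordering is not guaranteed (an assumption about the upstream API, not something this commit states), a sketch like the following picks the newest timestamp explicitly. `latest_event_time` is a hypothetical helper.

```python
from datetime import datetime


def latest_event_time(tracking_details) -> str:
    """Hypothetical helper: newest ISO timestamp in tracking_details, or "" if none."""
    stamps = [
        event["EventDateTimeInDateTimeFormat"]
        for event in (tracking_details or [])
        if event and event.get("EventDateTimeInDateTimeFormat")
    ]
    if not stamps:
        return ""
    # Compare as datetimes rather than relying on list position
    return max(stamps, key=datetime.fromisoformat)


events = [
    {"Event": "USPS picked up item", "EventDateTimeInDateTimeFormat": "2025-04-09T11:29:00"},
    {"Event": "Departed Post Office", "EventDateTimeInDateTimeFormat": "2025-04-10T07:06:00"},
    {"Event": "Shipping Label Created", "EventDateTimeInDateTimeFormat": "2025-04-07T06:29:00"},
]
print(latest_event_time(events))
```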


@@ -0,0 +1,31 @@
import json
from pathlib import Path


# Demo implementation backed by a local JSON file. In production this could be an
# API call made from a Temporal activity, which would give it automatic retries.
def check_account_valid(args: dict) -> dict:
    email = args.get("email")
    account_id = args.get("account_id")

    file_path = (
        Path(__file__).resolve().parent.parent / "data" / "customer_account_data.json"
    )
    if not file_path.exists():
        return {"error": "Data file not found."}

    with open(file_path, "r") as file:
        data = json.load(file)
    account_list = data["accounts"]

    for account in account_list:
        if account["email"] == email or account["account_id"] == account_id:
            return {"status": "account valid"}

    # f-string avoids a TypeError when email or account_id is missing (None)
    return_msg = (
        f"Account not found with email address {email} or account ID: {account_id}"
    )
    return {"error": return_msg}


@@ -0,0 +1,33 @@
import json
from pathlib import Path


# Demo implementation backed by a local JSON file; in production this could be an API call.
# Assumes the account is valid: use check_account_valid() to verify that first.
def get_account_balance(args: dict) -> dict:
    account_key = args.get("email_address_or_account_ID")

    file_path = (
        Path(__file__).resolve().parent.parent / "data" / "customer_account_data.json"
    )
    if not file_path.exists():
        return {"error": "Data file not found."}

    with open(file_path, "r") as file:
        data = json.load(file)
    account_list = data["accounts"]

    for account in account_list:
        if account["email"] == account_key or account["account_id"] == account_key:
            return {
                "name": account["name"],
                "email": account["email"],
                "account_id": account["account_id"],
                "checking_balance": account["checking_balance"],
                "savings_balance": account["savings_balance"],
                "bitcoin_balance": account["bitcoin_balance"],
                "account_creation_date": account["account_creation_date"],
            }

    # f-string avoids a TypeError when account_key is missing (None)
    return {"error": f"Account not found for {account_key}"}

tools/fin/move_money.py Normal file

@@ -0,0 +1,151 @@
import json
import os
from dataclasses import dataclass
from pathlib import Path
from temporalio.exceptions import WorkflowAlreadyStartedError
from shared.config import get_temporal_client
# enums for the java enum
# class ExecutionScenarios(Enum):
# HAPPY_PATH = 0
# ADVANCED_VISIBILITY = auto() # 1
# HUMAN_IN_LOOP = auto() # 2
# API_DOWNTIME = auto() # 3
# BUG_IN_WORKFLOW = auto() # 4
# INVALID_ACCOUNT = auto() # 5
# these dataclasses are for calling the Temporal Workflow
# Python equivalent of the workflow we're calling's Java WorkflowParameterObj
@dataclass
class MoneyMovementWorkflowParameterObj:
amount: int # Using snake_case as per Python conventions
scenario: str
# this is made to demonstrate functionality but it could just as durably be an API call
# this assumes it's a valid account - use check_account_valid() to verify that first
async def move_money(args: dict) -> dict:
account_key = args.get("email_address_or_account_ID")
account_type: str = args.get("accounttype")
amount = args.get("amount")
destinationaccount = args.get("destinationaccount")
file_path = (
Path(__file__).resolve().parent.parent / "data" / "customer_account_data.json"
)
if not file_path.exists():
return {"error": "Data file not found."}
with open(file_path, "r") as file:
data = json.load(file)
account_list = data["accounts"]
for account in account_list:
if account["email"] == account_key or account["account_id"] == account_key:
amount_str: str = str(amount)
from_account_combo = account_key + account_type
transfer_workflow_id = await start_workflow(
amount_cents=str_dollars_to_cents(amount_str),
from_account_name=from_account_combo,
to_account_name=destinationaccount,
)
if account_type.casefold() == "checking":
from_key = "checking_balance"
elif account_type.casefold() == "savings":
from_key = "savings_balance"
else:
return_msg = "Money order for account types other than checking or savings is not implemented."
return {"error": return_msg}
to_key = (
"savings_balance"
if destinationaccount.casefold() == "savings"
else "checking_balance"
)
# Update from-account balance
from_balance = float(str_dollars_to_cents(str(account[from_key])))
from_balance -= float(str_dollars_to_cents(amount_str))
account[from_key] = str(from_balance / 100)
# Update destination-account balance
to_balance = float(str_dollars_to_cents(str(account[to_key])))
to_balance += float(str_dollars_to_cents(amount_str))
account[to_key] = str(to_balance / 100)
with open(file_path, "w") as file:
json.dump(data, file, indent=4)
return {
"status": "money movement complete",
"confirmation id": transfer_workflow_id,
"new_balance": account[from_key],
"destination_balance": account[to_key],
}
return_msg = f"Account not found for {account_key}"
return {"error": return_msg}
# Async function to start workflow
async def start_workflow(
amount_cents: int, from_account_name: str, to_account_name: str
) -> str:
start_real_workflow = os.getenv("FIN_START_REAL_WORKFLOW")
if start_real_workflow is not None and start_real_workflow.lower() == "false":
START_REAL_WORKFLOW = False
else:
START_REAL_WORKFLOW = True
if START_REAL_WORKFLOW:
# Connect to Temporal
client = await get_temporal_client()
# Create the parameter object
params = MoneyMovementWorkflowParameterObj(
amount=amount_cents, scenario="HAPPY_PATH"
)
workflow_id = (
"TRANSFER-ACCT-" + from_account_name + "-TO-" + to_account_name
) # business-relevant workflow ID
try:
handle = await client.start_workflow(
"moneyTransferWorkflow", # Workflow name
params, # Workflow parameters
id=workflow_id,
task_queue="MoneyTransferJava", # Task queue name
)
return handle.id
except WorkflowAlreadyStartedError:
existing_handle = client.get_workflow_handle(workflow_id=workflow_id)
return existing_handle.id
else:
return (
"TRANSFER-ACCT-" + from_account_name + "-TO-" + to_account_name + "not-real"
)
# cleans a string dollar amount description to cents value
def str_dollars_to_cents(dollar_str: str) -> int:
try:
# Remove '$' and any whitespace
cleaned_str = dollar_str.replace("$", "").strip()
# Handle empty string or invalid input
if not cleaned_str:
raise ValueError("Empty amount provided")
# Convert to float and then to cents
amount = float(cleaned_str)
if amount < 0:
raise ValueError("Negative amounts not allowed")
return round(amount * 100)  # round() avoids float truncation (19.99 would otherwise become 1998)
except ValueError as e:
raise ValueError(f"Invalid dollar amount format: {dollar_str}") from e
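
Converting dollars to cents through `float` can come out one cent low, because values such as 19.99 have no exact binary representation. A minimal sketch of the same conversion using `decimal.Decimal` instead; `dollars_to_cents_decimal` is a hypothetical alternative, not the function used in this commit, and stripping thousands separators is an added assumption:

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP


def dollars_to_cents_decimal(dollar_str: str) -> int:
    """Hypothetical helper: parse a dollar string into integer cents without floats."""
    # Remove '$', thousands separators (an assumption), and surrounding whitespace
    cleaned = dollar_str.replace("$", "").replace(",", "").strip()
    if not cleaned:
        raise ValueError("Empty amount provided")
    try:
        amount = Decimal(cleaned)
    except InvalidOperation as e:
        raise ValueError(f"Invalid dollar amount format: {dollar_str}") from e
    if not amount.is_finite():
        raise ValueError(f"Invalid dollar amount format: {dollar_str}")
    if amount < 0:
        raise ValueError("Negative amounts not allowed")
    # Round to whole cents, halves rounding up
    cents = (amount * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(cents)


print(dollars_to_cents_decimal("$19.99"))
print(dollars_to_cents_decimal("1275.4"))
```

Working in integer cents end to end would also avoid writing balances back as `str(from_balance / 100)`, though that is a larger change than this sketch covers.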


@@ -0,0 +1,117 @@
import os
from dataclasses import dataclass
from datetime import date
from temporalio import common
from temporalio.client import WithStartWorkflowOperation, WorkflowUpdateFailedError
from shared.config import get_temporal_client
# Define data structures to match the Java workflow's expected input/output
# see https://github.com/temporal-sa/temporal-latency-optimization-scenarios for more details
@dataclass
class TransactionRequest:
amount: float
sourceAccount: str
targetAccount: str
@dataclass
class TxResult:
transactionId: str
status: str
# demonstrate starting a workflow and early return pattern while the workflow continues
async def submit_loan_application(args: dict) -> dict:
account_key = args.get("email_address_or_account_ID")
amount = args.get("amount")
loan_status: dict = await start_workflow(amount=amount, account_name=account_key)
if loan_status.get("error") is None:
return {
"status": loan_status.get("loan_application_status"),
"detailed_status": loan_status.get("application_details"),
"next_step": loan_status.get("advisement"),
"confirmation_id": loan_status.get("transaction_id"),
}
else:
print(loan_status)
return loan_status
# Async function to start workflow
async def start_workflow(
amount: str,
account_name: str,
) -> dict:
start_real_workflow = os.getenv("FIN_START_REAL_WORKFLOW")
if start_real_workflow is not None and start_real_workflow.lower() == "false":
# START_REAL_WORKFLOW = False
return {
"loan_application_status": "applied",
"application_details": "loan application is submitted and initial validation is complete",
"transaction_id": "APPLICATION" + account_name,
"advisement": "You'll receive a confirmation for final approval in three business days",
}
else:
# START_REAL_WORKFLOW = True
# Connect to Temporal
client = await get_temporal_client()
# Define the workflow ID and task queue
workflow_id = (
"LOAN_APPLICATION-" + account_name + "-" + date.today().strftime("%Y-%m-%d")
)
task_queue = "LatencyOptimizationTEST"
# Create a TransactionRequest (matching the Java workflow's expected input)
tx_request = TransactionRequest(
amount=float(amount),
targetAccount=account_name,
sourceAccount=account_name,
)
start_op = WithStartWorkflowOperation(
"TransactionWorkflowLocalBeforeUpdate",
tx_request,
id=workflow_id,
id_conflict_policy=common.WorkflowIDConflictPolicy.USE_EXISTING,
task_queue=task_queue,
)
try:
print("trying update-with-start")
tx_result = TxResult(
await client.execute_update_with_start_workflow(
"returnInitResult",
start_workflow_operation=start_op,
)
)
except WorkflowUpdateFailedError:
print("aww man got exception WorkflowUpdateFailedError")
tx_result = None
return_msg = "Loan could not be processed for " + account_name
return {"error": return_msg}
workflow_handle = await start_op.workflow_handle()
print(f"Workflow started with ID: {workflow_handle.id}")
print(tx_result)
print(
f"Update result: Transaction ID = {tx_result.transactionId}, Message = {tx_result.status}"
)
# Optionally, wait for the workflow to complete and get the final result
# final_result = await handle.result()
# print(f"Workflow completed with result: {final_result}")
# return {'status': loan_status.get("loan_status"), 'detailed_status': loan_status.get("results"), 'next_step': loan_status.get("advisement"), 'confirmation_id': loan_status.get("workflowID")}
return {
"loan_application_status": "applied",
"application_details": "loan application is submitted and initial validation is complete",
"transaction_id": tx_result.transactionId,
"advisement": "You'll receive a confirmation for final approval in three business days",
}
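
Both `start_workflow` functions in this change read `FIN_START_REAL_WORKFLOW` the same way: a real workflow is started unless the variable is set to `false` (case-insensitive). A small sketch of that check factored into one place; `should_start_real_workflow` is a hypothetical helper, not part of the commit:

```python
import os


def should_start_real_workflow(environ=os.environ) -> bool:
    """Hypothetical helper: True unless FIN_START_REAL_WORKFLOW is set to 'false'."""
    value = environ.get("FIN_START_REAL_WORKFLOW")
    return not (value is not None and value.lower() == "false")


# Passing a plain dict makes the behavior easy to check without touching the real environment
print(should_start_real_workflow({}))
print(should_start_real_workflow({"FIN_START_REAL_WORKFLOW": "False"}))
```

Note that under this rule any other value, including `0` or an empty string, still starts the real workflow.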


@@ -1,6 +1,6 @@
import json
from datetime import datetime
from pathlib import Path
import json
def find_events(args: dict) -> dict:

tools/food/add_to_cart.py Normal file

@@ -0,0 +1,33 @@
def add_to_cart(args: dict) -> dict:
    """
    Simple stateless cart tool for demo purposes.
    In production, this would use proper session storage or database.
    """
    customer_email = args.get("customer_email")
    item_name = args.get("item_name")
    item_price = float(args.get("item_price", 0))
    quantity = int(args.get("quantity", 1))
    stripe_product_id = args.get("stripe_product_id")

    # Basic validation
    if not customer_email:
        return {"error": "Customer email is required"}

    if not item_name:
        return {"error": "Item name is required"}

    if item_price <= 0:
        return {"error": "Item price must be greater than 0"}

    if quantity <= 0:
        return {"error": "Quantity must be greater than 0"}

    # For demo purposes, just acknowledge the addition
    # In a real system, this would store to session/database
    return {
        "status": "success",
        "message": f"Added {quantity} x {item_name} (${item_price}) to cart for {customer_email}",
        "item_added": {
            "name": item_name,
            "price": item_price,
            "quantity": quantity,
            "stripe_product_id": stripe_product_id,
        },
    }


@@ -0,0 +1,49 @@
import os
from dotenv import load_dotenv
def delete_food_ordering_products():
"""Archive all Stripe products with metadata use_case = food_ordering_demo (since products with prices cannot be deleted)."""
import stripe
# Load environment variables and configure stripe
load_dotenv(override=True)
stripe.api_key = os.getenv("STRIPE_API_KEY")
if not stripe.api_key:
print("Error: STRIPE_API_KEY not found in environment variables")
return
try:
# Search for products with food_ordering_demo use_case
products = stripe.Product.search(
query="metadata['use_case']:'food_ordering_demo'", limit=100
)
if not products.data:
print("No products found with use_case = food_ordering_demo")
return
archived_count = 0
for product in products.data:
try:
# Archive the product (set active=False)
stripe.Product.modify(product.id, active=False)
print(f"Archived product: {product.name} (ID: {product.id})")
archived_count += 1
except Exception as e:
print(
f"Error archiving product {product.name} (ID: {product.id}): {str(e)}"
)
print(f"\nSuccessfully archived {archived_count} products")
except Exception as e:
print(f"Error searching for products: {str(e)}")
if __name__ == "__main__":
delete_food_ordering_products()


@@ -0,0 +1,93 @@
import json
import os
from dotenv import load_dotenv
def create_stripe_products():
"""Create Stripe products and prices from the stripe_pizza_products.json file."""
import stripe
# Load environment variables and configure stripe
load_dotenv(override=True)
stripe.api_key = os.getenv("STRIPE_API_KEY")
if not stripe.api_key:
print("Error: STRIPE_API_KEY not found in environment variables")
return
# Load the products data
current_dir = os.path.dirname(__file__)
products_file = os.path.join(current_dir, "stripe_pizza_products.json")
with open(products_file, "r") as f:
products_data = json.load(f)
# Filter for food ordering demo products only
food_products = [
p
for p in products_data
if p.get("metadata", {}).get("use_case") == "food_ordering_demo"
]
created_products = []
for product_data in food_products:
try:
# Create the product with relevant fields
product = stripe.Product.create(
name=product_data["name"],
description=product_data.get("description"),
images=product_data.get("images", []),
metadata=product_data.get("metadata", {}),
type=product_data.get("type", "service"),
active=product_data.get("active", True),
)
# Create price for the product if price_info exists
price_info = product_data.get("price_info")
if price_info:
price_amount = price_info.get("amount")
currency = price_info.get("currency", "usd")
price = stripe.Price.create(
currency=currency, unit_amount=price_amount, product=product.id
)
# Set this price as the default price for the product
stripe.Product.modify(product.id, default_price=price.id)
print(
f"Created product: {product.name} (ID: {product.id}) with default price ${price_amount/100:.2f}"
)
created_products.append(
{
"name": product.name,
"id": product.id,
"price_id": price.id,
"price_amount": price_amount,
"original_id": product_data["id"],
}
)
else:
print(
f"Created product: {product.name} (ID: {product.id}) - No price defined"
)
created_products.append(
{
"name": product.name,
"id": product.id,
"original_id": product_data["id"],
}
)
except Exception as e:
print(f"Error creating product {product_data['name']}: {str(e)}")
print(f"\nSuccessfully created {len(created_products)} products with prices")
return created_products
if __name__ == "__main__":
create_stripe_products()
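
The script above only creates products whose metadata marks them as part of the food ordering demo. A sketch of that filter in isolation, runnable without a Stripe key; `filter_by_use_case` is a hypothetical name and the records are trimmed from the product JSON added in this commit:

```python
# Trimmed sample of stripe_pizza_products.json (only the fields used here)
SAMPLE_PRODUCTS = [
    {
        "name": "Soda",
        "metadata": {"use_case": "food_ordering_demo"},
        "price_info": {"amount": 349, "currency": "usd"},
    },
    {"name": "ACME Scooter Token", "metadata": {}},
]


def filter_by_use_case(products: list, use_case: str) -> list:
    """Hypothetical helper: keep products whose metadata use_case matches."""
    return [
        product
        for product in products
        # `or {}` also tolerates a null metadata field
        if (product.get("metadata") or {}).get("use_case") == use_case
    ]


for product in filter_by_use_case(SAMPLE_PRODUCTS, "food_ordering_demo"):
    amount = product["price_info"]["amount"]  # integer cents, as Stripe expects
    print(f"{product['name']}: ${amount / 100:.2f}")
```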


@@ -0,0 +1,188 @@
[
{
"id": "prod_SSWirxxS5A8gcT",
"object": "product",
"active": true,
"attributes": [],
"created": 1749360061,
"default_price": "price_1RXbfGKVZbzw7QA57Mj1akGI",
"description": "A large size bottle of cola.",
"images": [
"https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfbDJxckJKMDRnT1dDc253OHlZNWNkZkY5006Xg07kHT"
],
"livemode": false,
"marketing_features": [],
"metadata": {
"use_case": "food_ordering_demo"
},
"name": "Soda",
"price_info": {
"amount": 349,
"currency": "usd"
},
"package_dimensions": null,
"shippable": null,
"statement_descriptor": null,
"tax_code": null,
"type": "service",
"unit_label": null,
"updated": 1749360062,
"url": null
},
{
"id": "prod_SSWhxv3tUy1YOG",
"object": "product",
"active": true,
"attributes": [],
"created": 1749359978,
"default_price": "price_1RXbdvKVZbzw7QA5ARomQvaf",
"description": "Our warm, crusty bread is generously spread with a savory garlic butter and toasted to golden perfection. It's the ideal aromatic and flavorful side to accompany your main course.",
"images": [
"https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfWTdIZTBkUjNZNFQ1ZEhSVG9nRnduY1pS00XVgLRRZD"
],
"livemode": false,
"marketing_features": [],
"metadata": {
"use_case": "food_ordering_demo"
},
"name": "Garlic Bread",
"price_info": {
"amount": 799,
"currency": "usd"
},
"package_dimensions": null,
"shippable": null,
"statement_descriptor": null,
"tax_code": null,
"type": "service",
"unit_label": null,
"updated": 1749360084,
"url": null
},
{
"id": "prod_SSWgXa5bwUFCJs",
"object": "product",
"active": true,
"attributes": [],
"created": 1749359922,
"default_price": "price_1RXbd0KVZbzw7QA5Nq36vdLW",
"description": "A tribute to Italian simplicity, this pizza is topped with fresh mozzarella, a vibrant tomato sauce, and fragrant basil leaves. Each bite delivers a clean and authentic taste of Italy's most famous flavors.",
"images": [
"https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfamdmTXBFbzY0TW9rS2N0c2g0Tml2SERL00Evl60Ttq"
],
"livemode": false,
"marketing_features": [],
"metadata": {
"use_case": "food_ordering_demo"
},
"name": "Margherita Pizza",
"price_info": {
"amount": 1699,
"currency": "usd"
},
"package_dimensions": null,
"shippable": null,
"statement_descriptor": null,
"tax_code": null,
"type": "service",
"unit_label": null,
"updated": 1749359998,
"url": null
},
{
"id": "prod_SSWf738UqIJzzi",
"object": "product",
"active": true,
"attributes": [],
"created": 1749359845,
"default_price": "price_1RXbbmKVZbzw7QA53EkjV2nB",
"description": "A timeless classic featuring a generous layer of savory pepperoni over rich tomato sauce and melted mozzarella cheese. It's the perfect choice for those who love a bold, meaty flavor on a perfectly baked crust.",
"images": [
"https://files.stripe.com/links/MDB8YWNjdF8xTkJPTHVLVlpienc3UUE1fGZsX3Rlc3RfcGRHc0c4cEZYWmR2bm0zOHBOa0FWMk5t008QmCJoWr"
],
"livemode": false,
"marketing_features": [],
"metadata": {
"use_case": "food_ordering_demo"
},
"name": "Pepperoni Pizza",
"price_info": {
"amount": 2299,
"currency": "usd"
},
"package_dimensions": null,
"shippable": null,
"statement_descriptor": null,
"tax_code": null,
"type": "service",
"unit_label": null,
"updated": 1749359846,
"url": null
},
{
"id": "prod_SGMXBnatLlkJ4d",
"object": "product",
"active": true,
"attributes": [],
"created": 1746554502,
"default_price": "price_1RLpoJKVZbzw7QA5ra76Fk6g",
"description": null,
"images": [],
"livemode": false,
"marketing_features": [],
"metadata": {},
"name": "ACME Scooter Token",
"package_dimensions": null,
"shippable": null,
"statement_descriptor": null,
"tax_code": null,
"type": "service",
"unit_label": null,
"updated": 1746554503,
"url": null
},
{
"id": "prod_NxJPcqTWzXk45K",
"object": "product",
"active": true,
"attributes": [],
"created": 1684961969,
"default_price": null,
"description": "$12/Month subscription",
"images": [],
"livemode": false,
"marketing_features": [],
"metadata": {},
"name": "Starter Subscription",
"package_dimensions": null,
"shippable": null,
"statement_descriptor": null,
"tax_code": null,
"type": "service",
"unit_label": null,
"updated": 1684961969,
"url": null
},
{
"id": "prod_NxJ4KvyENd0uUu",
"object": "product",
"active": true,
"attributes": [],
"created": 1684960731,
"default_price": null,
"description": "Created with the Stripe CLI",
"images": [],
"livemode": false,
"marketing_features": [],
"metadata": {},
"name": "Temporal Money Transfer",
"package_dimensions": null,
"shippable": null,
"statement_descriptor": null,
"tax_code": null,
"type": "service",
"unit_label": null,
"updated": 1684960731,
"url": null
}
]

39
tools/give_hint.py Normal file

@@ -0,0 +1,39 @@
TREASURE_LOCATION = {
    "address": "300 Lenora",
    "city": "Seattle",
    "state_full": "Washington",
    "state_abbrev": "WA",
    "zip": "98121",
    "country": "USA",
}

HINTS = [
    "country of " + TREASURE_LOCATION["country"],
    "state of " + TREASURE_LOCATION["state_full"],
    "city of " + TREASURE_LOCATION["city"],
    "at a company HQ",
    "The company's tech traces its roots to a project called Cadence",  # thanks, Grok
    "The company offers a tool that lets developers write code as if it's running forever, no matter what crashes",  # thanks, Grok
]

''' Additional Grok provided hints about Temporal:
"This company was founded by two engineers who previously worked on a system named after a South American river at Uber."
"Their platform is all about orchestrating workflows that can survive failures, like a conductor keeping the music going."
"They offer a tool that lets developers write code as if it's running forever, no matter what crashes."
"Their mission is tied to making distributed systems feel as simple as writing a single app."
"They've got a knack for durability, both in their software and their growing reputation."
"This outfit spun out of experiences at AWS and Uber, blending cloud and ride-sharing know-how."
"Their open-source framework has a community that's ticking along, fixing bugs and adding features daily."
"They're backed by big venture capital names like Sequoia, betting on their vision for reliable software."
"The company's name might remind you of a word for something fleeting, yet their tech is built to last."'''


def give_hint(args: dict) -> dict:
    hint_total = args.get("hint_total")
    if hint_total is None:
        hint_total = 0
    # Wrap around so the hints repeat once the list is exhausted
    index = hint_total % len(HINTS)
    hint_text = HINTS[index]
    hint_total = hint_total + 1
    return {"hint_number": hint_total, "hint": hint_text}
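The modulo indexing in `give_hint` means the hints cycle once the caller has seen them all. A minimal sketch of that behavior follows; `SAMPLE_HINTS` and `next_hint` are hypothetical stand-ins written for illustration, not names from the repository, and the loop assumes the caller feeds `hint_number` back in as the next `hint_total`.

```python
# Hypothetical stand-in for HINTS; the real list lives in tools/give_hint.py.
SAMPLE_HINTS = ["hint A", "hint B", "hint C"]


def next_hint(hint_total, hints=SAMPLE_HINTS):
    """Return the hint for the current count plus the incremented count."""
    count = 0 if hint_total is None else hint_total
    return {"hint_number": count + 1, "hint": hints[count % len(hints)]}


# Simulate a caller that passes hint_number back as the next hint_total.
seen = []
total = None
for _ in range(5):
    result = next_hint(total)
    seen.append(result["hint"])
    total = result["hint_number"]
```

After five calls against a three-item list, the fourth and fifth hints repeat the first and second.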

22
tools/guess_location.py Normal file

@@ -0,0 +1,22 @@
from .give_hint import TREASURE_LOCATION


def guess_location(args: dict) -> dict:
    # Treat missing arguments as empty strings so .lower() cannot fail on None
    guess_address = (args.get("address") or "").lower()
    guess_city = (args.get("city") or "").lower()
    guess_state = (args.get("state") or "").lower()
    # Compare against the abbreviation when the guess is two letters long
    if len(guess_state) == 2:
        compare_state = TREASURE_LOCATION.get("state_abbrev").lower()
    else:
        compare_state = TREASURE_LOCATION.get("state_full").lower()
    # Check for the street address to be included in the guess to account for "st" vs "street" or leaving Street off entirely
    if (
        TREASURE_LOCATION.get("address").lower() in guess_address
        and TREASURE_LOCATION.get("city").lower() == guess_city
        and compare_state == guess_state
    ):
        return {"treasure_found": "True"}
    else:
        return {"treasure_found": "False"}
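The matching rule in `guess_location` (substring match on the street address, exact match on city, and abbreviation or full name for the state) can be exercised in isolation. The sketch below is an illustration only: `matches_location` is a hypothetical helper, the `strip()` calls are an added assumption, and the location values are copied from `TREASURE_LOCATION` above.

```python
LOCATION = {
    "address": "300 Lenora",
    "city": "Seattle",
    "state_full": "Washington",
    "state_abbrev": "WA",
}


def matches_location(address, city, state, location=LOCATION):
    """Return True when the guess matches the stored location."""
    # Missing values become empty strings so the comparison simply fails.
    address = (address or "").strip().lower()
    city = (city or "").strip().lower()
    state = (state or "").strip().lower()
    # Two-letter guesses are compared with the abbreviation.
    state_key = "state_abbrev" if len(state) == 2 else "state_full"
    return (
        location["address"].lower() in address
        and location["city"].lower() == city
        and location[state_key].lower() == state
    )
```

Because the address check is a substring test, "300 Lenora St" and "300 Lenora Street" both match, while a missing address does not.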

10
tools/hr/book_pto.py Normal file

@@ -0,0 +1,10 @@
def book_pto(args: dict) -> dict:
    email = args.get("email")
    start_date = args.get("start_date")
    end_date = args.get("end_date")
    print(
        f"[BookPTO] Totally would send an email confirmation of PTO from {start_date} to {end_date} to {email} here!"
    )
    return {"status": "success"}


@@ -0,0 +1,10 @@
def checkpaybankstatus(args: dict) -> dict:
    email = args.get("email")
    if email == "grinch@grinch.com":
        print("THE GRINCH IS FOUND!")
        return {"status": "no money for the grinch"}
    # could do logic here or look up data but for now everyone but the grinch is getting paid
    return_msg = "connected"
    return {"status": return_msg}

27
tools/hr/current_pto.py Normal file

@@ -0,0 +1,27 @@
import json
from pathlib import Path


def current_pto(args: dict) -> dict:
    email = args.get("email")
    file_path = (
        Path(__file__).resolve().parent.parent / "data" / "employee_pto_data.json"
    )
    if not file_path.exists():
        return {"error": "Data file not found."}
    # Use a context manager so the file handle is always closed
    with open(file_path) as data_file:
        data = json.load(data_file)
    employee_list = data["theCompany"]["employees"]
    for employee in employee_list:
        if employee["email"] == email:
            num_hours = int(employee["currentPTOHrs"])
            num_days = float(num_hours / 8)
            return {
                "num_hours": num_hours,
                "num_days": num_days,
            }
    # f-string avoids a TypeError when email is None
    return_msg = f"Employee not found with email address {email}"
    return {"error": return_msg}


@@ -0,0 +1,78 @@
import json
from datetime import date, datetime
from pathlib import Path

import pandas
from dateutil.relativedelta import relativedelta


def future_pto_calc(args: dict) -> dict:
    file_path = (
        Path(__file__).resolve().parent.parent / "data" / "employee_pto_data.json"
    )
    if not file_path.exists():
        return {"error": "Data file not found."}
    start_date = datetime.strptime(args.get("start_date"), "%Y-%m-%d").date()
    end_date = datetime.strptime(args.get("end_date"), "%Y-%m-%d").date()
    email = args.get("email")
    # Next, set up the ability to calculate how much PTO will be added to the user's total by the start of the PTO request
    today = date.today()
    if today > start_date:
        return_msg = (
            "PTO start date " + args.get("start_date") + " cannot be in the past"
        )
        return {"error": return_msg}
    if end_date < start_date:
        return_msg = (
            "PTO end date "
            + args.get("end_date")
            + " must be after PTO start date "
            + args.get("start_date")
        )
        return {"error": return_msg}
    # Get the number of business days, and then business hours (assume 8 hr biz day), included in the PTO request
    biz_days_of_request = len(
        pandas.bdate_range(start=start_date, end=end_date, inclusive="both")
    )
    if biz_days_of_request == 0:
        return_msg = (
            "There are no business days between "
            + args.get("start_date")
            + " and "
            + args.get("end_date")
        )
        return {"error": return_msg}
    biz_hours_of_request = biz_days_of_request * 8
    # Assume PTO is added on the first of every month - month math compares rolling dates, so compare the PTO request with the first day of the current month.
    today_first_of_month = date(today.year, today.month, 1)
    time_difference = relativedelta(start_date, today_first_of_month)
    months_to_accrue = time_difference.years * 12 + time_difference.months
    # Use a context manager so the file handle is always closed
    with open(file_path) as data_file:
        data = json.load(data_file)
    employee_list = data["theCompany"]["employees"]
    enough_pto = False
    for employee in employee_list:
        if employee["email"] == email:
            current_pto_hours = int(employee["currentPTOHrs"])
            hrs_added_per_month = int(employee["hrsAddedPerMonth"])
            pto_available_at_start = current_pto_hours + (
                months_to_accrue * hrs_added_per_month
            )
            pto_hrs_remaining_after = pto_available_at_start - biz_hours_of_request
            if pto_hrs_remaining_after >= 0:
                enough_pto = True
            return {
                "enough_pto": enough_pto,
                "pto_hrs_remaining_after": str(pto_hrs_remaining_after),
            }
    # f-string avoids a TypeError when email is None
    return_msg = f"Employee not found with email address {email}"
    return {"error": return_msg}
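The accrual arithmetic in `future_pto_calc` can be checked by hand with a standard-library stand-in. The sketch below deliberately swaps `pandas.bdate_range` and `relativedelta` for plain `datetime` arithmetic, so it is an approximation of the same idea rather than the code in this commit; `business_days`, `whole_months_between`, and `pto_remaining` are hypothetical names, and public holidays are ignored.

```python
from datetime import date, timedelta


def business_days(start, end):
    """Count Monday-Friday dates in the inclusive range [start, end]."""
    span = (end - start).days + 1
    return sum(
        1 for n in range(max(span, 0)) if (start + timedelta(days=n)).weekday() < 5
    )


def whole_months_between(first_of_month, target):
    """Whole months from a first-of-month date up to target."""
    return (target.year - first_of_month.year) * 12 + (
        target.month - first_of_month.month
    )


def pto_remaining(current_hours, hours_per_month, today, start, end):
    """Hours left after the request, assuming accrual on the 1st of each month."""
    months = whole_months_between(date(today.year, today.month, 1), start)
    available = current_hours + months * hours_per_month
    return available - business_days(start, end) * 8
```

For example, an employee with 20 hours today (10 June 2025) who accrues 10 hours a month and requests Monday 4 August through Friday 8 August 2025 has 40 hours available by the start date and uses exactly 40.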

43
tools/list_agents.py Normal file

@@ -0,0 +1,43 @@
import os

import goals


def list_agents(args: dict) -> dict:
    goal_categories_start = os.getenv("GOAL_CATEGORIES")
    if goal_categories_start is None:
        goal_categories = ["all"]  # default to 'all' categories
    else:
        # handle extra spaces or non-lowercase; str methods return new strings,
        # so the normalized values must be kept
        goal_categories = [
            category.strip()
            for category in goal_categories_start.lower().split(",")
            if category.strip()
        ]
    # if multi-goal-mode, add agent_selection as a goal (defaults to True)
    if "agent_selection" not in goal_categories:
        first_goal_value = os.getenv("AGENT_GOAL")
        if (
            first_goal_value is None
            or first_goal_value.lower() == "goal_choose_agent_type"
        ):
            goal_categories.append("agent_selection")
    # always show goals labeled as "system," like the goal chooser
    if "system" not in goal_categories:
        goal_categories.append("system")
    agents = []
    if goals.goal_list is not None:
        for goal in goals.goal_list:
            # add to list if either
            # - all
            # - current goal's tag is in goal_categories
            if "all" in goal_categories or goal.category_tag in goal_categories:
                agents.append(
                    {
                        "agent_name": goal.agent_name,
                        "goal_id": goal.id,
                        "agent_description": goal.agent_friendly_description,
                    }
                )
    return {
        "agents": agents,
    }
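The category handling in `list_agents` depends on normalizing the comma-separated `GOAL_CATEGORIES` value before the membership checks run. A minimal sketch of that parsing as a pure function is shown here; `parse_goal_categories` is a hypothetical helper that takes the two environment values as arguments so it can be tested without touching `os.environ`.

```python
def parse_goal_categories(raw, agent_goal=None):
    """Normalize a GOAL_CATEGORIES-style string into a list of category tags."""
    if raw is None:
        categories = ["all"]  # no setting means every category
    else:
        # Strings are immutable, so the stripped/lowercased values must be kept.
        categories = [part.strip().lower() for part in raw.split(",") if part.strip()]
    # Multi-goal mode: add the agent chooser unless a specific goal is pinned.
    if "agent_selection" not in categories and (
        agent_goal is None or agent_goal.lower() == "goal_choose_agent_type"
    ):
        categories.append("agent_selection")
    # System goals are always listed.
    if "system" not in categories:
        categories.append("system")
    return categories
```

With this shape, a value such as `" HR, Finance "` yields clean tags, which a bare `split(",")` on the raw string would not.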


@@ -1,64 +1,264 @@
import os
import random
from datetime import date, datetime, timedelta
import requests
from datetime import datetime, timedelta
from dotenv import load_dotenv
PREMIER_LEAGUE_CLUBS_DATA = [
{"name": "Arsenal FC", "stadium": "Emirates Stadium"},
{"name": "Aston Villa FC", "stadium": "Villa Park"},
{"name": "AFC Bournemouth", "stadium": "Vitality Stadium"},
{"name": "Brentford FC", "stadium": "Gtech Community Stadium"},
{"name": "Brighton & Hove Albion FC", "stadium": "American Express Stadium"},
{"name": "Chelsea FC", "stadium": "Stamford Bridge"},
{"name": "Crystal Palace FC", "stadium": "Selhurst Park"},
{"name": "Everton FC", "stadium": "Goodison Park"},
{"name": "Fulham FC", "stadium": "Craven Cottage"},
{"name": "Ipswich Town FC", "stadium": "Portman Road"},
{"name": "Leicester City FC", "stadium": "King Power Stadium"},
{"name": "Liverpool FC", "stadium": "Anfield"},
{"name": "Manchester City FC", "stadium": "Etihad Stadium"},
{"name": "Manchester United FC", "stadium": "Old Trafford"},
{"name": "Newcastle United FC", "stadium": "St James' Park"},
{"name": "Nottingham Forest FC", "stadium": "City Ground"},
{"name": "Southampton FC", "stadium": "St Mary's Stadium"},
{"name": "Tottenham Hotspur FC", "stadium": "Tottenham Hotspur Stadium"},
{"name": "West Ham United FC", "stadium": "London Stadium"},
{"name": "Wolverhampton Wanderers FC", "stadium": "Molineux Stadium"},
]
def get_future_matches(
team_name: str,
all_clubs_data: list,
num_matches: int = 12,
date_from: date = None,
date_to: date = None,
) -> list:
"""Generate a set of future Premier League matches for ``team_name``.
This is a purely mocked schedule. It returns up to ``num_matches``
fixtures, respecting the ``date_from`` and ``date_to`` constraints.
Matches are typically on Saturdays or Sundays.
"""
matches = []
team_details = next((c for c in all_clubs_data if c["name"] == team_name), None)
if not team_details:
return []
opponents_pool = [c for c in all_clubs_data if c["name"] != team_name]
if not opponents_pool:
return []
# Determine the maximum number of matches we can generate based on opponents
# and the requested num_matches
num_actual_matches_to_generate = min(num_matches, len(opponents_pool))
if num_actual_matches_to_generate == 0:
return []
# Shuffle opponents once and pick them sequentially
random.shuffle(opponents_pool) # Shuffle in place
# Determine the initial Saturday for match week consideration
today_date = date.today()
# Default to next Saturday
current_match_week_saturday = today_date + timedelta(
days=(5 - today_date.weekday() + 7) % 7
)
# If today is Saturday and it's late evening, or if today is Sunday,
# advance to the following Saturday.
now_time = datetime.now().time()
if (
today_date.weekday() == 5
and now_time > datetime.strptime("20:00", "%H:%M").time()
) or (today_date.weekday() == 6):
current_match_week_saturday += timedelta(days=7)
# If date_from is specified, ensure our starting Saturday is not before it.
if date_from:
if current_match_week_saturday < date_from:
current_match_week_saturday = date_from
# Align current_match_week_saturday to be a Saturday on or after the potentially adjusted date
current_match_week_saturday += timedelta(
days=(5 - current_match_week_saturday.weekday() + 7) % 7
)
opponent_idx = 0
while len(matches) < num_actual_matches_to_generate and opponent_idx < len(
opponents_pool
):
# If the current week's Saturday is already past date_to, stop.
if date_to and current_match_week_saturday > date_to:
break
opponent_details = opponents_pool[opponent_idx]
is_saturday_game = random.choice([True, True, False])
actual_match_date = None
kick_off_time = ""
if is_saturday_game:
actual_match_date = current_match_week_saturday
kick_off_time = random.choice(["12:30", "15:00", "17:30"])
else: # Sunday game
actual_match_date = current_match_week_saturday + timedelta(days=1)
kick_off_time = random.choice(["14:00", "16:30"])
# Check if this specific match date is within the date_to constraint
if date_to and actual_match_date > date_to:
# If this game is too late, try the next week if possible.
# (This mainly affects Sunday games if Saturday was the last valid day)
current_match_week_saturday += timedelta(days=7)
continue # Skip adding this match, try next week.
match_datetime_gmt = (
f"{actual_match_date.strftime('%Y-%m-%d')} {kick_off_time} GMT"
)
is_home_match = random.choice([True, False])
if is_home_match:
team1_name = team_details["name"]
team2_name = opponent_details["name"]
stadium_name = team_details["stadium"]
else:
team1_name = opponent_details["name"]
team2_name = team_details["name"]
stadium_name = opponent_details["stadium"]
matches.append(
{
"team1": team1_name,
"team2": team2_name,
"stadium": stadium_name,
"datetime_gmt": match_datetime_gmt,
}
)
opponent_idx += 1
current_match_week_saturday += timedelta(
days=7
) # Advance to next week's Saturday
return matches
BASE_URL = "https://api.football-data.org/v4"
def search_fixtures(args: dict) -> dict:
load_dotenv(override=True)
api_key = os.getenv("FOOTBALL_DATA_API_KEY", "YOUR_DEFAULT_KEY")
api_key = os.getenv("FOOTBALL_DATA_API_KEY")
team_name = args.get("team")
date_from_str = args.get("date_from")
date_to_str = args.get("date_to")
headers = {"X-Auth-Token": api_key}
team_name = team_name.lower()
try:
date_from = datetime.strptime(date_from_str, "%Y-%m-%d")
date_to = datetime.strptime(date_to_str, "%Y-%m-%d")
except ValueError:
if not team_name:
return {"error": "Team name is required."}
parsed_date_from = None
if date_from_str:
try:
parsed_date_from = datetime.strptime(date_from_str, "%Y-%m-%d").date()
except ValueError:
return {
"error": f"Invalid date_from: '{date_from_str}'. Expected format YYYY-MM-DD."
}
parsed_date_to = None
if date_to_str:
try:
parsed_date_to = datetime.strptime(date_to_str, "%Y-%m-%d").date()
except ValueError:
return {
"error": f"Invalid date_to: '{date_to_str}'. Expected format YYYY-MM-DD."
}
if parsed_date_from and parsed_date_to and parsed_date_from > parsed_date_to:
return {"error": "date_from cannot be after date_to."}
# If no API key, fall back to mocked data
if not api_key:
# Use the parsed date objects (which can be None)
fixtures = get_future_matches(
team_name,
PREMIER_LEAGUE_CLUBS_DATA,
date_from=parsed_date_from,
date_to=parsed_date_to,
# num_matches can be passed explicitly if needed, otherwise defaults to 12
)
if not fixtures:
# Check if the team name itself was invalid, as get_future_matches returns [] for that too
team_details_check = next(
(c for c in PREMIER_LEAGUE_CLUBS_DATA if c["name"] == team_name), None
)
if not team_details_check:
return {"error": f"Team '{team_name}' not found in mocked data."}
# If team is valid, an empty fixtures list means no matches fit the criteria (e.g., date range)
return {"fixtures": fixtures}
# API Key is present, proceed with API logic
# The API requires both date_from and date_to
if not parsed_date_from or not parsed_date_to:
return {
"error": "Invalid date provided. Expected format YYYY-MM-DD for both date_from and date_to."
"error": "Both date_from and date_to (YYYY-MM-DD) are required for API search."
}
headers = {"X-Auth-Token": api_key}
# For API calls, team name matching might be case-insensitive or require specific handling
# The existing logic uses team_name.lower() for the API search path later.
# Fetch team ID
teams_response = requests.get(f"{BASE_URL}/competitions/PL/teams", headers=headers)
if teams_response.status_code != 200:
return {"error": "Failed to fetch teams data."}
return {
"error": f"Failed to fetch teams data from API (status {teams_response.status_code})."
}
teams_data = teams_response.json()
team_id = None
for team in teams_data["teams"]:
if team_name in team["name"].lower():
team_id = team["id"]
# Using lower() for comparison, assuming API team names might have varied casing
# or the input team_name might not be exact.
# The `ToolDefinition` lists exact names, so direct match might also be an option.
for team_api_data in teams_data.get("teams", []):
if team_name.lower() in team_api_data.get("name", "").lower():
team_id = team_api_data["id"]
break
if not team_id:
return {"error": "Team not found."}
return {"error": f"Team '{team_name}' not found via API."}
date_from_formatted = date_from.strftime("%Y-%m-%d")
date_to_formatted = date_to.strftime("%Y-%m-%d")
date_from_formatted = parsed_date_from.strftime("%Y-%m-%d")
date_to_formatted = parsed_date_to.strftime("%Y-%m-%d")
fixtures_url = f"{BASE_URL}/teams/{team_id}/matches?dateFrom={date_from_formatted}&dateTo={date_to_formatted}"
print(fixtures_url)
# print(fixtures_url) # Keep for debugging if necessary
fixtures_response = requests.get(fixtures_url, headers=headers)
if fixtures_response.status_code != 200:
return {"error": "Failed to fetch fixtures data."}
return {
"error": f"Failed to fetch fixtures data from API (status {fixtures_response.status_code})."
}
fixtures_data = fixtures_response.json()
matching_fixtures = []
for match in fixtures_data.get("matches", []):
match_datetime = datetime.strptime(match["utcDate"], "%Y-%m-%dT%H:%M:%SZ")
if match["competition"]["code"] == "PL":
# Ensure match datetime parsing is robust
try:
match_datetime_utc = datetime.strptime(
match["utcDate"], "%Y-%m-%dT%H:%M:%SZ"
)
except (ValueError, TypeError):
# Skip malformed match entries or log an error
continue
if match.get("competition", {}).get("code") == "PL":
matching_fixtures.append(
{
"date": match_datetime.strftime("%Y-%m-%d"),
"homeTeam": match["homeTeam"]["name"],
"awayTeam": match["awayTeam"]["name"],
"date": match_datetime_utc.strftime("%Y-%m-%d"),
"homeTeam": match.get("homeTeam", {}).get("name", "N/A"),
"awayTeam": match.get("awayTeam", {}).get("name", "N/A"),
}
)
@@ -82,34 +282,69 @@ def search_fixtures_example(args: dict) -> dict:
# Validate dates
try:
date_from = datetime.strptime(date_from_str, "%Y-%m-%d")
date_to = datetime.strptime(date_to_str, "%Y-%m-%d")
# Ensure date strings are not None before parsing
if date_from_str is None or date_to_str is None:
raise ValueError("Date strings cannot be None")
date_from_obj = datetime.strptime(date_from_str, "%Y-%m-%d")
date_to_obj = datetime.strptime(date_to_str, "%Y-%m-%d")
except ValueError:
return {
"error": "Invalid date provided. Expected format YYYY-MM-DD for both date_from and date_to."
}
# Calculate 3 reasonable fixture dates within the given range
date_range = (date_to - date_from).days
date_range = (date_to_obj - date_from_obj).days
if date_range < 0: # date_from is after date_to
return {"fixtures": []} # No fixtures possible
fixture_dates_timestamps = []
if date_range < 21:
# If range is less than 3 weeks, use evenly spaced fixtures
fixture_dates = [
date_from + timedelta(days=max(1, date_range // 3)),
date_from + timedelta(days=max(2, date_range * 2 // 3)),
date_to - timedelta(days=min(2, date_range // 4)),
]
# If range is less than 3 weeks, use evenly spaced fixtures if possible
if date_range >= 2: # Need at least some gap for 3 fixtures
fixture_dates_timestamps = [
date_from_obj
+ timedelta(days=max(0, date_range // 4)), # Closer to start
date_from_obj + timedelta(days=max(1, date_range // 2)), # Middle
date_to_obj - timedelta(days=max(0, date_range // 4)), # Closer to end
]
elif date_range == 1: # Only two days
fixture_dates_timestamps = [date_from_obj, date_to_obj]
elif date_range == 0: # Only one day
fixture_dates_timestamps = [date_from_obj]
else: # date_range is negative, handled above, or 0 (single day)
fixture_dates_timestamps = [date_from_obj] if date_range == 0 else []
else:
# Otherwise space them out by weeks
fixture_dates = [
date_from + timedelta(days=7),
date_from + timedelta(days=14),
date_to - timedelta(days=7),
]
# Otherwise space them out by weeks, ensuring they are within the bounds
d1 = date_from_obj + timedelta(days=7)
d2 = date_from_obj + timedelta(days=14)
d3 = date_to_obj - timedelta(days=7) # Potential third game from the end
# Ensure we only have 3 dates
fixture_dates = fixture_dates[:3]
fixture_dates_timestamps.append(d1)
if d2 <= date_to_obj and d2 > d1: # ensure d2 is valid and distinct
fixture_dates_timestamps.append(d2)
if (
d3 >= date_from_obj and d3 > d2 and d3 <= date_to_obj
): # ensure d3 is valid and distinct
fixture_dates_timestamps.append(d3)
elif (
d3 < date_from_obj and len(fixture_dates_timestamps) < 3
): # if d3 is too early, try using date_to_obj itself if distinct
if date_to_obj not in fixture_dates_timestamps:
fixture_dates_timestamps.append(date_to_obj)
# Ensure unique dates and sort, then take up to 3.
fixture_dates_timestamps = sorted(
list(
set(
f_date
for f_date in fixture_dates_timestamps
if date_from_obj <= f_date <= date_to_obj
)
)
)
fixture_dates_final = fixture_dates_timestamps[:3]
# Expanded pool of opponent teams to avoid team playing against itself
all_opponents = [
"Manchester United FC",
"Leicester City FC",
@@ -120,35 +355,35 @@ def search_fixtures_example(args: dict) -> dict:
"Tottenham Hotspur FC",
"West Ham United FC",
"Everton FC",
"Generic Opponent A",
"Generic Opponent B",
"Generic Opponent C", # Fallbacks
]
# Select opponents that aren't the same as the requested team
available_opponents = [
team for team in all_opponents if team.lower() != team_name.lower()
]
# Ensure we have at least 3 opponents
if len(available_opponents) < 3:
# Add generic opponents if needed
additional_teams = [f"Opponent {i} FC" for i in range(1, 4)]
available_opponents.extend(additional_teams)
# Ensure we have enough opponents for the number of fixtures we'll generate
if len(available_opponents) < len(fixture_dates_final):
needed = len(fixture_dates_final) - len(available_opponents)
for i in range(needed):
available_opponents.append(f"Placeholder Opponent {i+1}")
# Take only the first 3 opponents
opponents = available_opponents[:3]
opponents = available_opponents[: len(fixture_dates_final)]
# Generate fixtures - always exactly 3
fixtures = []
for i, fixture_date in enumerate(fixture_dates):
date_str = fixture_date.strftime("%Y-%m-%d")
# Alternate between home and away games
if i % 2 == 0:
fixtures.append(
{"date": date_str, "homeTeam": opponents[i], "awayTeam": team_name}
)
else:
for i, fixture_date_obj in enumerate(fixture_dates_final):
if i >= len(opponents): # Should not happen with the logic above
break
date_str = fixture_date_obj.strftime("%Y-%m-%d")
if i % 2 == 0: # Home game
fixtures.append(
{"date": date_str, "homeTeam": team_name, "awayTeam": opponents[i]}
)
else: # Away game
fixtures.append(
{"date": date_str, "homeTeam": opponents[i], "awayTeam": team_name}
)
return {"fixtures": fixtures}
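`get_future_matches` aligns its schedule to Saturdays with the expression `(5 - weekday + 7) % 7`. The sketch below isolates that step; `saturday_on_or_after` is a hypothetical helper name, and it assumes Python's `date.weekday()` convention where Monday is 0 and Saturday is 5.

```python
from datetime import date, timedelta


def saturday_on_or_after(day):
    """Return the first Saturday that falls on or after the given date."""
    # weekday(): Monday == 0 ... Saturday == 5, Sunday == 6
    days_ahead = (5 - day.weekday() + 7) % 7
    return day + timedelta(days=days_ahead)
```

A Saturday maps to itself, and a Sunday jumps six days forward to the following Saturday, which matches the realignment applied after `date_from` is taken into account.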


@@ -1,9 +1,11 @@
import os
import json
import http.client
from dotenv import load_dotenv
import json
import os
import random
import urllib.parse
from dotenv import load_dotenv
def search_airport(query: str) -> list:
"""
@@ -11,7 +13,7 @@ def search_airport(query: str) -> list:
"""
load_dotenv(override=True)
api_key = os.getenv("RAPIDAPI_KEY", "YOUR_DEFAULT_KEY")
api_host = os.getenv("RAPIDAPI_HOST", "sky-scrapper.p.rapidapi.com")
api_host = os.getenv("RAPIDAPI_HOST_FLIGHTS", "sky-scrapper.p.rapidapi.com")
conn = http.client.HTTPSConnection(api_host)
headers = {
@@ -73,7 +75,7 @@ def search_flights_real_api(
# Step 2: Call flight search with resolved codes
load_dotenv(override=True)
api_key = os.getenv("RAPIDAPI_KEY", "YOUR_DEFAULT_KEY")
api_host = os.getenv("RAPIDAPI_HOST", "sky-scrapper.p.rapidapi.com")
api_host = os.getenv("RAPIDAPI_HOST_FLIGHTS", "sky-scrapper.p.rapidapi.com")
conn = http.client.HTTPSConnection(api_host)
headers = {
@@ -173,45 +175,166 @@ def search_flights_real_api(
}
def generate_smart_flights(origin: str, destination: str) -> list:
"""
Generate realistic flight options with smart pricing based on origin and destination.
"""
# Common airlines for different regions
airlines_by_region = {
"domestic_us": [
{"name": "American Airlines", "code": "AA"},
{"name": "United Airlines", "code": "UA"},
{"name": "Delta Airlines", "code": "DL"},
{"name": "Southwest Airlines", "code": "WN"},
],
"us_international": [
{"name": "American Airlines", "code": "AA"},
{"name": "United Airlines", "code": "UA"},
{"name": "Delta Airlines", "code": "DL"},
{"name": "Virgin Atlantic", "code": "VS"},
],
"australia_nz": [
{"name": "Qantas", "code": "QF"},
{"name": "Jetstar", "code": "JQ"},
{"name": "Virgin Australia", "code": "VA"},
{"name": "Air New Zealand", "code": "NZ"},
],
"international": [
{"name": "American Airlines", "code": "AA"},
{"name": "United Airlines", "code": "UA"},
{"name": "Delta Airlines", "code": "DL"},
{"name": "Air New Zealand", "code": "NZ"},
{"name": "Qantas", "code": "QF"},
{"name": "Singapore Airlines", "code": "SQ"},
],
}
# Determine route type and base pricing
origin_lower = origin.lower()
dest_lower = destination.lower()
# Australia/NZ cities
anz_cities = [
"sydney",
"melbourne",
"syd",
"mel",
"auckland",
"akl",
"wellington",
"wlg",
"brisbane",
"bne",
"perth",
"per",
]
# US cities
us_cities = [
"los angeles",
"lax",
"san francisco",
"sfo",
"new york",
"nyc",
"jfk",
"chicago",
"ord",
"miami",
"mia",
]
is_origin_anz = any(city in origin_lower for city in anz_cities)
is_dest_anz = any(city in dest_lower for city in anz_cities)
is_origin_us = any(city in origin_lower for city in us_cities)
is_dest_us = any(city in dest_lower for city in us_cities)
# Determine airline pool and base price
if (is_origin_us and is_dest_anz) or (is_origin_anz and is_dest_us):
# Trans-Pacific routes
airline_pool = airlines_by_region["international"]
base_price = random.randint(1200, 1800)
elif is_origin_anz and is_dest_anz:
# Australia/NZ domestic
airline_pool = airlines_by_region["australia_nz"]
base_price = random.randint(300, 600)
elif is_origin_us and is_dest_us:
# US domestic
airline_pool = airlines_by_region["domestic_us"]
base_price = random.randint(200, 800)
else:
# General international
airline_pool = airlines_by_region["international"]
base_price = random.randint(800, 1500)
# Generate 3-4 flight options
num_flights = random.randint(3, 4)
results = []
used_airlines = set()
for i in range(num_flights):
# Pick unique airline
available_airlines = [a for a in airline_pool if a["name"] not in used_airlines]
if not available_airlines:
available_airlines = airline_pool # Reset if we run out
airline = random.choice(available_airlines)
used_airlines.add(airline["name"])
# Generate flight numbers
outbound_num = random.randint(100, 999)
return_num = random.randint(100, 999)
# Price variation (cheaper airlines get lower prices)
price_multiplier = 1.0
if "Southwest" in airline["name"] or "Jetstar" in airline["name"]:
price_multiplier = 0.7
elif "Virgin" in airline["name"]:
price_multiplier = 0.85
elif "Singapore" in airline["name"]:
price_multiplier = 1.2
# Add some random variation
price_variation = random.uniform(0.9, 1.1)
final_price = round(base_price * price_multiplier * price_variation, 2)
results.append(
{
"operating_carrier": airline["name"],
"outbound_flight_code": f"{airline['code']}{outbound_num}",
"price": final_price,
"return_flight_code": f"{airline['code']}{return_num}",
"return_operating_carrier": airline["name"],
}
)
# Sort by price
results.sort(key=lambda x: x["price"])
return results
def search_flights(args: dict) -> dict:
"""
Returns example flight search results in the requested JSON format.
Search for flights. Uses real API if RAPIDAPI_KEY is available, otherwise generates smart mock data.
"""
load_dotenv(override=True)
api_key = os.getenv("RAPIDAPI_KEY")
origin = args.get("origin")
destination = args.get("destination")
if not origin or not destination:
return {"error": "Both origin and destination are required"}
# If API key is available, use the real API
if api_key and api_key != "YOUR_DEFAULT_KEY":
return search_flights_real_api(args)
# Otherwise, generate smart mock data
results = generate_smart_flights(origin, destination)
return {
"currency": "USD",
"destination": f"{destination}",
"origin": f"{origin}",
"results": [
{
"operating_carrier": "American Airlines",
"outbound_flight_code": "AA203",
"price": 1262.51,
"return_flight_code": "AA202",
"return_operating_carrier": "American Airlines",
},
{
"operating_carrier": "Air New Zealand",
"outbound_flight_code": "NZ488",
"price": 1396.00,
"return_flight_code": "NZ527",
"return_operating_carrier": "Air New Zealand",
},
{
"operating_carrier": "United Airlines",
"outbound_flight_code": "UA100",
"price": 1500.00,
"return_flight_code": "UA101",
"return_operating_carrier": "United Airlines",
},
{
"operating_carrier": "Delta Airlines",
"outbound_flight_code": "DL200",
"price": 1600.00,
"return_flight_code": "DL201",
"return_operating_carrier": "Delta Airlines",
},
],
"destination": destination,
"origin": origin,
"results": results,
}
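The route classification in `generate_smart_flights` relies on substring checks against short keyword lists. The sketch below reproduces only that decision with trimmed, assumed keyword tuples; `classify_route` and its return labels are hypothetical and exist to make the branching easy to test.

```python
# Trimmed keyword lists for illustration; the commit uses longer ones.
ANZ_KEYWORDS = ("sydney", "syd", "melbourne", "mel", "auckland", "akl")
US_KEYWORDS = ("los angeles", "lax", "new york", "jfk", "chicago", "ord")


def classify_route(origin, destination):
    """Label a route using the same substring test as the mock generator."""
    origin_lower, dest_lower = origin.lower(), destination.lower()
    origin_anz = any(key in origin_lower for key in ANZ_KEYWORDS)
    dest_anz = any(key in dest_lower for key in ANZ_KEYWORDS)
    origin_us = any(key in origin_lower for key in US_KEYWORDS)
    dest_us = any(key in dest_lower for key in US_KEYWORDS)
    if (origin_us and dest_anz) or (origin_anz and dest_us):
        return "trans_pacific"
    if origin_anz and dest_anz:
        return "australia_nz"
    if origin_us and dest_us:
        return "domestic_us"
    return "other_international"
```

One consequence of substring matching is that short airport codes can match unrelated city names: "Oxford" contains "ord", so it is treated as a US origin here. Exact-token matching would avoid that, at the cost of handling free-form input less leniently.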


@@ -1,8 +1,65 @@
from models.tool_definitions import ToolDefinition, ToolArgument
from typing import Dict, List
from models.tool_definitions import ToolArgument, ToolDefinition
# ----- System tools -----
list_agents_tool = ToolDefinition(
name="ListAgents",
description="List available agents to interact with, pulled from goal_registry. ",
arguments=[],
)
change_goal_tool = ToolDefinition(
name="ChangeGoal",
description="Change the goal of the active agent. ",
arguments=[
ToolArgument(
name="goalID",
type="string",
description="Which goal to change to",
),
],
)
give_hint_tool = ToolDefinition(
name="GiveHint",
description="Give a hint to the user regarding the location of the pirate treasure. Use the previous conversation to determine hint_total; it should initially be 0.",
arguments=[
ToolArgument(
name="hint_total",
type="number",
description="How many hints have been given",
),
],
)
guess_location_tool = ToolDefinition(
name="GuessLocation",
description="Allow the user to guess the location (in the form of an address) of the pirate treasure. ",
arguments=[
ToolArgument(
name="address",
type="string",
description="Address at which the user is guessing the treasure is located",
),
ToolArgument(
name="city",
type="string",
description="City at which the user is guessing the treasure is located",
),
ToolArgument(
name="state",
type="string",
description="State at which the user is guessing the treasure is located",
),
],
)
# ----- Travel use cases tools -----
search_flights_tool = ToolDefinition(
name="SearchFlights",
description="Search for return flights from an origin to a destination within a date range (dateDepart, dateReturn).",
description="Search for return flights from an origin to a destination within a date range (dateDepart, dateReturn). "
"You are allowed to suggest dates from the conversation history, but ALWAYS ask the user if ok.",
arguments=[
ToolArgument(
name="origin",
@@ -24,12 +81,18 @@ search_flights_tool = ToolDefinition(
type="ISO8601",
description="End of date range in human readable format, when you want to return",
),
ToolArgument(
name="userConfirmation",
type="string",
description="Indication of the user's desire to search flights, and to confirm the details "
+ "before moving on to the next step",
),
],
)
search_trains_tool = ToolDefinition(
name="SearchTrains",
description="Search for trains between two English cities. Returns a list of train information for the user to choose from.",
description="Search for trains between two English cities. Returns a list of train information for the user to choose from. Present the list to the user.",
arguments=[
ToolArgument(
name="origin",
@@ -63,6 +126,11 @@ book_trains_tool = ToolDefinition(
type="string",
description="The IDs of the trains to book, comma separated",
),
ToolArgument(
name="userConfirmation",
type="string",
description="Indication of user's desire to book train tickets",
),
],
)
@@ -80,12 +148,17 @@ create_invoice_tool = ToolDefinition(
type="string",
description="A description of the item details to be invoiced, inferred from the conversation history.",
),
ToolArgument(
name="userConfirmation",
type="string",
description="Indication of user's desire to create an invoice",
),
],
)
search_fixtures_tool = ToolDefinition(
name="SearchFixtures",
description="Search for upcoming fixtures for a given team within a date range inferred from the user's description. Valid teams this 24/25 season are Arsenal FC, Aston Villa FC, AFC Bournemouth, Brentford FC, Brighton & Hove Albion FC, Chelsea FC, Crystal Palace FC, Everton FC, Fulham FC, Ipswich Town FC, Leicester City FC, Liverpool FC, Manchester City FC, Manchester United FC, Newcastle United FC, Nottingham Forest FC, Southampton FC, Tottenham Hotspur FC, West Ham United FC, Wolverhampton Wanderers FC",
description="Search for upcoming fixtures for a given team within a date range inferred from the user's description. Ignore valid premier league dates. Valid teams this season are Arsenal FC, Aston Villa FC, AFC Bournemouth, Brentford FC, Brighton & Hove Albion FC, Chelsea FC, Crystal Palace FC, Everton FC, Fulham FC, Ipswich Town FC, Leicester City FC, Liverpool FC, Manchester City FC, Manchester United FC, Newcastle United FC, Nottingham Forest FC, Southampton FC, Tottenham Hotspur FC, West Ham United FC, Wolverhampton Wanderers FC",
arguments=[
ToolArgument(
name="team",
@@ -124,3 +197,277 @@ find_events_tool = ToolDefinition(
),
],
)
# ----- HR use cases tools -----
current_pto_tool = ToolDefinition(
name="CurrentPTO",
description="Find how much PTO a user currently has accrued. "
"Returns the number of hours and (calculated) number of days of PTO. ",
arguments=[
ToolArgument(
name="email",
type="string",
description="email address of user",
),
],
)
future_pto_calc_tool = ToolDefinition(
name="FuturePTOCalc",
description="Calculate if the user will have enough PTO as of their proposed date to accommodate the request. The proposed start and end dates should be in the future. "
"Returns a boolean enough_pto and how many hours of PTO they will have remaining if they take the proposed dates. ",
arguments=[
ToolArgument(
name="start_date",
type="string",
description="Start date of proposed PTO, sent in the form yyyy-mm-dd",
),
ToolArgument(
name="end_date",
type="string",
description="End date of proposed PTO, sent in the form yyyy-mm-dd",
),
ToolArgument(
name="email",
type="string",
description="email address of user",
),
],
)
book_pto_tool = ToolDefinition(
name="BookPTO",
description="Book PTO start and end date. Either 1) makes calendar item, or 2) sends calendar invite to self and boss? "
"Returns a success indicator. ",
arguments=[
ToolArgument(
name="start_date",
type="string",
description="Start date of proposed PTO, sent in the form yyyy-mm-dd",
),
ToolArgument(
name="end_date",
type="string",
description="End date of proposed PTO, sent in the form yyyy-mm-dd",
),
ToolArgument(
name="email",
type="string",
description="Email address of user, used to look up current PTO",
),
ToolArgument(
name="userConfirmation",
type="string",
description="Indication of user's desire to book PTO",
),
],
)
paycheck_bank_integration_status_check = ToolDefinition(
name="CheckPayBankStatus",
description="Check status of Bank Integration for Paychecks. "
"Returns the status of the bank integration, connected or disconnected. ",
arguments=[
ToolArgument(
name="email",
type="string",
description="email address of user",
),
],
)
# ----- Financial use cases tools -----
financial_check_account_is_valid = ToolDefinition(
name="FinCheckAccountIsValid",
description="Check if an account is valid by email address or account ID. "
"Returns the account status, valid or invalid. ",
arguments=[
ToolArgument(
name="email",
type="string",
description="email address of user",
),
ToolArgument(
name="account_id",
type="string",
description="account ID of user",
),
],
)
financial_get_account_balances = ToolDefinition(
name="FinCheckAccountBalance",
description="Get account balance for your accounts. "
"Returns the account balances of your accounts. ",
arguments=[
ToolArgument(
name="email_address_or_account_ID",
type="string",
description="email address or account ID of user",
),
],
)
financial_move_money = ToolDefinition(
name="FinMoveMoney",
description="Send money from one account to another under the same account ID (e.g. checking to savings). "
"Returns the status of the order and the new balances in each account. ",
arguments=[
ToolArgument(
name="email_address_or_account_ID",
type="string",
description="email address or account ID of user (you will need both to find the account)",
),
ToolArgument(
name="accounttype",
type="string",
description="account type, such as checking or savings",
),
ToolArgument(
name="amount",
type="string",
description="amount to move in the order",
),
ToolArgument(
name="destinationaccount",
type="string",
description="account to move the money to (e.g. checking or savings)",
),
ToolArgument(
name="userConfirmation",
type="string",
description="Indication of user's desire to move money",
),
],
)
financial_submit_loan_approval = ToolDefinition(
name="FinCheckAccountSubmitLoanApproval",
description="Submit a loan application. " "Returns the loan status. ",
arguments=[
ToolArgument(
name="email_address_or_account_ID",
type="string",
description="email address or account ID of user",
),
ToolArgument(
name="amount",
type="string",
description="amount requested for the loan",
),
],
)
# ----- ECommerce Use Case Tools -----
ecomm_list_orders = ToolDefinition(
name="ListOrders",
description="Get all orders for a certain email address.",
arguments=[
ToolArgument(
name="email_address",
type="string",
description="Email address of user by which to find orders",
),
],
)
ecomm_get_order = ToolDefinition(
name="GetOrder",
description="Get information about an order by order ID.",
arguments=[
ToolArgument(
name="order_id",
type="string",
description="ID of order to determine status of",
),
],
)
ecomm_track_package = ToolDefinition(
name="TrackPackage",
description="Get tracking information for a package by shipping provider and tracking ID",
arguments=[
ToolArgument(
name="tracking_id",
type="string",
description="ID of package to track",
),
ToolArgument(
name="userConfirmation",
type="string",
description="Indication of user's desire to get package tracking information",
),
],
)
# ----- Food Ordering Use Case Tools -----
food_add_to_cart_tool = ToolDefinition(
name="AddToCart",
description="Add a menu item to the customer's cart using item details from Stripe.",
arguments=[
ToolArgument(
name="customer_email",
type="string",
description="Email address of the customer",
),
ToolArgument(
name="item_name",
type="string",
description="Name of the menu item (e.g., 'Margherita Pizza', 'Caesar Salad')",
),
ToolArgument(
name="item_price",
type="number",
description="Price of the item in dollars (e.g., 14.99)",
),
ToolArgument(
name="quantity",
type="number",
description="Quantity of the item to add (defaults to 1)",
),
ToolArgument(
name="stripe_product_id",
type="string",
description="Stripe product ID for reference (optional)",
),
],
)
# MCP Integration Functions
def create_mcp_tool_definitions(
mcp_tools_info: Dict[str, Dict],
) -> List[ToolDefinition]:
"""Convert MCP tool info to ToolDefinition objects"""
tool_definitions = []
for tool_name, tool_info in mcp_tools_info.items():
# Extract input schema properties
input_schema = tool_info.get("inputSchema", {})
properties = (
input_schema.get("properties", {}) if isinstance(input_schema, dict) else {}
)
# Convert properties to ToolArgument objects
arguments = []
for param_name, param_info in properties.items():
if isinstance(param_info, dict):
arguments.append(
ToolArgument(
name=param_name,
type=param_info.get("type", "string"),
description=param_info.get("description", ""),
)
)
# Create ToolDefinition
tool_def = ToolDefinition(
name=tool_info["name"],
description=tool_info.get("description", ""),
arguments=arguments,
)
tool_definitions.append(tool_def)
return tool_definitions
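
As a rough illustration of what `create_mcp_tool_definitions` produces, the sketch below runs the same conversion on a made-up MCP tool listing. The `ToolArgument` and `ToolDefinition` dataclasses are simplified stand-ins for the ones in `models.tool_definitions` (their real fields may differ), and the `list_products` tool and its schema are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Simplified stand-ins for models.tool_definitions (assumed shape, not the real classes).
@dataclass
class ToolArgument:
    name: str
    type: str
    description: str


@dataclass
class ToolDefinition:
    name: str
    description: str
    arguments: List[ToolArgument] = field(default_factory=list)


def create_mcp_tool_definitions(mcp_tools_info: Dict[str, Dict]) -> List[ToolDefinition]:
    """Convert MCP tool info to ToolDefinition objects (same logic as the function above)."""
    tool_definitions = []
    for tool_name, tool_info in mcp_tools_info.items():
        input_schema = tool_info.get("inputSchema", {})
        properties = (
            input_schema.get("properties", {}) if isinstance(input_schema, dict) else {}
        )
        arguments = [
            ToolArgument(
                name=param_name,
                type=param_info.get("type", "string"),
                description=param_info.get("description", ""),
            )
            for param_name, param_info in properties.items()
            if isinstance(param_info, dict)  # non-dict schema entries are skipped
        ]
        tool_definitions.append(
            ToolDefinition(
                name=tool_info["name"],
                description=tool_info.get("description", ""),
                arguments=arguments,
            )
        )
    return tool_definitions


# Hypothetical MCP listing; the tool name and schema are made up for illustration.
sample_tools = {
    "list_products": {
        "name": "list_products",
        "description": "List products in the catalog",
        "inputSchema": {
            "type": "object",
            "properties": {
                "limit": {"type": "number", "description": "Maximum items to return"},
                "cursor": {},  # no type given, so it falls back to "string"
                "malformed": "not-a-dict",  # ignored by the conversion
            },
        },
    }
}

definitions = create_mcp_tool_definitions(sample_tools)
for definition in definitions:
    print(definition.name, [(a.name, a.type) for a in definition.arguments])
```

Note that the definition name comes from `tool_info["name"]` rather than from the dictionary key, so an entry without a `name` field would raise a `KeyError`.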


@@ -0,0 +1,7 @@
import shared.config
def transfer_control(args: dict) -> dict:
return {
"new_goal": shared.config.AGENT_GOAL,
}


@@ -1,33 +1,43 @@
from collections import deque
from datetime import timedelta
from typing import Dict, Any, Union, List, Optional, Deque, TypedDict
from typing import Any, Deque, Dict, List, Optional, TypedDict, Union
from temporalio.common import RetryPolicy
from temporalio import workflow
from temporalio.common import RetryPolicy
from models.data_types import ConversationHistory, NextStep, ValidationInput
from workflows.workflow_helpers import LLM_ACTIVITY_START_TO_CLOSE_TIMEOUT, \
LLM_ACTIVITY_SCHEDULE_TO_CLOSE_TIMEOUT
from models.data_types import (
ConversationHistory,
EnvLookupInput,
EnvLookupOutput,
NextStep,
ValidationInput,
)
from models.tool_definitions import AgentGoal
from workflows import workflow_helpers as helpers
from workflows.workflow_helpers import (
LLM_ACTIVITY_SCHEDULE_TO_CLOSE_TIMEOUT,
LLM_ACTIVITY_START_TO_CLOSE_TIMEOUT,
)
with workflow.unsafe.imports_passed_through():
from activities.tool_activities import ToolActivities
from prompts.agent_prompt_generators import (
generate_genai_prompt
)
from models.data_types import (
CombinedInput,
ToolPromptInput,
)
from activities.tool_activities import ToolActivities, mcp_list_tools
from goals import goal_list
from models.data_types import CombinedInput, ToolPromptInput
from prompts.agent_prompt_generators import generate_genai_prompt
from tools.tool_registry import create_mcp_tool_definitions
# Constants
MAX_TURNS_BEFORE_CONTINUE = 250
# ToolData as part of the workflow is what's accessible to the UI - see LLMResponse.jsx for example
class ToolData(TypedDict, total=False):
next: NextStep
tool: str
args: Dict[str, Any]
response: str
force_confirm: bool
@workflow.defn
class AgentGoalWorkflow:
@@ -39,15 +49,34 @@ class AgentGoalWorkflow:
self.conversation_summary: Optional[str] = None
self.chat_ended: bool = False
self.tool_data: Optional[ToolData] = None
self.confirm: bool = False
self.confirmed: bool = (
False # indicates that we have confirmation to proceed to run tool
)
self.tool_results: List[Dict[str, Any]] = []
self.goal: AgentGoal = {"tools": []}
self.show_tool_args_confirmation: bool = (
True # set from env file in activity lookup_wf_env_settings
)
self.multi_goal_mode: bool = (
False # set from env file in activity lookup_wf_env_settings
)
self.mcp_tools_info: Optional[dict] = None # stores complete MCP tools result
# see ../api/main.py#temporal_client.start_workflow() for how the input parameters are set
@workflow.run
async def run(self, combined_input: CombinedInput) -> str:
"""Main workflow execution method."""
# setup phase, starts with blank tool_params and agent_goal prompt as defined in tools/goal_registry.py
params = combined_input.tool_params
agent_goal = combined_input.agent_goal
self.goal = combined_input.agent_goal
await self.lookup_wf_env_settings(combined_input)
# If the goal has an MCP server definition, dynamically load MCP tools
if self.goal.mcp_server_definition:
await self.load_mcp_tools()
# add message from sample conversation provided in tools/goal_registry.py, if it exists
if params and params.conversation_summary:
self.add_message("conversation_summary", params.conversation_summary)
self.conversation_summary = params.conversation_summary
@@ -58,44 +87,46 @@ class AgentGoalWorkflow:
waiting_for_confirm = False
current_tool = None
# This is the main interactive loop. Main responsibilities:
# - Selecting and changing goals as directed by the user
# - reacting to user input (from signals)
# - validating user input to make sure it makes sense with the current goal and tools
# - calling the LLM through activities to determine next steps and prompts
# - executing the selected tools via activities
while True:
# wait indefinitely for input from signals - user_prompt, end_chat, or confirm as defined below
await workflow.wait_condition(
lambda: bool(self.prompt_queue) or self.chat_ended or self.confirm
lambda: bool(self.prompt_queue) or self.chat_ended or self.confirmed
)
if self.chat_ended:
workflow.logger.info("Chat ended.")
# handle chat should end. When chat ends, push conversation history to workflow results.
if self.chat_should_end():
return f"{self.conversation_history}"
if self.confirm and waiting_for_confirm and current_tool and self.tool_data:
self.confirm = False
waiting_for_confirm = False
confirmed_tool_data = self.tool_data.copy()
confirmed_tool_data["next"] = "user_confirmed_tool_run"
self.add_message("user_confirmed_tool_run", confirmed_tool_data)
await helpers.handle_tool_execution(
current_tool,
self.tool_data,
self.tool_results,
self.add_message,
self.prompt_queue
)
# Execute the tool
if self.ready_for_tool_execution(waiting_for_confirm, current_tool):
waiting_for_confirm = await self.execute_tool(current_tool)
continue
# process forward on the prompt queue if any
if self.prompt_queue:
# get most recent prompt
prompt = self.prompt_queue.popleft()
if not prompt.startswith("###"):
workflow.logger.info(
f"workflow step: processing message on the prompt queue, message is {prompt}"
)
# Validate user-provided prompts
if self.is_user_prompt(prompt):
self.add_message("user", prompt)
# Validate the prompt before proceeding
validation_input = ValidationInput(
prompt=prompt,
conversation_history=self.conversation_history,
agent_goal=agent_goal,
agent_goal=self.goal,
)
validation_result = await workflow.execute_activity(
validation_result = await workflow.execute_activity_method(
ToolActivities.agent_validatePrompt,
args=[validation_input],
schedule_to_close_timeout=LLM_ACTIVITY_SCHEDULE_TO_CLOSE_TIMEOUT,
@@ -105,6 +136,7 @@ class AgentGoalWorkflow:
),
)
# If validation fails, give that feedback to the user (i.e., "your words make no sense, puny human") and end this iteration of processing
if not validation_result.validationResult:
workflow.logger.warning(
f"Prompt validation failed: {validation_result.validationFailedReason}"
@@ -114,17 +146,21 @@ class AgentGoalWorkflow:
)
continue
# Proceed with generating the context and prompt
# If valid, proceed with generating the context and prompt
context_instructions = generate_genai_prompt(
agent_goal, self.conversation_history, self.tool_data
agent_goal=self.goal,
conversation_history=self.conversation_history,
multi_goal_mode=self.multi_goal_mode,
raw_json=self.tool_data,
mcp_tools_info=self.mcp_tools_info,
)
prompt_input = ToolPromptInput(
prompt=prompt,
context_instructions=context_instructions,
prompt=prompt, context_instructions=context_instructions
)
tool_data = await workflow.execute_activity(
# connect to LLM and execute to get next steps
tool_data = await workflow.execute_activity_method(
ToolActivities.agent_toolPlanner,
prompt_input,
schedule_to_close_timeout=LLM_ACTIVITY_SCHEDULE_TO_CLOSE_TIMEOUT,
@@ -133,61 +169,110 @@ class AgentGoalWorkflow:
initial_interval=timedelta(seconds=5), backoff_coefficient=1
),
)
tool_data["force_confirm"] = self.show_tool_args_confirmation
self.tool_data = tool_data
# process the tool as dictated by the prompt response - what to do next, and with which tool
next_step = tool_data.get("next")
current_tool = tool_data.get("tool")
workflow.logger.info(
f"next_step: {next_step}, current tool is {current_tool}"
)
# make sure we're ready to run the tool & have everything we need
if next_step == "confirm" and current_tool:
args = tool_data.get("args", {})
if await helpers.handle_missing_args(current_tool, args, tool_data, self.prompt_queue):
# if we're missing arguments, ask for them
if await helpers.handle_missing_args(
current_tool, args, tool_data, self.prompt_queue
):
continue
waiting_for_confirm = True
self.confirm = False
workflow.logger.info("Waiting for user confirm signal...")
# We have needed arguments, if we want to force the user to confirm, set that up
if self.show_tool_args_confirmation:
self.confirmed = False # set that we're not confirmed
workflow.logger.info("Waiting for user confirm signal...")
# if we have all needed arguments (handled above) and not holding for a debugging confirm, proceed:
else:
self.confirmed = True
# else if the next step is to pick a new goal, set that to be the goal
elif next_step == "pick-new-goal":
workflow.logger.info("All steps completed. Resetting goal.")
self.change_goal("goal_choose_agent_type")
# else if the next step is to be done with the conversation such as if the user requests it via asking to "end conversation"
elif next_step == "done":
workflow.logger.info("All steps completed. Exiting workflow.")
self.add_message("agent", tool_data)
# here we could send conversation to AI for analysis
# end the workflow
return str(self.conversation_history)
self.add_message("agent", tool_data)
await helpers.continue_as_new_if_needed(
self.conversation_history,
self.prompt_queue,
agent_goal,
self.goal,
MAX_TURNS_BEFORE_CONTINUE,
self.add_message
self.add_message,
)
# Signal that comes from api/main.py via a post to /send-prompt
@workflow.signal
async def user_prompt(self, prompt: str) -> None:
"""Signal handler for receiving user prompts."""
workflow.logger.info(f"signal received: user_prompt, prompt is {prompt}")
if self.chat_ended:
workflow.logger.warn(f"Message dropped due to chat closed: {prompt}")
workflow.logger.info(f"Message dropped due to chat closed: {prompt}")
return
self.prompt_queue.append(prompt)
# Signal that comes from api/main.py via a post to /confirm
@workflow.signal
async def confirm(self) -> None:
"""Signal handler for user confirmation of tool execution."""
workflow.logger.info("Received user confirmation")
self.confirm = True
workflow.logger.info("Received user signal: confirmation")
self.confirmed = True
# Signal that comes from api/main.py via a post to /end-chat
@workflow.signal
async def end_chat(self) -> None:
"""Signal handler for ending the chat session."""
workflow.logger.info("signal received: end_chat")
self.chat_ended = True
# Signal that can be sent from Temporal Workflow UI to enable debugging confirm and override .env setting
@workflow.signal
async def enable_debugging_confirm(self) -> None:
"""Signal handler for enabling debugging confirm UI & associated logic."""
workflow.logger.info("signal received: enable_debugging_confirm")
self.show_tool_args_confirmation = True
# Signal that can be sent from Temporal Workflow UI to disable debugging confirm and override .env setting
@workflow.signal
async def disable_debugging_confirm(self) -> None:
"""Signal handler for disabling debugging confirm UI & associated logic."""
workflow.logger.info("signal received: disable_debugging_confirm")
self.show_tool_args_confirmation = False
@workflow.query
def get_conversation_history(self) -> ConversationHistory:
"""Query handler to retrieve the full conversation history."""
return self.conversation_history
@workflow.query
def get_agent_goal(self) -> AgentGoal:
"""Query handler to retrieve the current goal of the agent."""
return self.goal
@workflow.query
def get_summary_from_history(self) -> Optional[str]:
"""Query handler to retrieve the conversation summary if available.
"""Query handler to retrieve the conversation summary if available.
Used only for continue as new of the workflow."""
return self.conversation_summary
@@ -212,3 +297,152 @@ class AgentGoalWorkflow:
self.conversation_history["messages"].append(
{"actor": actor, "response": response}
)
def change_goal(self, goal: str) -> None:
"""Change the goal (usually on request of the user).
Args:
goal: ID of the goal to change to
"""
if goal is not None:
for listed_goal in goal_list:
if listed_goal.id == goal:
self.goal = listed_goal
workflow.logger.info("Changed goal to " + goal)
if goal is None:
workflow.logger.warning(
"Goal not set after goal reset, probably bad."
) # if this happens, there's probably a problem with the goal list
# workflow function that defines if chat should end
def chat_should_end(self) -> bool:
if self.chat_ended:
workflow.logger.info("Chat-end signal received. Chat ending.")
return True
else:
return False
# define if we're ready for tool execution
def ready_for_tool_execution(
self, waiting_for_confirm: bool, current_tool: Any
) -> bool:
if self.confirmed and waiting_for_confirm and current_tool and self.tool_data:
return True
else:
return False
# LLM-tagged prompts start with "###"
# all others are from the user
def is_user_prompt(self, prompt) -> bool:
if prompt.startswith("###"):
return False
else:
return True
# look up env settings in an activity so they're part of history
async def lookup_wf_env_settings(self, combined_input: CombinedInput) -> None:
env_lookup_input = EnvLookupInput(
show_confirm_env_var_name="SHOW_CONFIRM",
show_confirm_default=True,
)
env_output: EnvLookupOutput = await workflow.execute_activity_method(
ToolActivities.get_wf_env_vars,
env_lookup_input,
start_to_close_timeout=LLM_ACTIVITY_START_TO_CLOSE_TIMEOUT,
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=5), backoff_coefficient=1
),
)
self.show_tool_args_confirmation = env_output.show_confirm
self.multi_goal_mode = env_output.multi_goal_mode
# execute the tool - return False if we're not waiting for confirm anymore (always the case if it works successfully)
async def execute_tool(self, current_tool: str) -> bool:
workflow.logger.info(
f"workflow step: user has confirmed, executing the tool {current_tool}"
)
self.confirmed = False
waiting_for_confirm = False
confirmed_tool_data = self.tool_data.copy()
confirmed_tool_data["next"] = "user_confirmed_tool_run"
self.add_message("user_confirmed_tool_run", confirmed_tool_data)
# execute the tool by key as defined in tools/__init__.py
await helpers.handle_tool_execution(
current_tool,
self.tool_data,
self.tool_results,
self.add_message,
self.prompt_queue,
self.goal,
)
# set new goal if we should
if len(self.tool_results) > 0:
if (
"ChangeGoal" in self.tool_results[-1].values()
and "new_goal" in self.tool_results[-1].keys()
):
new_goal = self.tool_results[-1].get("new_goal")
self.change_goal(new_goal)
elif (
"ListAgents" in self.tool_results[-1].values()
and self.goal.id != "goal_choose_agent_type"
):
self.change_goal("goal_choose_agent_type")
return waiting_for_confirm
# debugging helper - drop this in various places in the workflow to get status
# also don't forget you can look at the workflow itself and do queries if you want
def print_useful_workflow_vars(self, status_or_step: str) -> None:
print(f"***{status_or_step}:***")
if self.goal:
print(f"current goal: {self.goal.id}")
if self.tool_data:
print(f"force confirm? {self.tool_data['force_confirm']}")
print(f"next step: {self.tool_data.get('next')}")
print(f"current_tool: {self.tool_data.get('tool')}")
else:
print("no tool data initialized yet")
print(f"self.confirmed: {self.confirmed}")
async def load_mcp_tools(self) -> None:
"""Load MCP tools dynamically from the server definition"""
if not self.goal.mcp_server_definition:
return
workflow.logger.info(
f"Loading MCP tools from server: {self.goal.mcp_server_definition.name}"
)
# Get the list of tools to include (if specified)
include_tools = self.goal.mcp_server_definition.included_tools
# Call the MCP list tools activity
mcp_tools_result = await workflow.execute_activity(
mcp_list_tools,
args=[self.goal.mcp_server_definition, include_tools],
start_to_close_timeout=LLM_ACTIVITY_START_TO_CLOSE_TIMEOUT,
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=5), backoff_coefficient=1
),
summary=f"{self.goal.mcp_server_definition.name}",
)
if mcp_tools_result.get("success", False):
tools_info = mcp_tools_result.get("tools", {})
workflow.logger.info(f"Successfully loaded {len(tools_info)} MCP tools")
# Store complete MCP tools result for use in prompt generation
self.mcp_tools_info = mcp_tools_result
# Convert MCP tools to ToolDefinition objects and add to goal
mcp_tool_definitions = create_mcp_tool_definitions(tools_info)
self.goal.tools.extend(mcp_tool_definitions)
workflow.logger.info(f"Added {len(mcp_tool_definitions)} MCP tools to goal")
else:
error_msg = mcp_tools_result.get("error", "Unknown error")
workflow.logger.error(f"Failed to load MCP tools: {error_msg}")
# Continue execution without MCP tools
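
The interplay between `confirmed`, `waiting_for_confirm`, and `show_tool_args_confirmation` in the workflow above can be hard to follow from the diff alone. The sketch below models just those flags in plain Python, with no Temporal involved. `ConfirmationGate` and its method names are hypothetical and exist only for this illustration; the real workflow spreads this logic across `run`, the `confirm` signal handler, and `ready_for_tool_execution`.

```python
from typing import Any, Dict, Optional


class ConfirmationGate:
    """Plain-Python model of the workflow's confirmation flags (illustrative only)."""

    def __init__(self, show_tool_args_confirmation: bool = True) -> None:
        self.show_tool_args_confirmation = show_tool_args_confirmation
        self.confirmed = False
        self.waiting_for_confirm = False
        self.current_tool: Optional[str] = None
        self.tool_data: Optional[Dict[str, Any]] = None

    def on_planner_result(self, tool_data: Dict[str, Any]) -> None:
        """Mirror the `next_step == "confirm"` branch of the main loop."""
        self.tool_data = tool_data
        self.current_tool = tool_data.get("tool")
        if tool_data.get("next") == "confirm" and self.current_tool:
            self.waiting_for_confirm = True
            # Auto-confirm unless the user must approve the tool arguments.
            self.confirmed = not self.show_tool_args_confirmation

    def on_confirm_signal(self) -> None:
        """Mirror the `confirm` signal handler."""
        self.confirmed = True

    def ready_for_tool_execution(self) -> bool:
        """Mirror the workflow method of the same name."""
        return bool(
            self.confirmed
            and self.waiting_for_confirm
            and self.current_tool
            and self.tool_data
        )


# Made-up planner output for the example.
planned = {"next": "confirm", "tool": "SearchFlights", "args": {"origin": "SFO"}}

gated = ConfirmationGate(show_tool_args_confirmation=True)
gated.on_planner_result(planned)
print("ready before confirm signal:", gated.ready_for_tool_execution())
gated.on_confirm_signal()
print("ready after confirm signal:", gated.ready_for_tool_execution())

auto = ConfirmationGate(show_tool_args_confirmation=False)
auto.on_planner_result(planned)
print("ready without confirmation UI:", auto.ready_for_tool_execution())
```

With confirmation enabled, the tool runs only after the `confirm` signal arrives. With it disabled, the planner result alone makes the tool ready on the next loop iteration.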


@@ -1,10 +1,12 @@
from datetime import timedelta
from typing import Dict, Any, Deque
from typing import Any, Deque, Dict
from temporalio import workflow
from temporalio.exceptions import ActivityError
from temporalio.common import RetryPolicy
from temporalio.exceptions import ActivityError
from models.data_types import ConversationHistory, ToolPromptInput
from models.tool_definitions import AgentGoal
from prompts.agent_prompt_generators import (
generate_missing_args_prompt,
generate_tool_completion_prompt,
@@ -18,35 +20,118 @@ LLM_ACTIVITY_START_TO_CLOSE_TIMEOUT = timedelta(seconds=20)
LLM_ACTIVITY_SCHEDULE_TO_CLOSE_TIMEOUT = timedelta(minutes=30)
def is_mcp_tool(tool_name: str, goal: AgentGoal) -> bool:
"""Check if a tool is an MCP tool based on the goal's MCP server definition"""
if not goal.mcp_server_definition:
return False
# Check if the tool name matches any MCP tools that were loaded
# We can identify MCP tools by checking if they're not in the original static tools
from tools.tool_registry import (
book_pto_tool,
book_trains_tool,
change_goal_tool,
create_invoice_tool,
current_pto_tool,
ecomm_get_order,
ecomm_list_orders,
ecomm_track_package,
financial_check_account_is_valid,
financial_get_account_balances,
financial_move_money,
financial_submit_loan_approval,
find_events_tool,
food_add_to_cart_tool,
future_pto_calc_tool,
give_hint_tool,
guess_location_tool,
list_agents_tool,
paycheck_bank_integration_status_check,
search_fixtures_tool,
search_flights_tool,
search_trains_tool,
)
static_tool_names = {
list_agents_tool.name,
change_goal_tool.name,
give_hint_tool.name,
guess_location_tool.name,
search_flights_tool.name,
search_trains_tool.name,
book_trains_tool.name,
create_invoice_tool.name,
search_fixtures_tool.name,
find_events_tool.name,
current_pto_tool.name,
future_pto_calc_tool.name,
book_pto_tool.name,
paycheck_bank_integration_status_check.name,
financial_check_account_is_valid.name,
financial_get_account_balances.name,
financial_move_money.name,
financial_submit_loan_approval.name,
ecomm_list_orders.name,
ecomm_get_order.name,
ecomm_track_package.name,
food_add_to_cart_tool.name,
}
return tool_name not in static_tool_names
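
The check above classifies a tool as an MCP tool by exclusion: if the goal has an MCP server and the name is not one of the static tools, it is treated as MCP. A minimal sketch of that rule follows, assuming a shortened subset of the static names and an invented MCP tool name (`list_products`). The helper `is_mcp_tool_name` is hypothetical and takes a server name instead of an `AgentGoal` to stay self-contained.

```python
from typing import Optional, Set

# Illustrative subset of the static tool names listed above.
STATIC_TOOL_NAMES: Set[str] = {"ListAgents", "ChangeGoal", "SearchFlights", "AddToCart"}


def is_mcp_tool_name(tool_name: str, mcp_server_name: Optional[str]) -> bool:
    """Return True when the goal has an MCP server and the tool is not a static one."""
    if not mcp_server_name:
        return False  # goals without an MCP server never have MCP tools
    return tool_name not in STATIC_TOOL_NAMES


print(is_mcp_tool_name("list_products", "stripe"))  # unknown name, MCP goal
print(is_mcp_tool_name("AddToCart", "stripe"))      # static tool, MCP goal
print(is_mcp_tool_name("list_products", None))      # no MCP server on the goal
```

One consequence of classifying by exclusion is that any unrecognized name, including a misspelled static tool, is routed to the MCP path whenever the goal defines an MCP server.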
async def handle_tool_execution(
current_tool: str,
tool_data: Dict[str, Any],
tool_results: list,
add_message_callback: callable,
prompt_queue: Deque[str],
goal: AgentGoal = None,
) -> None:
"""Execute a tool after confirmation and handle its result."""
workflow.logger.info(f"Confirmed. Proceeding with tool: {current_tool}")
task_queue = (
TEMPORAL_LEGACY_TASK_QUEUE
if current_tool in ["SearchTrains", "BookTrains"]
else None
)
try:
dynamic_result = await workflow.execute_activity(
current_tool,
tool_data["args"],
task_queue=task_queue,
schedule_to_close_timeout=TOOL_ACTIVITY_SCHEDULE_TO_CLOSE_TIMEOUT,
start_to_close_timeout=TOOL_ACTIVITY_START_TO_CLOSE_TIMEOUT,
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=5), backoff_coefficient=1
),
)
# Check if this is an MCP tool
if goal and is_mcp_tool(current_tool, goal):
workflow.logger.info(f"Executing MCP tool: {current_tool}")
# Add server definition to args for MCP tools
mcp_args = tool_data["args"].copy()
mcp_args["server_definition"] = goal.mcp_server_definition
dynamic_result = await workflow.execute_activity(
current_tool,
mcp_args,
schedule_to_close_timeout=TOOL_ACTIVITY_SCHEDULE_TO_CLOSE_TIMEOUT,
start_to_close_timeout=TOOL_ACTIVITY_START_TO_CLOSE_TIMEOUT,
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=5), backoff_coefficient=1
),
summary=f"{goal.mcp_server_definition.name} (MCP Tool)",
)
else:
# Handle regular tools
task_queue = (
TEMPORAL_LEGACY_TASK_QUEUE
if current_tool in ["SearchTrains", "BookTrains"]
else None
)
dynamic_result = await workflow.execute_activity(
current_tool,
tool_data["args"],
task_queue=task_queue,
schedule_to_close_timeout=TOOL_ACTIVITY_SCHEDULE_TO_CLOSE_TIMEOUT,
start_to_close_timeout=TOOL_ACTIVITY_START_TO_CLOSE_TIMEOUT,
retry_policy=RetryPolicy(
initial_interval=timedelta(seconds=5), backoff_coefficient=1
),
)
dynamic_result["tool"] = current_tool
tool_results.append(dynamic_result)
except ActivityError as e:
workflow.logger.error(f"Tool execution failed: {str(e)}")
dynamic_result = {"error": str(e), "tool": current_tool}