Module 1 (Day 2) — From LLMs to Agents | AnyCompany Support Workshop

The Question Every Cyborg Builder Should Answer

Your team uses Claude or AnyCompany GPT every day. They draft case context summary, summarise contracts, classify expense categories. That's a chatbot. When the work involves checking a system, deciding what to do, and taking action across multiple steps — chatbots fall over. That gap is what agents are built to close.

💬

What chatbots are great at

Reading text, drafting text, transforming text. Single-turn, single-pass, single-output tasks where the human reads the answer and decides what to do next.

🚧

Where chatbots run out of road

The moment the task needs fresh data ("what's the current FX rate?"), system access ("look up this PO"), or action ("flag this invoice for approval") — the chatbot can only describe what should happen, not make it happen.

🤖

What an agent adds

An agent is an LLM plus tools, plus a reasoning loop. It can check live systems, decide which step is next, take that step, observe the result, and decide what to do next — all without re-prompting the human at each step.

A Concrete Failure — "Where's My Refund?"

Imagine an L1 IRT agent asks GrabGPT: "Pax on case BK-2026-4821 says they got a partial GrabFood refund of SGD 24.50 last Tuesday but it hasn't shown in their wallet. Pax wants confirmation today. What should I tell them?"

A standalone LLM can write a polite reply — but it cannot check the wallet ledger, query D365 for the case status, verify the refund queue, or trigger a status update to the Pax. Watch the failure cascade:

// Pax escalation received via Live Chat... check_wallet_ledger("BK-2026-4821") → ❌ NO D365 ACCESS query_refund_queue("MIWI-pending") → ❌ NO REAL-TIME DATA post_status_to_pax("P-99421") → ❌ NO ACTION CAPABILITY // Best the LLM can offer: "I'm sorry for the inconvenience. Please contact support..."

Why "Just Ask the LLM Harder" Doesn't Work

The temptation is to write a longer, more detailed prompt. That fixes nothing. The chatbot still has no eyes on your systems, no hands on your tools, and no working memory between calls. The wall is structural, not prompt-quality.

No real-time dataCan't see the current FX rate, today's chargeback queue, or last hour's transactions.

No actionsCan't post a journal entry, flag a transaction, or send an SLA-bound notification.

No memory between callsForgets what it just did. Re-asks for context every turn at scale.

Hallucinations under pressureWhen uncertain, invents plausible-but-wrong numbers, citations, or policy references.

One-shot reasoningCan't say "check this, then if X, do Y, otherwise do Z" reliably across many steps.

Cost at scaleLong prompts × many concurrent users = a budget incident waiting to happen.

📈 The Cyborg builder read: Chatbots are a productivity tool — they make humans faster at single tasks. Agents are an operational tool — they take work off the human's plate entirely. The shift isn't about "better AI". It's about which jobs you can hand to a system that runs without you.

The Four Ideas That Turn an LLM Into an Agent

Agents didn't appear from a single breakthrough — they emerged from four innovations that, combined, broke through the chatbot wall. Click any card to see why each one matters.

🔌

1. Tool Use (Function Calling)

The LLM can call your APIs and read the result.

🧠

2. Chain-of-Thought Reasoning

Break a goal into steps, then run them.

👁️

3. Multimodal Understanding

Read PDFs, photos, screenshots — not just text.

🏗️

4. Agent Frameworks

The plumbing that runs the loop for you.

🔌 Tool Use — The Core Unlock

By default an LLM only produces text. Tool use changes the contract: the model is told "these functions exist — get_invoice(id), check_po(id), post_journal(entry) — call any of them when you need to". The LLM responds with a structured tool call; the framework executes it and feeds the result back. The loop continues until the goal is met.

For GS Support: Your existing systems — D365 case lookups, datalake queries, NICE IEX status, the SOP / KB corpus, Slack escalation channels — become things the agent can use. The LLM does the reasoning ("which cases need follow-up?", "which SOP applies here?"), your code does the action (the actual D365 query, the actual Slack post). You keep control of the action layer.

The Four Phases — From Answering to Operating

AI for the enterprise has moved through four distinct phases. Each one keeps everything from the previous, then adds a capability. Click any phase to see what changed and how it shows up in finance.

📝

LLMs

Text in → Text out

💬

GenAI Assistants

+ Memory & RAG

🤖

GenAI Agents

+ Tools & Reasoning

🌐

Agentic Systems

+ Multi-Agent

📝 Pure LLMs — Text Generation Only

The starting point. The model takes a prompt and returns a completion based on patterns it learned during training. Fast, cheap, and powerful for transformations like summarisation, classification, and drafting. But entirely isolated — no eyes on your systems, no hands.

AnyCompany Support reality: A pure LLM can draft an stakeholder summary paragraph from numbers you paste, classify whether an expense is "T&E" or "Marketing", or rewrite a vendor letter for tone. Useful — but the numbers themselves still come from a human pulling reports.

A Single Direction — Increasing Autonomy

Each phase pushes more of the work onto the system and less onto the human. That's the only axis that matters: how much of the loop is the system running, and how much do you still have to do yourself?

Capability	📝 LLM	💬 Assistant	🤖 Agent	🌐 Agentic System
Reads your prompt	✅	✅	✅	✅
Remembers conversation	—	✅	✅	✅
Searches your documents	—	✅	✅	✅
Calls your systems	—	—	✅	✅
Plans multi-step work	—	—	✅	✅
Coordinates with other agents	—	—	—	✅

What This Means for AnyCompany Support

Every limitation of the LLM-only world maps to something GS Support teams already do by hand. Agents turn each one into a candidate for automation — with humans staying in the approval loop.

Before vs After — Vendor Escalation

Scenario	LLM only	Agent NEW
"Where's our Pax's refund on BK-2026-4821?"	Drafts a polite holding reply	Checks D365 case status + wallet ledger, calculates revised settlement window, posts confirmation back to Pax via the live-chat thread
"Forecast vs actual — IRT volume this week"	Summarises numbers you paste	Pulls actuals from NICE IEX, joins to last forecast, writes the commentary, flags the three intervals worth re-staffing
"How do I handle a partial-delivery refund in MY?"	Quotes general principles	Reads the case + market, cross-checks against the SOP corpus, drafts the cited answer with the refund threshold for the agent to apply
"Auto-tag PAC for last hour's MIWI cases"	Tells you what to look for	Reads 200 case descriptions, classifies Symptom L1/L2/L3 + Action, surfaces the 7 ambiguous ones for TL review, writes the rest to D365

Eight Agentic Use Cases Across the AnyCompany Cohort

Mapped to the eight functions in the room. Don't worry about which to pick yet — Day 2 ends with you choosing one for your team. These are the candidates worth knowing about.

📋Procurement — Contract Clause Reviewer

Reads incoming MSAs, flags clauses that deviate from your standard template, drafts the redline summary with severity ratings.

Workflow agentQuick win

🔍Audit — Control-Test Memo Drafter

Given the control + the evidence sample, drafts the design-effectiveness and operating-effectiveness memo, grounded in your control library.

Workflow agentQuick win

📊FP&A — Case Context Summary Writer

Pulls actuals + forecast + last quarter's commentary, writes the new month's narrative in house style, flags the three drivers worth a deeper look.

Hybrid agentFlagship

📑Reporting — Disclosure Note First-Drafter

From this period's movement schedules, drafts IFRS / local-GAAP notes consistent with prior periods, ready for the human signer.

Workflow agentFlagship

📚SOP Lookup — Agent Helpdesk

Answers "how do I handle this refund?" / "what's the SLA for a Safety case in TH?" against your SOP / KB corpus, with citation. Your Day 2 build target.

Workflow agentRAG-grounded

🏦Controllership — Period-Close Narrative

Auto-drafts the month-end commentary from the close pack. Checks intercompany balances, surfaces unreconciled items, suggests root cause.

Hybrid agentFlagship

💰Treasury — Weekly Cash Narrative

Pulls cash positions across entities, drafts the week's narrative (opening, in/out by category, exceptions), highlights covenant-relevant moves.

Hybrid agentQuick win

🗂️Data & Analytics — Ad-Hoc Query Translator

Finance user asks "top 10 vendors by Q2 spend" — agent converts to SQL, runs against the warehouse, narrates the result back. Reduces the data-request queue.

Workflow agentFlagship

🎯 Next: Module 2 zooms in on what kind of agent each of these is. Workflow vs autonomous vs hybrid vs multi-agent — different shapes, different trade-offs, different costs.

From Chatbots to Agents

The Question Every Cyborg Builder Should Answer

What chatbots are great at

Where chatbots run out of road

What an agent adds

A Concrete Failure — "Where's My Refund?"

Why "Just Ask the LLM Harder" Doesn't Work

The Four Ideas That Turn an LLM Into an Agent

1. Tool Use (Function Calling)

2. Chain-of-Thought Reasoning

3. Multimodal Understanding

4. Agent Frameworks

🔌 Tool Use — The Core Unlock

The Four Phases — From Answering to Operating

LLMs

GenAI Assistants

GenAI Agents

Agentic Systems

📝 Pure LLMs — Text Generation Only

A Single Direction — Increasing Autonomy

What This Means for AnyCompany Support

Before vs After — Vendor Escalation

Eight Agentic Use Cases Across the AnyCompany Cohort