Module 2 · Day 1

📘 Where AI Wins (and Where It Doesn't) for GS Support

Not every GS task needs Generative AI. Pick the wrong use case and you'll waste budget, erode trust, and add to the agent's tab-switching tax. This module gives you a decision framework, your cohort's actual Top 6 ideas mapped to the 10 GS themes, and a priority matrix you can take to your country GM this week.


The Question Every Cyborg Builder Should Answer First

Your cohort submitted 45 ideas during the Cyborg review process. The GSTF curators endorsed 7 as Top Ideas, sent 25 to "endorsed to proceed", and deprioritised 6. The ones that succeeded share three traits: they tackle unstructured work (chat transcripts, SOP lookups, case descriptions), they keep a human (agent / TL) in the loop, and they replace repeated cognitive effort — not one-off creative judgement.

⏱️

The cost of picking wrong

A pilot that runs for 6 months on the wrong use case doesn't just cost AHT impact — it costs credibility with your country GM and GSTF. Once Ops has seen "AI didn't work here," buy-in for the next attempt is twice as hard. Your OmniQ Bot teammate has felt this — see what GSTF said about the deprioritised ideas.

🎯

The shape of a winning Cyborg use case

A winning use case is high-volume, involves repeated cognitive work that requires judgement (not pure math), tolerates a human review step, has bounded scope, and is not already on the GSTF roadmap. Case Context Summarizer fits. Calculating refund threshold breaches does not (that's a script).

🚦

Lead with use cases, not technology

"We need to use Valet / GrabGPT" is the wrong starting point. "Our TLs spend ~30 minutes per IRT escalation reading and writing the case context" is the right one. The use case picks the technology — never the other way around.

📈 What good looks like at GS scale: The GS H1'26 EBT study found agents spend significant Outside-D365 time (multi-system verification, drafting, tab-switching). Mikko's Case Context Summarizer targets 30 min → <10 min per case. Pornpailin's LiveAssist targets 15–25% AHT reduction via knowledge-search collapse. Sanidwong's LiveChat Sim already shipped with no engineering dependency. These aren't "AI" numbers; they come from picking the right use case.

The "Lead with Use Cases" Test — Cyborg Edition

Before any new Cyborg pitch, run it through these five questions. If you can't answer all five, the use case isn't ready for the GSTF review.

| # | Question | What a good answer looks like |
|---|---|---|
| 1 | What is the specific GS problem? | "Our IRT TLs take ~30 min per stakeholder-escalation case to read D365, gather booking + history, and write the brief. We do 100+ of these per week across markets. We want to cut that to <10 min." |
| 2 | Who benefits, and how do we measure success? | "TLs get back ~3 hours/day for actual case-handling and coaching. Success = median time-to-summary ≤ 10 min within 6 weeks, with TL-acceptance rate ≥ 80% on the AI draft." |
| 3 | Is the data available and reliable? | "Yes. D365 case + booking + symptom tags are in the datalake (Bronze/Silver layers). No engineering dependency for MVP." |
| 4 | What's the PII / Safety / regulatory angle? | "PAX/DAX/MEX names + phone numbers must be redacted before prompt; case content is anonymised in the steering rules. Cited SOP article ID required for any policy answer. TL signs off on every Safety case before stakeholder send." |
| 5 | Is this on the GSTF roadmap already? | "Checked the GSTF roadmap: Smart Assistant doesn't ship until late 2026. This is the interim gap-filler. We'll deprecate when GSTF launches." |
💡 Reality check for Cyborg builders: Question 4 is where most pilots stall when the regional Privacy / Compliance team reviews them. Answer it before building, not after. Question 5 is the Cyborg-specific seam — the strategic gap is "ideas GSTF isn't building." The Day 1 Governance & Trust module (Module 7) covers PII redaction in depth.
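Question 4 can be made concrete with a small pre-prompt redaction step. The sketch below is illustrative only: the phone pattern, the `redact` helper, and the placeholder tokens are assumptions, not Grab's actual steering rules, and a broad digit pattern like this one will also mask long numeric IDs, so a real rule set needs tuning per market.

```python
import re

# Hypothetical pattern: optional "+", then 8 to 15 digits with optional spaces or dashes.
PHONE_RE = re.compile(r"\+?\d(?:[\s-]?\d){7,14}")

def redact(text: str, known_names: list[str]) -> str:
    """Replace phone numbers and known party names with placeholders before prompting."""
    redacted = PHONE_RE.sub("[PHONE]", text)
    for name in known_names:
        # Whole-word, case-insensitive match; re.escape guards against regex characters in names.
        redacted = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", redacted, flags=re.IGNORECASE)
    return redacted

note = "Pax Jane Tan (+65 9123 4567) reported the driver at 22:10."
print(redact(note, ["Jane Tan"]))
```

The names list is assumed to come from the case record itself (for example the PAX and DAX fields), which keeps the redaction deterministic rather than asking the model to spot PII.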

GenAI vs Traditional ML vs Rules vs Manual

The most expensive mistake in AI adoption is using the wrong technique. GenAI is not a hammer for every nail. Here's the decision framework GS teams actually need.

| Criteria | ✨ Generative AI | 📊 Traditional ML | 📋 Rules / Scripts | 👤 Manual |
|---|---|---|---|---|
| Best for | Narrative, summarization, classification with judgement, Q&A over documents, drafting | Numeric prediction, anomaly detection, pattern recognition from structured data | Deterministic calculations, lookups, threshold checks, standard procedures | One-off creative judgement, novel decisions, anything irreversible |
| Data type | Unstructured: text, PDFs, emails, contracts, transcripts | Structured: tabular, time-series, labelled training data | Structured: lookup tables, formulas, decision trees | Whatever the human consumes |
| Output | Generated text, summaries, classifications with reasoning, drafts | Predictions, scores, classifications (no reasoning text) | Exact deterministic results, pass/fail flags | The judgement itself |
| GS example | "Draft the case context summary for this IRT escalation" | "Score this case 0–1 for likely DUI / true Safety based on past Safety cases" | "If case has booking_id and severity = P1, auto-route to IRT queue" | "Should we permanently ban this Dax based on the conflicting reports?" |
| Accuracy posture | Good draft, human reviews | High precision (95%+ on the trained task) | Exact & auditable | The human is the audit trail |
| Explainability | Can cite sources (RAG); reasoning is probabilistic | Feature importance, partial | Fully transparent: every step traceable | Whatever the human writes down |
| Audit defensibility | Medium: needs human sign-off + source attribution | Medium: model card + drift monitoring required | High: every rule is in code | High: the human signs |
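The Rules / Scripts example in the table ("if case has booking_id and severity = P1, auto-route to IRT queue") is small enough to write out. This is a minimal sketch; the function name and the `"DEFAULT"` fallback queue are hypothetical.

```python
from typing import Optional

def route_case(booking_id: Optional[str], severity: str) -> str:
    """Deterministic routing: the same input always yields the same queue."""
    if booking_id and severity == "P1":
        return "IRT"
    return "DEFAULT"  # hypothetical fallback queue name

print(route_case("BKG-001", "P1"))  # booking present and P1, so routed to IRT
```

Nothing here needs a model: every branch is readable in code, which is what makes the rules column "exact & auditable".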

Decision shortcuts

Use GenAI when…

  • Reading long chat transcripts (40+ turns in THA / VNM / BHS) and summarising
  • Repetitive narrative work (case context, DSAT write-ups, post-call notes, first-response drafts)
  • Classification with reasoning ("severity = P1 because the Pax explicitly reported drunk driving and the ride ended <30 min ago")
  • Q&A over a body of SOPs / KB articles / Help Centre content (your SOP Lookup Top Idea)
  • Multi-step tasks involving judgement under ambiguity (what to do when the SOP doesn't quite fit)
📊

Use Traditional ML when…

  • Predicting case volume by interval (NICE IEX-style forecasting — your AI Forecasting Module Top Idea)
  • Detecting anomalies in time-series (sudden AHT spike, abnormal DSAT pattern)
  • Need high precision with measurable confidence scores (DUI / fraud risk)
  • The output is a number, score, or class — not text
  • Have ≥10K labelled examples (e.g., past PAC-tagged cases) to train on
📋

Use Rules / Scripts / n8n when…

  • Same input always produces same output (booking ID lookup, refund threshold check)
  • Threshold-based routing or escalation (case > X minutes old → escalate to TL)
  • Pass/fail SLA timers — the 30-min Safety first-response check
  • Volume too low to justify LLM spend
  • Audit trail must be 100% deterministic for regulator response
👤

Keep Human-Only when…

  • The decision is irreversible (permanent Dax ban, regulator-facing statement)
  • No precedent — novel safety incident, never-seen-before scam pattern
  • Stakes are too high for any AI failure (executive escalations, true Safety P1 calls)
  • Personal accountability is the point (TL / SPV signs off, with their name)
🤝 The hybrid is usually the answer. The strongest Cyborg solutions combine techniques. ML detects a likely-DUI signal in milliseconds → GenAI writes the case context summary explaining why. Rules enforce the 30-min Safety SLA → GenAI drafts the first-response email. Think of GenAI as the communication layer on top of precise rule engines and ML scorers — the way Lumos V2, LiveAssist, and PulseOps are designed.
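A hedged sketch of that layering, assuming a hypothetical upstream classifier that has already produced a DUI score: the rule layer checks the 30-min Safety SLA, and the GenAI layer only receives a prompt in which the score is stated as a fact. The function names and prompt wording are illustrative, not the design of Lumos V2, LiveAssist, or PulseOps.

```python
from datetime import datetime, timedelta

SAFETY_SLA = timedelta(minutes=30)  # the 30-min Safety first-response SLA from the text

def sla_breached(created_at: datetime, now: datetime, responded: bool) -> bool:
    """Rule layer: a pass/fail timer with no model involved."""
    return (not responded) and (now - created_at) > SAFETY_SLA

def build_summary_prompt(case_text: str, dui_score: float) -> str:
    """GenAI layer input: the ML score is passed in, never recomputed by the LLM."""
    return (
        "Summarise this case for a TL using the fields "
        "Symptom, Severity, Booking, Action, Next Step.\n"
        f"A classifier scored DUI likelihood at {dui_score:.2f}; explain which "
        "statements in the case support or contradict that score.\n"
        "If unsure, say so.\n\n"
        f"Case:\n{case_text}"
    )

created = datetime(2026, 1, 1, 10, 0)
print(sla_breached(created, created + timedelta(minutes=31), responded=False))
print(build_summary_prompt("Pax reported erratic driving.", 0.874))
```

Keeping the score and the timer outside the model means the numbers stay exact and the LLM is limited to explanation and drafting.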

Use Cases by Persona — Anchored on Your Cohort's Top 6

Tailored to the people in this room. Each persona is shown with its highest-impact GenAI use cases, drawn from your cohort's actual submissions in the Cyborg review CSV and mapped to the 10 GS themes the GSTF curators tagged.

Persona: GS Operations Managers + Senior Managers + Service Success Specialists across SG/MY, ID, TH, VN, KH/MM. Largest function in the cohort — 7 participants. Daily reality: case-handling AHT, FCR, escalations to TL/SPV, queue management, BPO oversight.

🎧 Case Context Summarizer (IRT)  ⭐ TOP IDEA
Mikko's submission. When an IRT case needs stakeholder escalation, AI reads D365 + booking + history and drafts the structured summary (Symptom · Severity · Booking · Action · Next Step). Target: 30 min → <10 min per case. Datalake-ready, multi-market.
GenAI + Datalake | Day 1 Anchor | Synthesis
⚠️ L1 First-Response Triage & Draft  ⭐ TOP IDEA
Angeline's submission. AI classifies incoming cases as true Safety vs downgrade and auto-drafts the first-response email, protecting the 30-min SLA during peak / blackout hours. Routing pattern (Day 2 Module 3).
GenAI + n8n trigger | High Impact | Routing
🏷️ PAC / Tag Auto-Labelling
Ai Minh's submission (#36). AI reads the chat / call summary and auto-fills the D365 disposition fields (Symptom L1/L2/L3). This eliminates the 30–60 sec of menu-clicking ACW per case and removes "Other" junk codes from analytics.
GenAI · Fast tier | Quick Win at Scale
📞 Post-Call Summary & CRM Sync (Project Steno / Echo)
Angeline (#7) + Errol (#25) + Izzul (#40). Voice-to-text → structured timeline of incident, claims, actions. Cuts ACW dramatically for IRT calls and removes the "ATO/SF documentation" compliance burden.
DL (ASR) + GenAI | Endorsed

Persona: Chatbot & Digital Content Specialists / Asst. Managers, Automation Specialists, Help Centre & Automation Managers. The Cyborg builders in the room — 5 participants across ID, SG/MY, TH, PH, VN. Daily reality: ARC / chatbot flows, Stone Monkey forms, n8n workflows, KB content updates, vibecoded apps.

🔍 Agent-Assist SOP Lookup  ⭐ TOP IDEA
Curated regional roll-up. RAG over the SOP / KB corpus, using the "answer ONLY from this document with citation" pattern. Multi-market, no engineering dependency for MVP. Your Day 2 build target.
GenAI + RAG | Day 2 Anchor | Agent Guidance
🗂️ Project Lumos V2: In-House CRM
Ajidharma's submission (#22). Continuation of Lumos V1 (QMS, Abstract, Digital Statement Letter; already live). In-house Contact Centre System for Telesales / GSC Online, an alternative to Salesforce / Ecentrix at SaaS-license-zero cost. Multi-market scale-out planned.
Mini-app (vibecoded) | Workspace
🤖 KB Retrieval & Spiel Generator
Benedictus's submission (#32). Agent pastes a Case ID or Article ID → AI retrieves the right internal KB and generates a tailored spiel for the agent to review and send. Targets ARC-handover and Help-Centre-derived ticket friction.
GenAI + RAG | Endorsed
📡 PH Smart Handover (ARC → Live agent)
Ariel's submission (#11). Reads the customer's chatbot conversation and generates the structured issue summary + extracted entities + missing-info prompt before the case reaches the agent. Lowers AHT by collapsing the "scroll back through the bot conversation" step.
GenAI + n8n | Synthesis

Persona: Training & Quality Assurance Managers, Training Specialists, Regional TQM. 3 participants across SG/MY (Aren), PH (Angie), Enablement (Sanidwong). Daily reality: macro reviews, QA scoring, coaching write-ups, agent ramp-up, audit governance.

🎮 LiveChat Sim: Training Simulator  ⭐ ALREADY SHIPPED
Sanidwong's submission (#45). New-hire and existing-LC agents practice 5 real scenarios with AI scoring against C5–C8 audit criteria. Built with Cursor: no engineering dependency, no backend, no login. Your Day 2 opening proof point.
Browser app (vibecoded) | Live | Training
📝 DSAT & Sentiment AIlyzer
Sanidwong's submission (#9). Browser-based QA tool that reads LC / DG transcripts and auto-generates the four DSAT fields (Problem, Action, Root Cause, Recommendation). Targets 12–18 min → 2–3 min per case write-up. Bilingual (EN + SEA languages).
GenAI · Browser app | Endorsed | QA & Content
🧑‍🏫 GS Level Up Simulator
Aren's submission (#15). AI plays a frustrated PAX to test agent SOP-adherence + soft skills + critical thinking. End-of-session QA scoring against the regional QA guideline. Same shape as LiveChat Sim, extended to other markets.
Browser app + GenAI | Training
📋 Macro Review & QC (with personalisation guardrail)
Mikko's submission (#5), endorsed but reframed by regional TQM. AI checks macro-template usage vs personalisation and flags when an agent reused a template without personalising. Builds on Project AIONIC. Auto-coaching feedback loop.
GenAI | QA & Content

Persona: WFM Business Analytics Analyst (Hemwit), Reporting & Analytics Senior Specialist (Linh), Vendor Mgmt Asst Manager (Pornpailin). 3 participants. Daily reality: forecasting, performance dashboards, OKR tracking, anomaly investigation, BPO reporting.

📈 AI Forecasting Module  ⭐ TOP IDEA
Hemwit's submission (#2). End-to-end WFM forecasting platform. Consolidates D365 + NICE IEX + BPO files, applies ML for forecast at country / queue / 30-min interval granularity. Targets MAPE/WAPE accuracy + days → hours forecast cycle time. Mini-app, multi-market.
ML + GenAI · Mini-app | Top Idea | Forecasting
📡 PulseOps: Real-Time Digital Performance
Pornpailin's submission (#37). Closes the visibility gap between Voice/Chat (live AHT) and Digital (T+1-day G-Sheet). First real-time live AHT for Digital channel. Anomaly detection + auto-coaching draft generation.
Mini-app · ML + GenAI | Forecasting
📊 InsightHub: AI-Powered Performance Dashboard
Pornpailin's submission (#24). Centralised dashboard with smart alerts when OKRs breach thresholds, plus auto-generated business-review slides with narrative commentary. Targets 4–8 hours of slide prep → <15 min.
GenAI + Data Plugin | Forecasting
🔬 DeepDive: Bulk Ticket Root-Cause Classifier
Mikko's submission (#20). When other teams ask "how many DAX got 0 jobs after Product A launch?", AI classifies the description field across 100–200 tickets at scale. Replaces manual scrubbing. Reusable across products / markets / business questions.
GenAI · Bulk classification | Forecasting / Analytics

Persona: Business Process & Knowledge Base Specialist (Errol, PH), Help Centre & Automation Manager (Nguyen Ai Minh, VN). 2 participants. The single most "RAG-shaped" personas — daily work is curating SOPs / KB articles and answering policy questions. Small bench but every Cyborg use case touches their territory.

🔍 SOP Lookup with Citation  ⭐ TOP IDEA
Curated regional roll-up. Agent asks a natural-language question → RAG retrieves the right SOP article with article ID + cites the section. The flagship RAG use case for the cohort. Multi-market.
GenAI + RAG | Day 2 Anchor | Agent Guidance
📋 Helpcenter Content Quality Auditor
Mikko's submission (#4). Daily AI audit of help-centre articles on three dimensions: title-vs-content match, completeness, user-readability. Flags Good / Needs Improvement / Critical. Cross-article consistency check across categories.
GenAI | QA & Content
🤖 Chatbot Transcript Handover
Sophanmay's submission (#30). When ARC hands a chat to a live agent, AI summarises what the bot already covered + extracts what info is still missing. Reduces re-asking the same questions and lowers AHT in KH/MM markets.
GenAI | Synthesis
🌐 2nd-Language Translation Helper (KH/MM)
Seyha's submission (#43). Agents handling cases in their 2nd language often copy-paste into translation tools, losing context and flow. AI provides inline translation + tone preservation, embedded in the agent workspace.
GenAI · multilingual | Synthesis / Workspace

Persona: Strategic Projects Asst. Manager (Kashen). 1 participant. The bridge persona between GS Ops and country GMs / GSTF — translates leadership questions into ops work and back.

📊 OKR Roll-Up & Commentary (Project Turacos)
Kashen's submission (#1). Pulling 15 OKRs × 9 markets = 135 numbers monthly is currently manual; the idea is to automate the pull and the commentary. GSTF's view: this is solvable via existing AI bot channels (Eternals' commentary). Deprioritised, but the pattern is reusable.
GenAI + Slack bot | Reviewed: Deprioritised
🧭 GSTF Roadmap Awareness Layer
A reusable pattern: before scoping any Cyborg project, ask "what's already on the GSTF roadmap?" and reframe. Several CSV ideas (Macro Review, Repeated Complainant, ARC Fallback, Swift Resolve) were deprioritised because GSTF is delivering them. Strategy team owns this seam.
Process · not a build | Cohort-wide
🤝 Cyborg Pitch Pack Generator
Speculative. When a country team has a Cyborg idea, AI helps shape the pitch deck (problem, baseline, target, risk, GSTF-roadmap check). Strategy team curates the template.
GenAI | Daily Use
📋 Country GM Briefing Auto-Draft
For monthly / quarterly business reviews, auto-draft the GS section with current AHT / FCR / DSAT trend + commentary + ask. The narrative-on-numbers pattern. Pairs well with InsightHub.
GenAI + Data Plugin | Quick Win

Persona: Asst. Manager, Social Care (Rizza, SORT). 1 participant. Smaller bench but unique workload — public-channel social monitoring, ADA chatbot endorsements, cross-team coordination with IRT during incidents.

📡 Project TRACE: SORT Case Synthesis
Rizza's submission (#6). Bridges SORT (ADA) chatbot interactions and live SORT agent assistance. Reads chat logs, generates structured summary inside the agent's workspace. Adds tagging + sentiment-engine data into a centralised DB, replacing external Google-Sheet trackers.
GenAI · Browser ext. | Synthesis
🚮 Project Snapshoot: Marketing Noise Auto-Closer
Rizza's submission (#38). Chrome extension that AI-classifies SORT public comments as "Marketing engagement" vs "Real support" and auto-closes the ~90% noise. Frees agents from clicking through hundreds of generic emoji / promo replies daily.
GenAI · Browser ext. | High Volume | Routing
🌍 Cross-Team Incident Correlation (with WFM)
From the canonical drunk-driving Safety case journey: when SORT sees an uptick in DMs about a topic, AI correlates with IRT call patterns + WFM forecast anomalies to flag emerging incidents earlier.
Hybrid (ML + GenAI) | Forecasting / Routing
📤 Stakeholder Escalation Auto-Draft (with IRT)
Speculative. When a public SORT post gets endorsed to a serious case, AI drafts the cross-team Slack escalation with structured context (channel, post, sentiment, suspected severity). Pairs with Seyha's #44 Slack escalation auto-fill template.
GenAI + n8n | Routing
📌 Pattern across all 7 personas: Almost every flagship use case involves RAG (grounding to your SOPs / KB / Datalake) or hybrid (deterministic + GenAI). Pure GenAI alone is rare in GS — your case data, SOPs, and rules are too valuable to leave on the table. Module 9 (RAG) and the Day 2 SOP Lookup build cover both patterns. The four "build shapes" you'll see in this cohort: Valet Skill (most flagships), Mini-app (Lumos V2, AI Forecasting Module), n8n Automation (L1 Email Triage), Browser extension (Project Snapshoot, DSAT AIlyzer).

What GenAI Doesn't Do Well — Yet — for GS Support

Knowing the failure modes is more useful than knowing the wins. Here's where Cyborg builders should not reach for GenAI, and what to do instead.

| Don't use GenAI for… | Why it fails | What to use instead |
|---|---|---|
| Calculating refund / compensation amounts | Requires exact arithmetic + lookup against the market's threshold table. GenAI's "almost right" turns into agent-level fraud risk and DSAT. | Threshold table (rules / SQL) for the amount; GenAI for the empathetic message wrapping it. |
| Reconciling settlement totals to the cent | Float summation, currency conversion, FX rounding. Exactness is the point. An AI miscount of one cent breaks the daily settlement. | Recon engine / SQL aggregates; GenAI to explain the resulting break to the FinOps team. |
| Permanent Dax bans / regulator-facing statements | Material consequence + irreversible. Accountability stays with a named TL / SPV / regulator-affairs lead. Audit + MAS / OJK / BNM expect it. | Hybrid: AI drafts the case file + reasoning; human signs. |
| DUI / fraud probability scores from scratch | Better solved by a trained classifier on labelled past Safety / fraud cases. GenAI doesn't know your specific incident signature. | Traditional ML for the score; GenAI to explain why the score is what it is (the Day 2 hybrid pattern). |
| Novel incidents without precedent | GenAI works from patterns it's seen. New scam type? New product launch? New regulator change? It will improvise, and improvisation during a live incident is risk. | Human first, AI second. Once you have 10–20 examples in D365, revisit. The CSV's drunk-driving canonical journey shows the once-it's-known pattern. |
| Real-time D365 / booking-status feeds | A model frozen at its training cutoff doesn't know today's case status, today's booking, today's PAX history. Without RAG / Plugins, it invents plausible booking IDs. | Connect to D365 / datalake via Plugins (MCP). Always cite the data timestamp + Case ID. |
| Live Safety judgement (true Safety vs downgrade) | L1 First Email Response Triage (#8) is endorsed, but at L2: AI suggests, human decides. AI alone on a Safety case, with no human in the loop, is a regulatory headline waiting to happen. | Hybrid: AI classifies + drafts; human reviews every Safety case. Move to higher autonomy only after a 3-month false-positive audit. |
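The first two rows come down to "do the arithmetic in code, not in the model". A minimal sketch using Python's `decimal` module follows; the `REFUND_TABLE` bands and rates are invented for illustration, since real thresholds live in each market's SOP.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical threshold table: market -> list of (max fare, refund rate) bands.
REFUND_TABLE = {
    "SG": [(Decimal("20.00"), Decimal("1.00")), (Decimal("100.00"), Decimal("0.50"))],
}

def refund_amount(market: str, fare: Decimal) -> Decimal:
    """Find the fare band in the table and compute the refund exactly, to the cent."""
    for max_fare, rate in REFUND_TABLE[market]:
        if fare <= max_fare:
            return (fare * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return Decimal("0.00")  # above every band: no automatic refund in this sketch

print(refund_amount("SG", Decimal("45.25")))
```

The amount returned here can then be handed to GenAI as a fixed value for the customer message, so the model writes the wording but never the number.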

The six common failure modes — name them in your team

🎭

1. Confident hallucination

The model invents an SOP step, a refund threshold, a booking ID, a historical case. Sounds right. Catch with: citation-required steering, RAG over your real SOP corpus, "if unsure, say so".

📅

2. Stale knowledge

Model's training cutoff was months ago. Doesn't know about the policy update that shipped to KB last week. Fix with: RAG over your current SOP library, daily re-indexing.

🔢

3. Arithmetic drift

Refund amounts, FX conversion, eligibility threshold math — the model gets close but not exact. Always verify amounts; route them through a script / SQL lookup.

💬

4. Tone & style drift

Without style guidance, drafts sound generic. A Safety case stakeholder brief doesn't read like a marketing reply. Fix: persona prompts + few-shot examples + the GS H1'26 brand voice.

🪞

5. Sycophancy

Models tend to agree with the framing. "Is this case downgradable?" gets a more permissive answer than "Audit this case for true Safety triggers AND downgrade evidence." Frame for challenge.

🔄

6. Inconsistency between runs

Same case, different summaries. Acceptable for first drafts; problematic for QA-audited outputs. Fix: low temperature (0.1–0.3), RAG grounding, structured-output schemas (Symptom · Severity · Booking · Action · Next Step).
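One way to enforce the structured-output fix is to ask the model for JSON and reject anything that does not carry all five fields. The sketch below assumes that JSON convention and the lowercase field names; neither is specified in the module.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = ("symptom", "severity", "booking", "action", "next_step")

@dataclass(frozen=True)
class CaseSummary:
    symptom: str
    severity: str
    booking: str
    action: str
    next_step: str

def parse_summary(raw: str) -> CaseSummary:
    """Parse a model response that was asked to return JSON with the five fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # A field counts as missing when it is absent or blank.
    missing = [f for f in REQUIRED_FIELDS if not str(data.get(f, "")).strip()]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return CaseSummary(**{f: str(data[f]) for f in REQUIRED_FIELDS})
```

A draft that fails parsing can be regenerated or sent to the TL as "needs manual summary" rather than silently shipped in a different shape each run.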

🛡️ The verification posture for GS. Every GenAI output that reaches an agent, a TL, or a customer needs three things: (1) a citation to the source SOP / Case ID, (2) a confidence statement when uncertainty is material ("If unsure, say so — escalate to TL"), (3) a path back to the input the model received. If you can't supply those three things, the output isn't ready for GS use. Module 7 (Governance & Trust) builds this in detail with PAX / DAX / MEX PII redaction examples.
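The three-part posture can be expressed as a simple gate in front of any agent-facing output. This is a sketch under assumptions: the `DraftOutput` fields and the `uncertainty_is_material` flag are hypothetical names, and how materiality gets decided is left to the reviewer or an upstream check.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DraftOutput:
    text: str
    citation: Optional[str]         # SOP article ID or Case ID the answer rests on
    uncertainty_is_material: bool   # set by the reviewer or an upstream check
    confidence_note: Optional[str]  # e.g. "Unsure about refund band; escalate to TL"
    input_ref: Optional[str]        # pointer to the stored prompt and context

def failed_checks(out: DraftOutput) -> list[str]:
    """Return the verification checks this output fails; an empty list means ready."""
    failures = []
    if not out.citation:
        failures.append("citation")
    if out.uncertainty_is_material and not out.confidence_note:
        failures.append("confidence statement")
    if not out.input_ref:
        failures.append("path back to input")
    return failures
```

Returning the list of failures, rather than a bare true/false, gives the reviewer something specific to fix.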

Priority Matrix — Where to Start

You've got 25 endorsed use cases (across 7 GS personas) in the persona section above. You can't pursue all of them at once. The Value × Effort matrix below is how the GSTF curators selected the Category 1 Top 6, and how you should select your market's first 1–3.

[Figure: Value × Effort priority matrix. Vertical axis: Value (low to high). Horizontal axis: Effort (lower to higher).]
Quadrants: ⭐ DO FIRST (high value · low effort) | ⏳ EVALUATE (high value · high effort) | 📚 LEARNING (low value · low effort, upskill) | ⏸ AVOID (low value · high effort)
Items plotted: ⭐ Case Context Summarizer (IRT), ⭐ SOP Lookup, ⭐ Project Steno (call notes), ⭐ L1 Email Triage, AI Forecasting Module (Mini-app), Project Lumos V2 (CRM), PulseOps (Real-Time DG Perf.), ⭐ LiveChat Sim (Training), DSAT & Sentiment AIlyzer, PAC Auto-Tagging, OKR / PMO roll-up (Turacos), ARC enhancement (GSTF scope)
Do first — flagship wins
Evaluate — high value, needs investment
Learning — try them to build team confidence
Avoid — wrong tool for the job

How to score your own use case

For each candidate, score Value (1–5) and Effort (1–5). Map onto the quadrants. Pick 1–3 from Do First for your first 90 days; 1 from Evaluate as a structured pilot; ignore Avoid.

| Score this | 1 (low) | 5 (high) |
|---|---|---|
| Value: AHT / FCR / DSAT impact | Marginal < 30 sec / case | Major: 60–90 sec or 15–25% |
| Value: volume × markets | Single market, < 1k cases / mo | Region-wide, 100k+ cases / mo |
| Value: strategic fit | Tangential to GS H1'26 priorities | Direct AHT / SLA / Safety improvement |
| Effort: data readiness | Already in datalake (Bronze / Silver) | Needs new D365 fields or BPO data integration |
| Effort: engineering dependency | Vibecodable in Cowork / Cursor | Needs GTS API work + Bedrock deploy |
| Effort: governance / PII | No PAX/DAX/MEX names, internal-only | Customer-facing output, MAS / OJK exposure |
| Risk: GSTF roadmap collision | Not on GSTF roadmap (Cyborg-fit) | GSTF Smart Assistant ships in 2 months (deprioritise) |
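The scoring can be sketched as a small helper. Two things here are assumptions not stated in the module: the sub-scores are averaged with equal weight, and a score at or above the 1–5 midpoint of 3 counts as "high". The roadmap-collision risk row is left out and would be checked separately.

```python
from statistics import mean

def quadrant(value_scores: list[int], effort_scores: list[int], midpoint: float = 3.0) -> str:
    """Map averaged 1-5 Value and Effort sub-scores onto the four matrix quadrants."""
    high_value = mean(value_scores) >= midpoint
    high_effort = mean(effort_scores) >= midpoint
    if high_value and not high_effort:
        return "Do first"
    if high_value and high_effort:
        return "Evaluate"
    if not high_value and not high_effort:
        return "Learning"
    return "Avoid"

# Hypothetical scores for an imaginary candidate: impact, volume, fit / data, engineering, PII.
print(quadrant([5, 4, 5], [1, 2, 2]))
```

Scoring each candidate the same way keeps the conversation with your country GM about the inputs, not about gut feel.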
🎯 The 90-day rule for Cyborg: Pick one "Do First" use case. Run it for 90 days in your market with a single LOB. Measure hard outcomes (median AHT, TL-acceptance rate, DSAT delta). Only after the first one ships do you start the second. Running parallel pilots before you have a first win is the fastest way to lose credibility with your country GM and GSTF reviewer.

Implementation Phases — Realistic Timeline

Adoption isn't a switch. It's a maturity curve. Here's what good progress looks like for a GS team going from zero to running multiple agents, phased to manage risk and build confidence.

Phase 1

🌱 Quick Wins

Months 1–3

Replace repetitive narrative work with prompt templates in GrabGPT or Claude Cowork. No agents yet, no automation, no GTS / GSTF involvement. Each individual saves 2–4 hours / week.

  • Case Context Summarizer template (every market's IRT)
  • DSAT write-up template (TQA / TQM)
  • Post-call notes template (IRT)
  • SOP-question prompt with citation requirement
  • Outcome: one prompt template per person, used daily
Phase 2

🔁 Operational Use

Months 3–6

Convert templates to Skills in Claude Cowork (or Valet skills inside Grab's stack). Add steering rules for governance. Connect to one source of truth (e.g., D365 + datalake via Plugins / MCP). Country-wide adoption.

  • Saved Skills replace ad-hoc prompting in GrabGPT
  • Steering rules enforce house rules (default currency, no PAX/DAX names, citation required, escalation threshold)
  • One agent runs daily on a single use case (e.g., the SOP Lookup agent for one LOB)
  • Outcome: 1 saved Skill per use case, 1 agent in production at L1–L2 in one market
Phase 3

🚀 Transformation

Months 6–12

Multiple agents in parallel across markets. Scheduled Tasks running overnight (KB-gap reports, DSAT-trend digests). Agent review by exception only. ML + GenAI hybrids on flagship workflows. Hand-off to GTS / GSTF where productionisation makes sense.

  • 3–5 agents in production, each owning a workflow (Case Summarizer, SOP Lookup, PAC Tagger, DSAT Writer, Post-Call Notes)
  • Scheduled overnight runs surface KB gaps and weekly trends
  • ML + GenAI hybrid on highest-volume workflows (true-Safety triage, fraud-pattern explanation)
  • Outcome: team scope expanded; AI handles routine cases; agents and TLs focus on judgement + coaching

What you should be doing differently in 12 months

| Today (baseline) | 12 months from now |
|---|---|
| Each TL writes their own IRT case context summary from scratch (~30 min) | Case Summarizer Skill drafts it in <10 min; TL reviews + signs by exception |
| Agents search Glean / Confluence / SOP decks across 5+ tools | SOP Lookup Skill answers 60–70% of routine queries with citation; tab-switching collapses to one panel |
| QA analysts manually write all DSAT root-cause + recommendation fields (~12–18 min / case) | DSAT AIlyzer drafts all four fields in 2–3 min; analyst reviews and adjusts |
| WFM forecasting cycle takes days of data prep + spreadsheet wrangling | AI Forecasting Module generates country / queue / interval forecasts in hours; planner focuses on scenario judgement |
| SORT agents manually scan + close hundreds of marketing comments daily | Project Snapshoot auto-closes ~90% noise; agents focus on real support issues |

What good adoption looks like — leading indicators

📊

Usage signals

Cyborg builders log into Cowork (or Valet) at least 3 days / week. Skills are activated daily by frontline agents. Steering rules updated when SOPs change. Activity is the leading indicator of value.

⏱️

Outcome signals

Median AHT on the targeted LOB drops 15–25% within 6 months. FCR holds or improves. DSAT trends down. Reviewer effort shifts from drafting to challenging. TLs spend more time coaching, less time scrolling D365.

🛡️

Governance signals

Audit trail captures inputs + outputs + reviewer. SOP violations flagged, not silently shipped to PAX. Quarterly review of active Skills catches stale ones. This is the difference between Cyborg adoption and Cyborg risk.
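The outcome signals above are only useful if they are measured the same way before and during the pilot. A minimal sketch, assuming you can export per-case handle times in seconds and a count of AI drafts the TL accepted; the sample figures are made up.

```python
from statistics import median

def aht_reduction_pct(baseline_secs: list[float], pilot_secs: list[float]) -> float:
    """Percentage drop in median AHT from the baseline period to the pilot period."""
    base = median(baseline_secs)
    return (base - median(pilot_secs)) / base * 100

def acceptance_rate_pct(accepted: int, total: int) -> float:
    """Share of AI drafts the TL accepted, as a percentage."""
    return accepted * 100 / total

# Hypothetical sample data for one LOB.
baseline = [600, 620, 580, 640, 610]
pilot = [480, 500, 470, 520, 490]
print(round(aht_reduction_pct(baseline, pilot), 1), acceptance_rate_pct(41, 50))
```

Using the median rather than the mean keeps one unusually long case from hiding or inflating the effect.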

🎓 Where the rest of Day 1 + Day 2 fit. The remaining Day 1 modules give you the tools (M3–M6: how LLMs work + costs, M8–M9: prompt engineering + RAG). Module 7 (Governance) gives you the safety rails, including PAX / DAX / MEX PII handling. Day 2 gives you the execution (build your SOP Lookup agent in Cowork). By the end of Day 2 you'll have a Phase 1 quick win running and a Phase 2 plan on paper for your country GM.