Day 2 — Agentic AI · Claude Cowork

🧠 Claude Cowork Skills

Turn your best Day 1 prompts (Case Summarizer, SOP Lookup, DSAT write-up) into permanent, reusable Skills that activate automatically — and learn how to write Skills that survive QA review every time.


🧠 What Is a Skill?

A Skill is a saved prompt template that Claude activates automatically when the task matches. You write it once in plain language, save it by name, and it works forever — in every conversation, for every teammate who uses the same project.

📝

A prompt you write once

Instead of retyping the same complex prompt every session, you save it as a Skill. It's always ready.

That activates automatically

Claude recognises matching tasks and applies your Skill without you asking. No keyword typing required.

🔄

That gets better over time

You update the Skill when you find a better approach. Every improvement benefits the whole team instantly.

🤝

That anyone can use

A Skill saves your domain expertise in a form that a less experienced colleague can invoke immediately.

🍳
The recipe analogy: A Skill is a recipe card. You write it once with the exact ingredients, technique, and plating instructions. Anyone can follow it. You can refine it after cooking. The recipe doesn't expire — it's always in the drawer when you need it.

🆚 Skill vs Prompt — The Core Difference

| Aspect | Ad-hoc Prompt | Saved Skill |
|---|---|---|
| Lifespan | Dies when the conversation ends | Permanent: persists across all sessions |
| Reuse | You retype (or copy-paste) every time | Activates automatically, zero effort |
| Quality | Varies, depending on how you feel that day | Consistent: same best version every time |
| Team value | Locked in your head or a personal doc | Shared across your project and team |
| Improvement | Each iteration is lost | Update once and everyone benefits immediately |
| Best for | Quick one-off questions | Any task you do more than once |
💡
Rule of thumb: If you've written the same (or similar) prompt more than twice, it belongs in a Skill. The third time is wasted effort — and every time after that is double waste.

🔬 Anatomy of a Good Skill

A Skill is a Markdown file with a name, a description (which tells Claude when to activate it), and a body (which tells Claude what to do). Here's a fully annotated example — the sop-lookup-assistant Skill you'll build in the afternoon exercise.

# ─────────────────────────────────────────────────────────────
# SKILL: sop-lookup-assistant
# ─────────────────────────────────────────────────────────────

---
# 1. NAME: short, kebab-case, memorable
name: sop-lookup-assistant

# 2. DESCRIPTION — official formula: [What it does] + [When to use it] + [Key capabilities]
#    Under 1,024 chars. No XML tags. No "claude" or "anthropic" in name.
description: Looks up the right SOP for an agent's case-handling question —
  cites the article ID, applies market-specific thresholds, and
  flags ambiguity for TL escalation. Use when an agent asks a policy,
  procedure, refund-threshold, or Safety-severity question while
  handling a live case.

---

# 3. PERSONA — who Claude becomes for this task
## Role
You are a Senior IRT Team Lead and SOP Lookup Assistant at
AnyCompany Support · GS Cyborg Edition. You are precise and conservative —
never invent an SOP, always cite the article ID, escalate ambiguity
rather than guess.

# 4. TASK — what to do, in what order
## Instructions
For each agent question:
1. Read the question + case context (Case ID, market, channel,
   severity if known)
2. Search the connected SOP corpus for semantically relevant
   articles (e.g., "drunk driving" matches "DUI", "refund" matches "chargeback")
3. Look up applicable thresholds in threshold_table.csv
   (refund limits per market, SLA timers, severity rules)
4. Generate a cited answer: short answer + SOP article ID +
   suggested next agent action
5. Apply escalation rules (below)
6. Produce the structured output (below)

# 5. BUSINESS RULES — the domain expertise that makes this skill yours
## Escalation Rules
- Refund > SGD 200 (or local equiv): flag as HIGH VALUE — TL approval
- Explicit DUI / impairment keyword:    flag as P1 SAFETY — auto-escalate
- Multiple SOPs match equally:           flag as AMBIGUOUS — flag for TL
- No matching SOP found:                 respond "Not in current SOPs — escalate to TL"
- Pax is Grab VIP + DSAT history:        flag as VIP CARE — special script

# 6. OUTPUT FORMAT — structured, consistent, scannable
## Output Format
For each question, return:

  **[Case ID] — [Question summary]**
  Answer:           [short cited answer]
  Source:           [SOP article ID §X.X]
  Next agent action: [specific step from the SOP]
  Flags:            [list any flags, or "None"]

End with a WEEKLY KB-GAP SUMMARY: [n] questions · [n] cited cleanly · [n] gaps surfaced

# 7. GUARDRAILS — what the skill must never do
## Guardrails
- Never invent an SOP article ID — citation must be real and verifiable
- Redact PAX / DAX / MEX names, phone numbers, NRIC; use IDs only
- Apply the right market's threshold — never mix SG / MY / ID rules
- Default to SGD; for ID/TH/VN/PH/MY use the local currency
- If the agent's question implies a Safety case, prioritise SOP-IRT-Safety
🔑
The most important part is the description. It's the trigger — Claude reads it to decide whether this Skill applies to the current task. A vague description ("look up SOPs") will under-activate. A specific description ("Use when an agent asks a policy, procedure, refund-threshold, or Safety-severity question while handling a live case…") activates reliably and at the right time.
How Skills load — Progressive Disclosure (official Anthropic architecture)
Skills don't load everything at once. They load in stages to optimise token usage:
  • Stage 1 — always loaded: name + description — Claude sees these in every conversation to decide relevance. Name max 64 chars. Description max 1,024 chars.
  • Stage 2 — loaded when relevant: The full Skill instructions — only when Claude decides the Skill applies. Keep under 5,000 tokens (~3,000–4,000 words).
  • Stage 3 — loaded during execution: Any attached scripts or resource files.
This is why a precise description is so critical — it's the gatekeeper that decides whether the rest of the Skill loads at all.
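
The staging can be pictured with a small Python sketch. This is an illustration of the idea only, not Anthropic's implementation: the `Skill` class and the `stage1_index` and `stage2_load` functions are hypothetical names, the second Skill is placeholder data, and the relevance decision (which Claude makes itself from the descriptions) is replaced here by passing the chosen name explicitly.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str          # Stage 1: always visible
    description: str   # Stage 1: always visible
    body: str          # Stage 2: loaded only when the Skill is judged relevant


def stage1_index(skills):
    """Return what is visible in every conversation: name + description only."""
    return [f"{s.name}: {s.description}" for s in skills]


def stage2_load(skills, chosen_name):
    """Return the full instructions of the one Skill that was selected."""
    for s in skills:
        if s.name == chosen_name:
            return s.body
    raise KeyError(chosen_name)


skills = [
    Skill(
        "sop-lookup-assistant",
        "Looks up the right SOP. Use when an agent asks a policy question.",
        "## Role\nYou are a Senior IRT Team Lead...",
    ),
    # Placeholder entry, present only to show that unselected bodies stay unloaded.
    Skill("example-skill", "Placeholder description.", "## Instructions\nPlaceholder body."),
]

index = stage1_index(skills)                          # cheap: two short lines
body = stage2_load(skills, "sop-lookup-assistant")    # paid for only when needed
```

The point of the sketch is the cost asymmetry: the index grows by one short line per Skill, while a body is read only for the Skill that was selected.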

📁 Skill File Structure

A skill is a folder, not just a file. SKILL.md is the only required file — everything else is optional but useful for complex skills.

your-skill-name/             ← folder name must match skill name (kebab-case)
├── SKILL.md                 # REQUIRED — instructions with YAML frontmatter
├── scripts/                 # Optional — Python/JS code Claude can run
│   └── validate.py
├── references/              # Optional — detailed docs loaded only when needed
│   └── escalation-rules.md  #   (keeps SKILL.md focused and lean)
└── assets/                  # Optional — templates, icons used in output
    └── report-template.md
🚫
No README.md inside the skill folder. This is a critical rule from Anthropic. All documentation goes in SKILL.md or the references/ folder. A README.md at the repo level (for GitHub) is fine — just not inside the skill folder itself.
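
The folder rules above are easy to check mechanically before sharing a Skill. The following is a minimal Python sketch, assuming a hypothetical helper named `check_skill_folder` and a deliberately naive scan for the `name:` line instead of a full YAML parser. It checks only the three rules stated in this section: SKILL.md exists, its name matches the folder name, and no README.md sits inside the folder.

```python
from pathlib import Path


def check_skill_folder(folder):
    """Return a list of problems found in a skill folder (empty list means OK)."""
    folder = Path(folder)
    problems = []

    skill_md = folder / "SKILL.md"
    if not skill_md.is_file():
        problems.append("SKILL.md is missing")
    else:
        # Naive frontmatter scan: take the first line that starts with "name:".
        declared = None
        for line in skill_md.read_text(encoding="utf-8").splitlines():
            if line.startswith("name:"):
                declared = line[len("name:"):].strip()
                break
        if declared != folder.name:
            problems.append(
                f"name in SKILL.md ({declared!r}) does not match folder ({folder.name!r})"
            )

    if (folder / "README.md").exists():
        problems.append("README.md found inside the skill folder")

    return problems
```

Calling `check_skill_folder("sop-lookup-assistant")` on a well-formed folder returns an empty list; anything else is a list of plain-language problems to fix.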

🧱 The 7 Building Blocks

| Block | What it is | Required? | Official constraints | Finance example |
|---|---|---|---|---|
| Name | Kebab-case identifier. Must match folder name. | ✅ Yes | Max 64 chars, lowercase + hyphens only. No spaces, capitals, or underscores. No "claude" or "anthropic". | case-summarizer (not "Case Summarizer Agent") |
| Description | Trigger, always loaded. Formula: [What it does] + [When to use it] + [Key capabilities] | ✅ Yes | Max 1,024 chars. No XML angle brackets < >. Include trigger phrases users would actually say. | "Processes AnyCompany Finance vendor invoices. Use when reviewing invoices, validating bills, or running AP exceptions. Extracts line items, verifies arithmetic, matches POs." |
| Persona | Who Claude becomes for this task | Recommended | | "Senior IRT Team Lead, methodical, flags every Safety risk" |
| Instructions | Step-by-step procedure. Loaded when relevant. Put critical steps at the top with ## Important headers. | ✅ Yes | Keep full Skill under 5,000 tokens. Be specific and actionable: not "validate data" but exactly how to validate. | Extract → verify → match → flag → report |
| Business Rules | Your domain thresholds and decision logic | Recommended | Move very detailed rules to references/ and link to them to keep SKILL.md lean. | Variance >2% = flag; no PO = manual approval |
| Output Format | Exact structure of what Claude produces | Recommended | | Per-invoice status block + batch summary |
| Guardrails | Hard limits: what Claude must never do | Recommended | | "Never approve, only flag. Approval is human." |

✅ Good vs ❌ Bad Descriptions — from the official guide

✅ Good — specific, actionable, trigger phrases
# Specific + trigger phrases
description: Processes AnyCompany Finance vendor
  invoices — extracts line items, verifies
  arithmetic, matches POs, flags exceptions.
  Use when reviewing invoices, validating
  vendor bills, or running AP exceptions.

# Includes what AND when AND how
description: Answers SOP and KB questions
  with citation. Use when the agent asks
  a policy or procedure question. Use when
  user mentions "merchant risk", "credit
  review", or "PayLater limit".
❌ Bad — vague, no triggers, too technical
# Too vague — won't trigger reliably
description: Helps with projects.

# No trigger phrases
description: Creates sophisticated multi-page
  documentation systems.

# Too technical, no user triggers
description: Implements the Invoice entity
  model with hierarchical PO relationships
  and variance calculation logic.
🔒
Security restrictions (from official guide): The description appears in Claude's system prompt, so malicious content could inject instructions. Forbidden: XML angle brackets < > in frontmatter. Forbidden: skills with "claude" or "anthropic" in the name (reserved). Keep descriptions as plain text trigger sentences.
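
The name and description constraints quoted in this section can be checked before a Skill is saved. Below is a minimal Python sketch under stated assumptions: `validate_frontmatter` is a hypothetical helper, not part of any Anthropic tooling, and the pattern allows digits in names, which goes slightly beyond the "lowercase + hyphens" wording used here.

```python
import re

# Assumption: digits are allowed alongside lowercase letters and hyphens.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
RESERVED_WORDS = ("claude", "anthropic")


def validate_frontmatter(name, description):
    """Return a list of rule violations (empty list means both fields pass)."""
    errors = []
    if len(name) > 64:
        errors.append("name longer than 64 characters")
    if not NAME_PATTERN.match(name):
        errors.append("name must be lowercase kebab-case")
    if any(word in name.lower() for word in RESERVED_WORDS):
        errors.append("name contains a reserved word")
    if len(description) > 1024:
        errors.append("description longer than 1,024 characters")
    if "<" in description or ">" in description:
        errors.append("description contains XML angle brackets")
    return errors
```

For example, `validate_frontmatter("Case Summarizer Agent", "Use when ...")` reports the kebab-case violation, while `validate_frontmatter("case-summarizer", "Use when ...")` returns an empty list.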
🔗
Composability: Claude can load multiple skills simultaneously. Your skill should work well alongside others, so don't assume it's the only capability available. A sop-lookup-assistant skill and a case-summarizer skill can both be active in the same project without conflict.

Source: The Complete Guide to Building Skills for Claude (Anthropic, 2026) · Official Skills Documentation

🔄 The Skill Lifecycle

A Skill isn't a one-time build; it evolves through six stages:

✍️ Write the Skill → 💾 Save by Name → 🧪 Test on Real Data → 🔁 Iterate & Improve → 📅 Schedule to Automate → 🤝 Share with Team

✍️ Write the Skill: Start with your best prompt from Day 1. Add a name, description, persona, instructions, business rules, output format, and guardrails. Plain Markdown, no coding.

⏰ When to Update a Skill

The lifecycle doesn't end at "Save." The most valuable Skills are the ones that improve over time. Here's what triggers a Skill update:

🐛

Output missed an edge case

Claude didn't flag something it should have. Add the rule to the Business Rules section.

🎯

Format wasn't quite right

CFO wanted the recommendation upfront but got it last. Update the Output Format section.

📋

New policy or regulation

GST rate changed, new approval threshold, new regulatory requirement. Update Business Rules.

Found a better technique

You discovered Chain-of-Thought improves reasoning quality. Upgrade the Instructions section.

⚠️
The "set and forget" trap: The biggest mistake with Skills is treating them as done after the first save. A Skill that hasn't been updated in 3 months probably has stale thresholds, outdated formats, or missing edge cases. Build a habit: after every 5 uses, ask "what would make this better?"

✅ Best Practices — Writing Skills That Work

These are the patterns that separate a Skill that gets used every day from one that gets abandoned after two tries. Each one is drawn from common failure modes seen in the wild.

🔍 Diagnosing Trigger Problems

The most common Skill problem is the wrong activation pattern — either the Skill never loads when it should, or it loads when it shouldn't. Here's how to diagnose and fix each:

Undertriggering (Skill never loads)
  • Signals: Skill doesn't activate when it should; you have to explicitly say "use the invoice skill".
  • Cause: Description too vague or missing trigger phrases users would actually say.
  • Fix: Add more specific trigger phrases: "Use when user mentions 'invoice', 'AP review', 'vendor bill', or 'exception report'." Test by asking Claude: "When would you use the case-summarizer skill?"

Overtriggering (Skill loads too often)
  • Signals: Skill activates for unrelated queries; Claude tries to process invoices when you asked something else.
  • Cause: Description too broad or uses generic terms that match many topics.
  • Fix: Add negative triggers: "Use specifically for vendor invoice review, NOT for general finance questions or budgets." Be more specific about scope.

Instructions not followed (Loads but misbehaves)
  • Signals: Skill activates but Claude doesn't follow the steps correctly or skips sections.
  • Cause: Instructions too verbose, buried, or ambiguous. "Validate data before proceeding" is not actionable.
  • Fix: Put critical steps at the top with ## Important headers. Be specific: "Run scripts/validate.py. If it fails, list the exact errors." Consider moving detailed docs to the references/ folder.
🐛
Debugging technique (from official guide): Ask Claude directly — "When would you use the case-summarizer skill?" Claude will quote back the description. If the trigger conditions it describes don't match your intent, you've found your problem. Adjust the description based on what's missing.

📏 Skill Length: Official Limits & Practical Guidance

| Part | Official limit | Practical guidance |
|---|---|---|
| Name | 64 characters | Use kebab-case: case-summarizer, not "Case Summarizer Agent Skill" |
| Description | 1,024 characters | ~150–200 words. Start with "Use when…" and be specific. This is always loaded. |
| Full Skill instructions | Recommended under 5,000 tokens | ~3,000–4,000 words. If you're over this, the Skill is doing too many jobs; split it. |

| Skill type | Typical token count | What drives the length | Example |
|---|---|---|---|
| Simple | ~200–500 tokens | Persona + format only | Email summariser, meeting notes formatter |
| Standard | ~500–1,500 tokens | + Business rules and output sections | Invoice processor, risk assessment |
| Complex | ~1,500–4,000 tokens | + Multiple decision branches, regulatory detail | Credit narrative, compliance audit report |
| Too large | 5,000+ tokens | Skill is doing too many jobs | Split into 2–3 focused Skills instead |
📐
Length = scope. If your Skill keeps growing, it's a signal it's trying to do too much. Split it — an invoice-extractor Skill and an invoice-validator Skill are more reliable and easier to improve than one giant invoice-everything Skill. For detailed rules and reference docs: put them in a references/ subfolder (e.g. references/escalation-rules.md) and link to them from SKILL.md: "Before processing, consult references/escalation-rules.md for threshold details." This keeps SKILL.md lean while still making the detail available.

Source: Anthropic Skills Best Practices
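
For a quick self-check against the length bands above, a rough estimate is usually enough. The sketch below is an assumption-laden approximation, not a real tokenizer: `TOKENS_PER_WORD` is a guessed ratio derived from the "5,000 tokens ≈ 3,000–4,000 words" guidance, both function names are hypothetical, and the 4,000–5,000 token gap in the table is treated as still "complex".

```python
# Assumption: rough ratio implied by "5,000 tokens is about 3,000-4,000 words".
TOKENS_PER_WORD = 1.4


def estimate_tokens(text):
    """Very rough token estimate from a whitespace word count."""
    return round(len(text.split()) * TOKENS_PER_WORD)


def size_band(tokens):
    """Map a token count to the bands used in the length table."""
    if tokens >= 5000:
        return "too large: split into focused Skills"
    if tokens > 1500:
        return "complex"
    if tokens > 500:
        return "standard"
    return "simple"
```

A typical use is `size_band(estimate_tokens(skill_text))`, where `skill_text` is the full contents of a SKILL.md file read as a string. For a precise count, use the token counter of whichever model you deploy on.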

🏦 Three Finance Skills — Annotated

Real Skill files for three common AnyCompany Finance workflows. Notice how each follows the same structure but applies it differently to the specific domain.

Used in the Day 2 hands-on exercise. Reads a Case ID, fetches D365 + booking + history, drafts the structured stakeholder summary. Notice how the escalation rules encode specific SGD thresholds — this is your domain expertise locked into the Skill.

---
name: case-summarizer
description: Use when processing vendor invoices — extracting line items,
  verifying arithmetic, matching against POs, and flagging exceptions
  for the IRT team review queue.
---

## Role
You are a Senior AP Analyst at AnyCompany Support · GS Cyborg Edition.
You are precise and conservative — flag anything uncertain rather
than making assumptions.

## Instructions
1. Extract: vendor, invoice number, date, due date, all line items
   (description, qty, unit price, amount), subtotal, GST %, total due
2. Verify arithmetic: sum (qty × unit price) for all lines;
   compare to printed subtotal; note any discrepancy
3. Match to purchase_orders.csv: find vendor PO, compare approved
   amount to invoice total
4. Calculate variance: (invoice − PO) / PO × 100%
5. Apply escalation rules
6. Output structured result

## Escalation Rules
- Variance > 2%:              AMOUNT MISMATCH
- No PO found:               NO PO — manual approval required
- Arithmetic error:          ARITHMETIC ERROR — cite expected vs printed
- GST ≠ 9% for SG vendor:   GST DISCREPANCY
- Total > SGD 25,000:        HIGH VALUE — escalate to Head of Finance CoE

## Output
**[INV NUMBER] — [VENDOR]**
Status: ✅ PASS / ⚠️ EXCEPTION / ❌ FAIL
Total: SGD [x]  |  PO: [PO-xxx]  |  Variance: [x]%
Flags: [list or "None"]  |  Action: [next step or "Approve for payment"]

---
BATCH SUMMARY: [n] invoices | [n] pass | [n] exceptions | SGD [total] flagged

## Guardrails
- Never approve — flag or pass only. Approval is a human decision.
- If PO match is ambiguous, flag NO PO; do not guess
- All amounts in SGD; flag CURRENCY MISMATCH if another currency detected
- Do not include vendor bank details in output
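
The arithmetic check, variance formula, and escalation thresholds in this Skill can be worked through in a short Python sketch. It is an illustration under stated assumptions, not part of the Skill: `check_invoice` and its parameters are hypothetical, variance is compared by absolute value (the Skill only says "Variance > 2%"), and amounts are `Decimal` values in SGD.

```python
from decimal import Decimal


def check_invoice(lines, printed_subtotal, gst_pct, total_due, po_amount, sg_vendor=True):
    """Apply the Skill's checks to one invoice.

    lines: list of (qty, unit_price) pairs; po_amount: approved PO amount, or None.
    Returns (variance_pct or None, list of flags).
    """
    flags = []

    # Step 2: sum (qty x unit price) and compare to the printed subtotal.
    expected = sum((qty * unit_price for qty, unit_price in lines), Decimal("0"))
    if expected != printed_subtotal:
        flags.append(f"ARITHMETIC ERROR: expected {expected}, printed {printed_subtotal}")

    # Steps 3-4: variance = (invoice - PO) / PO x 100%.
    variance_pct = None
    if po_amount is None:
        flags.append("NO PO")
    else:
        variance_pct = (total_due - po_amount) / po_amount * 100
        if abs(variance_pct) > 2:  # assumption: under-billing is flagged too
            flags.append("AMOUNT MISMATCH")

    if sg_vendor and gst_pct != 9:
        flags.append("GST DISCREPANCY")
    if total_due > 25000:
        flags.append("HIGH VALUE")

    return variance_pct, flags


lines = [(2, Decimal("5000.00")), (1, Decimal("10000.00"))]
variance, flags = check_invoice(
    lines, Decimal("20000.00"), 9, Decimal("22890.00"), Decimal("21800.00")
)
```

In this example the invoice total of SGD 22,890 against a PO of SGD 21,800 gives a variance of 5%, so the only flag raised is AMOUNT MISMATCH. As in the Skill, the function only flags; approval stays a human decision.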

The SOP lookup Skill built from the Day 1 Executive Decision Brief template. See how the prompt template becomes a permanent Skill with activation logic, business rules, and explicit guardrails.

---
name: sop-lookup-assistant
description: Use when assessing merchant credit risk, reviewing PayLater
  applications or limit changes, or producing a risk brief for the
  credit committee or CFO.
---

## Role
You are a VP of Financial Risk at AnyCompany with 10 years of SEA
merchant credit experience. Data-driven and balanced — protect the
company while supporting legitimate merchant growth.

## Reasoning Approach
Before recommending, think through:
1. Factors supporting MORE severe action (Suspend/Investigate) — with data citations
2. Factors supporting LESS severe action (Approve/Restrict) — with data citations
3. Weigh: which set is stronger? Cost of being wrong in each direction?
4. THEN state recommendation

## Output: Executive Decision Brief
1. RECOMMENDATION: [Approve/Restrict/Suspend/Investigate] — one sentence + justification
2. SITUATION: 3 sentences max
3. REASONING: weighing above, 4–5 bullets
4. FINANCIAL EXPOSURE: current SGD + projected under each option
5. KEY EVIDENCE: 3–4 bullets with specific numbers
6. WHAT WOULD CHANGE MY MIND
7. CONDITIONS & NEXT STEPS: actions, review date, escalation trigger
Under 500 words total.

## Guardrails
- Every factual claim must cite a specific number from the provided data
- Do not assume facts not in the assessment
- Distinguish "concerning pattern" from "confirmed fraud"
- Flag missing data as [NEED: specific data] — do not guess
- Amounts in SGD only

For compliance and process owner teams. When a new regulatory circular arrives (MAS, BNM, OJK), this Skill produces a structured impact assessment covering affected processes, markets, and required actions.

---
name: regulatory-impact-assessment
description: Use when a new regulatory circular, guideline, or update arrives
  and you need to assess impact on AnyCompany Finance operations
  across one or more SEA markets.
---

## Role
You are a Senior Compliance Analyst at AnyCompany Support · GS Cyborg Edition.
You read regulatory documents carefully, assess impact conservatively
(assume more impact until proven otherwise), and produce actionable
output — not summaries.

## Instructions
1. Identify: regulator, regulation name, effective date, affected markets
2. List affected AnyCompany functions (PTP, RTR, credit, PayLater, merchant ops)
3. For each function: What must change? By when? Who owns it?
4. Rate impact: HIGH / MEDIUM / LOW with one-line justification
5. Flag cross-market divergence across SG / MY / ID / TH / VN / PH
6. Produce the structured output

## Output
REGULATORY IMPACT ASSESSMENT
Regulation: [name]  |  Regulator: [name]  |  Effective: [date]  |  Markets: [list]

IMPACT SUMMARY: [HIGH / MEDIUM / LOW] — [one sentence why]

AFFECTED FUNCTIONS:
  Function         | Impact | What changes       | Owner    | Deadline
  -----------------+--------+--------------------+----------+---------
  [e.g. PTP]       | HIGH   | [what must change] | [name]   | [date]

CROSS-MARKET DIVERGENCE: [list differences, or "None identified"]

IMMEDIATE ACTIONS (this week): [bullet list]
OPEN QUESTIONS: [what needs legal/compliance clarification]

## Guardrails
- Base assessment only on the provided regulatory text — do not infer
- If scope is ambiguous, list both interpretations and flag for legal review
- Do not recommend specific legal actions — flag for legal team
- If effective date has passed, note prominently as PAST DUE

Pattern 5 from the official guide: Domain-specific intelligence. This pattern embeds compliance logic directly into the Skill so Claude checks regulatory requirements before acting — not after. Directly relevant for PayLater, cross-border payments, and transaction processing at AnyCompany Finance.

---
name: payment-compliance-check
description: Checks payment transactions against AnyCompany
  compliance rules before processing approval. Use when
  reviewing payment requests, approving transactions, or
  assessing PayLater disbursements for compliance. Covers
  sanctions, jurisdiction, and fraud risk checks.
---

## Role
You are a Senior Compliance Analyst at AnyCompany Support · GS Cyborg Edition.
Compliance must be verified BEFORE any payment proceeds.
When in doubt, flag for review — never assume compliance.

## Before Processing — Compliance Check
For every payment request, complete ALL checks before any action:

1. Fetch transaction details from the request
2. Apply compliance rules in sequence:
   a. SANCTIONS: Is the payee on any active sanctions list?
   b. JURISDICTION: Is the transaction permitted between these markets?
   c. RISK: What is the fraud risk level (LOW / MEDIUM / HIGH)?
   d. DOCUMENTATION: Are all required fields present and verified?
3. Document your compliance decision for each check

## Decision Logic
IF all checks PASS:
  - Route to payment processing
  - Log: compliance passed, timestamp, checks completed

IF any check FAILS or is UNCERTAIN:
  - Flag for human review — DO NOT proceed
  - Create compliance case with: transaction ID, failed check,
    reason, recommended action
  - Notify compliance team

## Audit Trail
For every transaction, produce:
  - Compliance decision: PASS / FAIL / PENDING REVIEW
  - Checks completed: [list each check and result]
  - Rationale: one sentence per check
  - Action taken: [approved / flagged / escalated]

## Guardrails
- Compliance before action — ALWAYS run checks first
- Any UNCERTAIN result = flag for human review (not auto-approve)
- Every decision must have an audit trail entry
- Do not store or output full card numbers, bank account details, or SSNs
💡
Why this pattern matters for finance: Embedding the compliance sequence into the Skill means Claude is instructed to run every check, even if you forget to ask. The Skill carries the process with it, so compliance is baked in, not bolted on.
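
The Decision Logic section of this Skill reduces to a small rule: proceed only when every check passes. A minimal Python sketch follows, with several assumptions labelled: `compliance_decision` is a hypothetical function, each check (including the risk level) is assumed to have been reduced to PASS, FAIL, or UNCERTAIN beforehand, and mapping "any FAIL" to FAIL and "only UNCERTAIN" to PENDING REVIEW is one reasonable reading of the Audit Trail labels.

```python
CHECKS = ("SANCTIONS", "JURISDICTION", "RISK", "DOCUMENTATION")


def compliance_decision(results):
    """Map per-check outcomes to a decision and an audit-trail entry.

    results: dict of check name -> "PASS", "FAIL", or "UNCERTAIN".
    A check missing from results is treated as UNCERTAIN, never as a pass.
    """
    checks = [(name, results.get(name, "UNCERTAIN")) for name in CHECKS]
    outcomes = [outcome for _, outcome in checks]

    if all(outcome == "PASS" for outcome in outcomes):
        decision, action = "PASS", "approved"
    elif "FAIL" in outcomes:
        decision, action = "FAIL", "flagged"
    else:
        decision, action = "PENDING REVIEW", "flagged"

    return {"decision": decision, "action": action, "checks": checks}
```

Treating a missing check as UNCERTAIN mirrors the guardrail that any uncertain result goes to human review. If a rule must hold without exception, a deterministic function like this (for example in the Skill's scripts/ folder) gives a harder guarantee than instructions alone.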

🔎 Same Task — Prompt vs Skill

Here's the same case summarisation task handled two ways. The prompt is what most people do today. The Skill is what replaces it.

❌ Ad-hoc Prompt (what most people do)

Please check this invoice. The vendor is PT Mitra Teknologi. Can you verify the amounts are correct and check against our PO? Also flag anything suspicious. We usually use SGD and our PO was for about 21,800. Let me know if anything looks off. [PASTE INVOICE TEXT]
  • ❌ Different every time — quality varies with mood
  • ❌ "About 21,800" — imprecise threshold
  • ❌ No output format — gets a paragraph, not a status block
  • ❌ No guardrails — might recommend approval
  • ❌ Lost when the conversation ends
  • ❌ You retype this (or copy-paste) every time

✅ Saved Skill (what you build today)

Skill: case-summarizer
Activates automatically when you paste a Case ID.

# You just say:
"Process this invoice."

# Claude applies the full Skill:
persona + extraction + arithmetic check
+ PO match + SGD 25K escalation rule
+ structured output + guardrails
  • ✅ Consistent every time — same best version
  • ✅ Exact thresholds — SGD 25,000 and 2% variance
  • ✅ Defined output — status block + batch summary
  • ✅ Hard guardrails — never approves
  • ✅ Permanent — works in every session forever
  • ✅ You say "process this invoice" — done
🎯
The bottom line: The prompt is what you know today. The Skill is what your whole team knows tomorrow — and next month, and next year. The effort is in the first write. Everything after that is free.

🚀 Your Day 2 Deliverables

By the end of this afternoon you'll have built all three layers of a working agent:

| Exercise Step | What you build | Cowork layer |
|---|---|---|
| Step 1 | Create your Cowork project, connect the SOP corpus folder | Project workspace |
| Step 2 | Write your Project Instructions (currency, PII, escalation rules) | Project Instructions |
| Step 3 | Save your Day 1 template as the case-summarizer Skill | Skill |
| Step 4 | Test on case BK-2026-0006, find the hallucinated SOP citation, update the Skill | Skill iteration |
| Step 5 | Schedule: "every Monday 8am, run a KB-gap report on last week's cases" | Scheduled Task |
🏆
What you take home: A working case-summarizer Skill saved in your Cowork account, a Scheduled Task running every Monday, and the pattern to build any Skill for any workflow your team runs.