Day 2 — Agentic AI · Claude Cowork
Turn your best Day 1 prompts (Case Summarizer, SOP Lookup, DSAT write-up) into permanent, reusable Skills that activate automatically — and learn how to write Skills that survive QA review every time.
A Skill is a saved prompt template that Claude activates automatically when the task matches. You write it once in plain language, save it by name, and it works forever — in every conversation, for every teammate who uses the same project.
Instead of retyping the same complex prompt every session, you save it as a Skill. It's always ready.
Claude recognises matching tasks and applies your Skill without you asking. No keyword typing required.
You update the Skill when you find a better approach. Every improvement benefits the whole team instantly.
A Skill saves your domain expertise in a form that a less experienced colleague can invoke immediately.
| Aspect | Ad-hoc Prompt | Saved Skill |
|---|---|---|
| Lifespan | Dies when the conversation ends | Permanent — persists across all sessions |
| Reuse | You retype (or copy-paste) every time | Activates automatically — zero effort |
| Quality | Varies — depends on how you feel that day | Consistent — same best-version every time |
| Team value | Locked in your head or a personal doc | Shared across your project and team |
| Improvement | Each iteration is lost | Update once — everyone benefits immediately |
| Best for | Quick one-off questions | Any task you do more than once |
A Skill is a Markdown file with a name, a description (which tells Claude when to activate it), and a body (which tells Claude what to do). Here's a fully annotated example — the sop-lookup-assistant Skill you'll build in the afternoon exercise.
# ─────────────────────────────────────────────────────────────
# SKILL: sop-lookup-assistant
# ─────────────────────────────────────────────────────────────

# 1. NAME - short, kebab-case, memorable
name: sop-lookup-assistant

# 2. DESCRIPTION - official formula: [What it does] + [When to use it] + [Key capabilities]
#    Under 1,024 chars. No XML tags. No "claude" or "anthropic" in name.
description: Looks up the right SOP for an agent's case-handling question - cites the article ID, applies market-specific thresholds, and flags ambiguity for TL escalation. Use when an agent asks a policy, procedure, refund-threshold, or Safety-severity question while handling a live case.
---

# 3. PERSONA - who Claude becomes for this task
## Role
You are a Senior IRT Team Lead and SOP Lookup Assistant at AnyCompany Support · GS Cyborg Edition. You are precise and conservative - never invent an SOP, always cite the article ID, escalate ambiguity rather than guess.

# 4. TASK - what to do, in what order
## Instructions
For each agent question:
1. Read the question + case context (Case ID, market, channel, severity if known)
2. Search the connected SOP corpus for semantically relevant articles (e.g., "drunk driving" matches "DUI", "refund" matches "chargeback")
3. Look up applicable thresholds in threshold_table.csv (refund limits per market, SLA timers, severity rules)
4. Generate a cited answer: short answer + SOP article ID + suggested next agent action
5. Apply escalation rules (below)
6. Produce the structured output (below)

# 5. BUSINESS RULES - the domain expertise that makes this skill yours
## Escalation Rules
- Refund > SGD 200 (or local equiv): flag as HIGH VALUE - TL approval
- Explicit DUI / impairment keyword: flag as P1 SAFETY - auto-escalate
- Multiple SOPs match equally: flag as AMBIGUOUS - flag for TL
- No matching SOP found: respond "Not in current SOPs - escalate to TL"
- Pax is Grab VIP + DSAT history: flag as VIP CARE - special script

# 6. OUTPUT FORMAT - structured, consistent, scannable
## Output Format
For each question, return:
**[Case ID] - [Question summary]**
Answer: [short cited answer]
Source: [SOP article ID §X.X]
Next agent action: [specific step from the SOP]
Flags: [list any flags, or "None"]

End with a WEEKLY KB-GAP SUMMARY:
[n] questions · [n] cited cleanly · [n] gaps surfaced

# 7. GUARDRAILS - what the skill must never do
## Guardrails
- Never invent an SOP article ID - citation must be real and verifiable
- Redact PAX / DAX / MEX names, phone numbers, NRIC; use IDs only
- Apply the right market's threshold - never mix SG / MY / ID rules
- Default to SGD; for ID/TH/VN/PH/MY use the local currency
- If the agent's question implies a Safety case, prioritise SOP-IRT-Safety
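To make the name / description / body split concrete, here is a minimal Python sketch of how a SKILL.md file could be separated into its frontmatter fields and its body. It assumes the file opens with a `---`-fenced block of single-line `key: value` pairs; `parse_skill` and `SAMPLE` are hypothetical names used for illustration only, and this is not a full YAML parser or part of any Claude tooling.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (frontmatter fields, body).

    Assumption: frontmatter is a leading block fenced by '---' lines and
    holds only single-line 'key: value' pairs.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter fence")
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("frontmatter is missing its closing '---' fence") from None

    fields: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comment lines
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        fields[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


SAMPLE = """---
name: sop-lookup-assistant
description: Looks up the right SOP for an agent's case-handling question. Use when an agent asks a policy or procedure question.
---
## Role
You are a Senior IRT Team Lead and SOP Lookup Assistant.
"""

fields, body = parse_skill(SAMPLE)
print(fields["name"])        # sop-lookup-assistant
print(body.splitlines()[0])  # ## Role
```

The two frontmatter fields are the part that is always visible for deciding relevance; everything returned as `body` is the instruction text.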
name + description: Claude sees these in every conversation to decide relevance. Name max 64 chars. Description max 1,024 chars.

A skill is a folder, not just a file. SKILL.md is the only required file; everything else is optional but useful for complex skills.
your-skill-name/               ← folder name must match skill name (kebab-case)
├── SKILL.md                   # REQUIRED - instructions with YAML frontmatter
├── scripts/                   # Optional - Python/JS code Claude can run
│   └── validate.py
├── references/                # Optional - detailed docs loaded only when needed
│   └── escalation-rules.md    # (keeps SKILL.md focused and lean)
└── assets/                    # Optional - templates, icons used in output
    └── report-template.md
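The layout rules can be checked mechanically before a Skill is shared. The sketch below is one possible pre-flight check, assuming only the rules stated in this section (kebab-case folder name, SKILL.md required, no README.md inside the skill folder). `check_skill_folder` is a hypothetical helper, and the regular expression follows the "lowercase + hyphens only" wording literally.

```python
import re
import tempfile
from pathlib import Path

# Assumption: kebab-case means lowercase words joined by single hyphens.
KEBAB = re.compile(r"^[a-z]+(-[a-z]+)*$")


def check_skill_folder(folder: Path) -> list[str]:
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    if not KEBAB.match(folder.name):
        problems.append(f"folder name {folder.name!r} is not kebab-case")
    if not (folder / "SKILL.md").is_file():
        problems.append("SKILL.md is missing (it is the only required file)")
    if (folder / "README.md").exists():
        problems.append("README.md should not live inside the skill folder")
    return problems


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "sop-lookup-assistant"
    (good / "references").mkdir(parents=True)
    (good / "SKILL.md").write_text("---\nname: sop-lookup-assistant\n---\n")

    bad = Path(tmp) / "SOP_Lookup"
    bad.mkdir()
    (bad / "README.md").write_text("docs")

    good_problems = check_skill_folder(good)
    bad_problems = check_skill_folder(bad)

print(good_problems)      # []
print(len(bad_problems))  # 3
```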
Don't put a README.md inside the skill folder; keep documentation in SKILL.md or the references/ folder. A README.md at the repo level (for GitHub) is fine, just not inside the skill folder itself.

| Block | What it is | Required? | Official constraints | Example |
|---|---|---|---|---|
| Name | Kebab-case identifier. Must match folder name. | ✅ Yes | Max 64 chars, lowercase + hyphens only. No spaces, capitals, or underscores. No "claude" or "anthropic". | case-summarizer ✅ · Case Summarizer Agent ❌ |
| Description | Trigger — always loaded. Formula: [What it does] + [When to use it] + [Key capabilities] | ✅ Yes | Max 1,024 chars. No XML angle brackets < >. Include trigger phrases users would actually say. | "Processes AnyCompany Finance vendor invoices. Use when reviewing invoices, validating bills, or running AP exceptions. Extracts line items, verifies arithmetic, matches POs." |
| Persona | Who Claude becomes for this task | Recommended | — | "Senior IRT Team Lead, methodical, flags every Safety risk" |
| Instructions | Step-by-step procedure. Loaded when relevant. Put critical steps at the top with ## Important headers. | ✅ Yes | Keep full Skill under 5,000 tokens. Be specific and actionable — not "validate data" but exactly how to validate. | Extract → verify → match → flag → report |
| Business Rules | Your domain thresholds and decision logic | Recommended | Move very detailed rules to references/ and link to them to keep SKILL.md lean. | Variance >2% = flag; no PO = manual approval |
| Output Format | Exact structure of what Claude produces | Recommended | — | Per-invoice status block + batch summary |
| Guardrails | Hard limits — what Claude must never do | Recommended | — | "Never approve — only flag. Approval is human." |
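The Name and Description constraints in the table are simple enough to lint automatically. Here is a minimal sketch of such a check; `validate_frontmatter` is a hypothetical helper, and it encodes only the limits listed above (64 and 1,024 characters, lowercase plus hyphens, reserved words, no angle brackets).

```python
import re

RESERVED = ("claude", "anthropic")
# Follows the table literally: lowercase letters and hyphens only.
NAME_PATTERN = re.compile(r"^[a-z]+(-[a-z]+)*$")


def validate_frontmatter(name: str, description: str) -> list[str]:
    """Return a list of constraint violations for a Skill's name and description."""
    errors = []
    if len(name) > 64:
        errors.append("name exceeds 64 characters")
    if not NAME_PATTERN.match(name):
        errors.append("name must be lowercase words joined by hyphens")
    if any(word in name.lower() for word in RESERVED):
        errors.append('name must not contain "claude" or "anthropic"')
    if len(description) > 1024:
        errors.append("description exceeds 1,024 characters")
    if "<" in description or ">" in description:
        errors.append("description must not contain angle brackets")
    return errors


ok = validate_frontmatter(
    "case-summarizer",
    "Summarises a support case. Use when an agent asks for a case summary.",
)
bad_name = validate_frontmatter("Case Summarizer Agent", "Summarises a support case.")
bad_both = validate_frontmatter("claude-helper", "Uses <xml> tags.")

print(ok)        # []
print(bad_name)  # ['name must be lowercase words joined by hyphens']
print(bad_both)
```

A check like this only catches format violations; whether the description triggers at the right moments still has to be tested in conversation.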
# Specific + trigger phrases
description: Processes AnyCompany Finance vendor invoices - extracts line items, verifies arithmetic, matches POs, flags exceptions. Use when reviewing invoices, validating vendor bills, or running AP exceptions.

# Includes what AND when AND how
description: Answers SOP and KB questions with citation. Use when the agent asks for a decision. Use when user mentions "merchant risk", "credit review", or "PayLater limit".
# Too vague - won't trigger reliably
description: Helps with projects.

# No trigger phrases
description: Creates sophisticated multi-page documentation systems.

# Too technical, no user triggers
description: Implements the Invoice entity model with hierarchical PO relationships and variance calculation logic.
Do not use XML angle brackets < > in frontmatter. Forbidden: skills with "claude" or "anthropic" in the name (reserved). Keep descriptions as plain text trigger sentences.

Source: The Complete Guide to Building Skills for Claude (Anthropic, 2026) · Official Skills Documentation
A Skill isn't a one-time build; it evolves.
The lifecycle doesn't end at "Save." The most valuable Skills are the ones that improve over time. Here's what triggers a Skill update:
Claude didn't flag something it should have. Add the rule to the Business Rules section.
CFO wanted the recommendation upfront but got it last. Update the Output Format section.
GST rate changed, new approval threshold, new regulatory requirement. Update Business Rules.
You discovered Chain-of-Thought improves reasoning quality. Upgrade the Instructions section.
These are the patterns that separate a Skill that gets used every day from one that gets abandoned after two tries. Each one is drawn from common failure modes seen in the wild.
The most common Skill problem is the wrong activation pattern — either the Skill never loads when it should, or it loads when it shouldn't. Here's how to diagnose and fix each:
| Problem | Signals | Cause | Fix |
|---|---|---|---|
| Undertriggering (Skill never loads) | Skill doesn't activate when it should; you have to explicitly say "use the invoice skill" | Description too vague or missing trigger phrases users would actually say | Add more specific trigger phrases: "Use when user mentions 'invoice', 'AP review', 'vendor bill', or 'exception report'." Test by asking Claude: "When would you use the case-summarizer skill?" |
| Overtriggering (Skill loads too often) | Skill activates for unrelated queries; Claude tries to process invoices when you asked something else | Description too broad or uses generic terms that match many topics | Add negative triggers: "Use specifically for vendor invoice review, NOT for general finance questions or budgets." Be more specific about scope. |
| Instructions not followed (loads but misbehaves) | Skill activates but Claude doesn't follow the steps correctly or skips sections | Instructions too verbose, buried, or ambiguous. "Validate data before proceeding" is not actionable. | Put critical steps at the top with ## Important headers. Be specific: "Run scripts/validate.py - if it fails, list the exact errors." Consider moving detailed docs to references/ folder. |
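Before testing a description in Claude, it can help to write down the requests that should trigger the Skill and see which of them mention none of your trigger phrases. The sketch below is a plain keyword check under that assumption. It does not reproduce how Claude actually decides to load a Skill; `uncovered_queries` and the sample requests are hypothetical.

```python
def uncovered_queries(triggers: list[str], queries: list[str]) -> list[str]:
    """Return the queries that mention none of the trigger phrases (case-insensitive)."""
    lowered = [t.lower() for t in triggers]
    return [q for q in queries if not any(t in q.lower() for t in lowered)]


# Trigger phrases taken from the undertriggering fix in the table above.
triggers = ["invoice", "AP review", "vendor bill", "exception report"]

# Hypothetical requests that SHOULD activate the Skill.
should_trigger = [
    "Can you check this vendor bill from Acme?",
    "Run the AP review for March",
    "Something looks off with this supplier payment request",
]

missed = uncovered_queries(triggers, should_trigger)
print(missed)  # ['Something looks off with this supplier payment request']
```

Any request that lands in `missed` is a candidate for a new trigger phrase in the description. The same function run against requests that should not trigger the Skill gives a rough view of overtriggering risk.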
| Part | Official limit | Practical guidance |
|---|---|---|
| Name | 64 characters | Use kebab-case: case-summarizer, not "Case Summarizer Agent Skill" |
| Description | 1,024 characters | ~150–200 words. Start with "Use when…" — be specific. This is always loaded. |
| Full Skill instructions | Recommended under 5,000 tokens | ~3,000–4,000 words. If you're over this, the Skill is doing too many jobs — split it. |
| Skill type | Typical token count | What drives the length | Example |
|---|---|---|---|
| Simple | ~200–500 tokens | Persona + format only | Email summariser, meeting notes formatter |
| Standard | ~500–1,500 tokens | + Business rules and output sections | Invoice processor, risk assessment |
| Complex | ~1,500–4,000 tokens | + Multiple decision branches, regulatory detail | Credit narrative, compliance audit report |
| Too large | 5,000+ tokens | Skill is doing too many jobs | Split into 2–3 focused Skills instead |
An invoice-extractor Skill and an invoice-validator Skill are more reliable and easier to improve than one giant invoice-everything Skill. For detailed rules and reference docs: put them in a references/ subfolder (e.g. references/escalation-rules.md) and link to them from SKILL.md: "Before processing, consult references/escalation-rules.md for threshold details." This keeps SKILL.md lean while still making the detail available.

Source: Anthropic Skills Best Practices
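To see roughly where a draft Skill falls in the size table, a character-based estimate is usually enough. The sketch below assumes about 4 characters per token, which is a rough rule of thumb for English text and not an exact tokenizer count. The band boundaries are one reading of the table: the table leaves 4,000 to 5,000 tokens unassigned, and this sketch treats that range as Complex.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate. Assumption: about 4 characters per token."""
    return max(1, len(text) // 4)


def size_band(tokens: int) -> str:
    """Map a token count to the Skill size bands from the table."""
    if tokens >= 5000:
        return "Too large"
    if tokens >= 1500:
        return "Complex"
    if tokens >= 500:
        return "Standard"
    return "Simple"


draft = "x" * 2400  # stand-in for the text of a SKILL.md file
tokens = estimate_tokens(draft)
print(tokens, size_band(tokens))  # 600 Standard
```

If the result is "Too large", the guidance above applies: split the Skill or move detail into references/.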
Real Skill files for three common AnyCompany Finance workflows. Notice how each follows the same structure but applies it differently to the specific domain.
Used in the Day 2 hands-on exercise. Reads a Case ID, fetches D365 + booking + history, drafts the structured stakeholder summary. Notice how the escalation rules encode specific SGD thresholds — this is your domain expertise locked into the Skill.
name: case-summarizer
description: Use when processing vendor invoices - extracting line items, verifying arithmetic, matching against POs, and flagging exceptions for the IRT team review queue.
---

## Role
You are a Senior AP Analyst at AnyCompany Support · GS Cyborg Edition. You are precise and conservative - flag anything uncertain rather than making assumptions.

## Instructions
1. Extract: vendor, invoice number, date, due date, all line items (description, qty, unit price, amount), subtotal, GST %, total due
2. Verify arithmetic: sum (qty × unit price) for all lines; compare to printed subtotal; note any discrepancy
3. Match to purchase_orders.csv: find vendor PO, compare approved amount to invoice total
4. Calculate variance: (invoice − PO) / PO × 100%
5. Apply escalation rules
6. Output structured result

## Escalation Rules
- Variance > 2%: AMOUNT MISMATCH
- No PO found: NO PO - manual approval required
- Arithmetic error: ARITHMETIC ERROR - cite expected vs printed
- GST ≠ 9% for SG vendor: GST DISCREPANCY
- Total > SGD 25,000: HIGH VALUE - escalate to Head of Finance CoE

## Output
**[INV NUMBER] - [VENDOR]**
Status: ✅ PASS / ⚠️ EXCEPTION / ❌ FAIL
Total: SGD [x] | PO: [PO-xxx] | Variance: [x]%
Flags: [list or "None"] | Action: [next step or "Approve for payment"]
---
BATCH SUMMARY: [n] invoices | [n] pass | [n] exceptions | SGD [total] flagged

## Guardrails
- Never approve - flag or pass only. Approval is a human decision.
- If PO match is ambiguous, flag NO PO FOUND - do not guess
- All amounts in SGD; flag CURRENCY MISMATCH if another currency detected
- Do not include vendor bank details in output
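The variance formula and escalation rules in this Skill are plain arithmetic, so they can be worked through in code to check that the thresholds behave as intended. The sketch below is illustrative only: `invoice_flags` is a hypothetical function, the variance is compared by absolute value (the Skill does not say whether under-invoicing should also be flagged), and the half-cent rounding tolerance is an assumption.

```python
from typing import Optional


def invoice_flags(
    line_items: list[tuple[float, float]],  # (qty, unit price) per line
    printed_subtotal: float,
    invoice_total: float,
    po_amount: Optional[float],             # None when no PO was found
    gst_percent: float,
    sg_vendor: bool = True,
) -> list[str]:
    """Apply the Skill's escalation rules and return the flags raised."""
    flags = []

    # Arithmetic check: sum of qty x unit price vs the printed subtotal.
    expected = sum(qty * price for qty, price in line_items)
    if abs(expected - printed_subtotal) > 0.005:  # assumed rounding tolerance
        flags.append(
            f"ARITHMETIC ERROR: expected {expected:.2f}, printed {printed_subtotal:.2f}"
        )

    if po_amount is None:
        flags.append("NO PO")
    else:
        # Variance: (invoice - PO) / PO x 100%. Assumes the PO amount is non-zero.
        variance = (invoice_total - po_amount) / po_amount * 100
        if abs(variance) > 2:  # assumption: flag in both directions
            flags.append(f"AMOUNT MISMATCH: variance {variance:.1f}%")

    if sg_vendor and gst_percent != 9:
        flags.append("GST DISCREPANCY")
    if invoice_total > 25_000:
        flags.append("HIGH VALUE")
    return flags


# Invoice of SGD 1,308 against a PO of SGD 1,250: (1308 - 1250) / 1250 = 4.64%
flags_a = invoice_flags([(10, 100.0), (5, 40.0)], 1200.0, 1308.0, 1250.0, 9)
# No PO, 8% GST on an SG vendor, total above SGD 25,000
flags_b = invoice_flags([(2, 15000.0)], 30000.0, 32400.0, None, 8)

print(flags_a)  # ['AMOUNT MISMATCH: variance 4.6%']
print(flags_b)  # ['NO PO', 'GST DISCREPANCY', 'HIGH VALUE']
```

Working a couple of known invoices through the rules like this gives you expected answers to compare against when you test the Skill itself.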
The SOP lookup Skill built from the Day 1 Executive Decision Brief template. See how the prompt template becomes a permanent Skill with activation logic, business rules, and explicit guardrails.
name: sop-lookup-assistant
description: Use when assessing merchant credit risk, reviewing PayLater applications or limit changes, or producing a risk brief for the credit committee or CFO.
---

## Role
You are a VP of Financial Risk at AnyCompany with 10 years of SEA merchant credit experience. Data-driven and balanced - protect the company while supporting legitimate merchant growth.

## Reasoning Approach
Before recommending, think through:
1. Factors supporting MORE severe action (Suspend/Investigate) - with data citations
2. Factors supporting LESS severe action (Approve/Restrict) - with data citations
3. Weigh: which set is stronger? Cost of being wrong in each direction?
4. THEN state recommendation

## Output: Executive Decision Brief
1. RECOMMENDATION: [Approve/Restrict/Suspend/Investigate] - one sentence + justification
2. SITUATION: 3 sentences max
3. REASONING: weighing above, 4–5 bullets
4. FINANCIAL EXPOSURE: current SGD + projected under each option
5. KEY EVIDENCE: 3–4 bullets with specific numbers
6. WHAT WOULD CHANGE MY MIND
7. CONDITIONS & NEXT STEPS: actions, review date, escalation trigger
Under 500 words total.

## Guardrails
- Every factual claim must cite a specific number from the provided data
- Do not assume facts not in the assessment
- Distinguish "concerning pattern" from "confirmed fraud"
- Flag missing data as [NEED: specific data] - do not guess
- Amounts in SGD only
For compliance and process owner teams. When a new regulatory circular arrives (MAS, BNM, OJK), this Skill produces a structured impact assessment covering affected processes, markets, and required actions.
name: regulatory-impact-assessment
description: Use when a new regulatory circular, guideline, or update arrives and you need to assess impact on AnyCompany Finance operations across one or more SEA markets.
---

## Role
You are a Senior Compliance Analyst at AnyCompany Support · GS Cyborg Edition. You read regulatory documents carefully, assess impact conservatively (assume more impact until proven otherwise), and produce actionable output - not summaries.

## Instructions
1. Identify: regulator, regulation name, effective date, affected markets
2. List affected AnyCompany functions (PTP, RTR, credit, PayLater, merchant ops)
3. For each function: What must change? By when? Who owns it?
4. Rate impact: HIGH / MEDIUM / LOW with one-line justification
5. Flag cross-market divergence across SG / MY / ID / TH / VN / PH
6. Produce the structured output

## Output
REGULATORY IMPACT ASSESSMENT
Regulation: [name] | Regulator: [name] | Effective: [date] | Markets: [list]

IMPACT SUMMARY: [HIGH / MEDIUM / LOW] - [one sentence why]

AFFECTED FUNCTIONS:
Function         | Impact | What changes       | Owner    | Deadline
-----------------+--------+--------------------+----------+---------
[e.g. PTP]       | HIGH   | [what must change] | [name]   | [date]

CROSS-MARKET DIVERGENCE: [list differences, or "None identified"]
IMMEDIATE ACTIONS (this week): [bullet list]
OPEN QUESTIONS: [what needs legal/compliance clarification]

## Guardrails
- Base assessment only on the provided regulatory text - do not infer
- If scope is ambiguous, list both interpretations and flag for legal review
- Do not recommend specific legal actions - flag for legal team
- If effective date has passed, note prominently as PAST DUE
Pattern 5 from the official guide: Domain-specific intelligence. This pattern embeds compliance logic directly into the Skill so Claude checks regulatory requirements before acting — not after. Directly relevant for PayLater, cross-border payments, and transaction processing at AnyCompany Finance.
name: payment-compliance-check
description: Checks payment transactions against AnyCompany compliance rules before processing approval. Use when reviewing payment requests, approving transactions, or assessing PayLater disbursements for compliance. Covers sanctions, jurisdiction, and fraud risk checks.
---

## Role
You are a Senior Compliance Analyst at AnyCompany Support · GS Cyborg Edition. Compliance must be verified BEFORE any payment proceeds. When in doubt, flag for review - never assume compliance.

## Before Processing - Compliance Check
For every payment request, complete ALL checks before any action:
1. Fetch transaction details from the request
2. Apply compliance rules in sequence:
   a. SANCTIONS: Is the payee on any active sanctions list?
   b. JURISDICTION: Is the transaction permitted between these markets?
   c. RISK: What is the fraud risk level (LOW / MEDIUM / HIGH)?
   d. DOCUMENTATION: Are all required fields present and verified?
3. Document your compliance decision for each check

## Decision Logic
IF all checks PASS:
- Route to payment processing
- Log: compliance passed, timestamp, checks completed

IF any check FAILS or is UNCERTAIN:
- Flag for human review - DO NOT proceed
- Create compliance case with: transaction ID, failed check, reason, recommended action
- Notify compliance team

## Audit Trail
For every transaction, produce:
- Compliance decision: PASS / FAIL / PENDING REVIEW
- Checks completed: [list each check and result]
- Rationale: one sentence per check
- Action taken: [approved / flagged / escalated]

## Guardrails
- Compliance before action - ALWAYS run checks first
- Any UNCERTAIN result = flag for human review (not auto-approve)
- Every decision must have an audit trail entry
- Do not store or output full card numbers, bank account details, or SSNs
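The decision logic in this Skill is an all-or-nothing gate: a payment proceeds only if every check passes. A small sketch makes the edge cases explicit. It is written under stated assumptions: `Result`, `CHECKS`, and `decide` are hypothetical names, a check with no recorded result is treated as UNCERTAIN, an UNCERTAIN result maps to PENDING REVIEW, and the RISK check is assumed to have already been reduced to a pass, fail, or uncertain outcome.

```python
from enum import Enum


class Result(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNCERTAIN = "UNCERTAIN"


CHECKS = ("SANCTIONS", "JURISDICTION", "RISK", "DOCUMENTATION")


def decide(results: dict[str, Result]) -> dict:
    """Proceed only if every check passed; otherwise flag for human review."""
    # Assumption: a missing result counts as UNCERTAIN, never as a pass.
    complete = {name: results.get(name, Result.UNCERTAIN) for name in CHECKS}
    failed = [n for n, r in complete.items() if r is Result.FAIL]
    uncertain = [n for n, r in complete.items() if r is Result.UNCERTAIN]

    if failed:
        decision, action = "FAIL", "flagged"
    elif uncertain:
        decision, action = "PENDING REVIEW", "flagged"
    else:
        decision, action = "PASS", "approved"

    # Audit trail entry: decision, per-check results, and action taken.
    return {
        "decision": decision,
        "checks": {n: r.value for n, r in complete.items()},
        "action": action,
    }


all_pass = decide({name: Result.PASS for name in CHECKS})
one_missing = decide({"SANCTIONS": Result.PASS, "JURISDICTION": Result.PASS, "RISK": Result.PASS})
one_fail = decide({**{name: Result.PASS for name in CHECKS}, "SANCTIONS": Result.FAIL})

print(all_pass["decision"])     # PASS
print(one_missing["decision"])  # PENDING REVIEW
print(one_fail["decision"])     # FAIL
```

Defaulting a missing check to UNCERTAIN mirrors the guardrail that an uncertain result is flagged for human review and never auto-approved.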
Here's the same case summarisation task handled two ways. The prompt is what most people do today. The Skill is what replaces it.
By the end of this afternoon you'll have built all three layers of a working agent:
| Exercise Step | What you build | Cowork layer |
|---|---|---|
| Step 1 | Create your Cowork project, connect the SOP corpus folder | Project workspace |
| Step 2 | Write your Project Instructions (currency, PII, escalation rules) | Project Instructions |
| Step 3 | Save your Day 1 template as the case-summarizer Skill | Skill |
| Step 4 | Test on case BK-2026-0006 — find the hallucinated SOP citation — update the Skill | Skill iteration |
| Step 5 | Schedule: "every Monday 8am, run a KB-gap report on last week's cases" | Scheduled Task |
You'll leave with a case-summarizer Skill saved in your Cowork account, a Scheduled Task running every Monday, and the pattern to build any Skill for any workflow your team runs.