Verifiable Agents, Not Demos: A Problem-First Approach

Why I took this course

I wanted a systems lens on Artificial Intelligence (AI) agents: when to use them, how to scope autonomy, and how to operate them in production under real constraints (cost, latency, reliability, compliance).
I needed a repeatable method to go from a business problem → agent design → evaluation → rollout, not just a prompt that works on Friday and fails on Monday.
The weekly Chai & AI (tea-and-AI) sessions were the best hook: people from varied backgrounds brought real problems, unpacked buzzwords, and shared concrete patterns—immediately useful for day-to-day work.

“Ship agents like products, not prototypes.”

Course Certificate

How my approach changed

Before: I’d “vibe-code” a promising flow, tune prompts, and call it a day. After: I treat agents like products with SLOs (Service Level Objectives).

Evaluations first: define success upfront (small gold datasets, on-policy replays), then iterate.
Guardrails by design: PII (Personally Identifiable Information) boundaries, tool allowlists, budget/rate limits, safe fallbacks, and audit trails.
Operate, don’t demo: observability for latency, success rate, and cost-per-action; clear rollback paths.
Iteration with data: each change must move a metric, not just look smart.

What in the coursework made it click

The syllabus blended why with how, then forced hands-on practice.

Workflow agents & prompting (2025 view): planner–executor patterns, retries, idempotency, and skill-based prompting with APO (Automatic Prompt Optimization) treated as code with tests.
RAG (Retrieval-Augmented Generation) done right: hybrid retrieval, rerankers, source-grounded answers, and memory types (episodic/semantic/tool-state) with TTL (Time To Live) and privacy controls.
Planning & protocols: ReAct planning when uncertainty is high; standardizing tools via MCP (Model Context Protocol) so agents are portable across models.
Evaluation & guardrails: moving from offline benchmarks to on-policy evaluations; turning policy into code (allow/deny lists, thresholds, HITL (Human-in-the-Loop) gates).
Operating agents: observability, cost discipline, controlled rollouts, and clear failure taxonomies.

Homework pushed these ideas into muscle memory: build small gold datasets, write acceptance gates, and log every recommendation with sources and reason codes.

Capstone (1 week): STOCKWISE – AI Retail Intelligence

Agentic pricing & promotion planning for retailers

Problem. Pricing/replenishment across stores and e-commerce is slow and error-prone; demand shifts and competitor moves create stockouts and margin leaks.

Iteration 1 — Sense & Recommend (Workflow): Ingest sales, inventory, and competitor prices → compute price index, DOS (Days of Supply), and an elasticity proxy → return a price band + stock target. Guardrails: MAP (Minimum Advertised Price)/min-margin/max-daily-change; escalate on violation.

Iteration 2 — RAG-Enhanced Retail Assistance: Personalize recommendations with retailer history, promo calendars, market trends, and survey snippets via RAG (Retrieval-Augmented Generation); surface linked evidence. “What-if” price bands scored for revenue and stockout risk.

Iteration 3 — Autonomous Multi-Agent Autopilot: Watcher detects events → Simulator proposes actions (multi-objective) → Policy/Guard tags AUTO / APPROVE / BLOCK → Learner updates pricing; Actuator applies low-risk changes with canary + auto-rollback; HITL (Human-in-the-Loop) for high-risk moves; everything logged to an immutable audit trail.

Capstone Poster – STOCKWISE

Big thanks to Aishwarya Naresh Reganti and Kiriti Badam for a course that prioritizes evidence over hype and turns agent design into an operational discipline. The Chai & AI (tea-and-AI) sessions, in particular, brought diverse voices, clarified buzzwords, and delivered patterns I could apply the next day.

Why I took this course#

How my approach changed#

What in the coursework made it click#

Capstone (1 week): STOCKWISE – AI Retail Intelligence#

Why I took this course

How my approach changed

What in the coursework made it click

Capstone (1 week): STOCKWISE – AI Retail Intelligence