A Personal Lab for Agentic Systems

March 17, 2026

Why I'm Writing This

I've been tinkering with agentic AI systems for a while now: multi-agent architectures, tool calling, governance frameworks, observability, and so on. It's a lot of ground to cover, but because I have broad experience in many areas, I'm fairly comfortable with most of the terminology and with what I call the "whys". By that I mean I'm familiar with most of the reasons you need a given architecture, tool, or design, because I've had a lot of exposure to many of those systems, whether visual design or technical.

I decided to start documenting the journey. Not because I've figured anything out, but because I haven't. And I think there's value in showing the thoughts and the mistakes: the wrong turns, the over-engineering (Rube Goldberging), the "man, I bit off more than I can chew" situations.

This blog series follows a fictional e-commerce project that serves as my personal learning platform for enterprise AI concepts. Much of it will be built around data governance and transparency.


What This Is (and Isn't)

This is:

  • A personal lab for exploring agentic enterprise patterns
  • An attempt to organize my own thinking
  • A way to share the thought process with peers who might find it interesting
  • Exploratory analysis, not production guidance

This is not:

  • A recommended architecture
  • A best-practices guide
  • Something you should copy
  • A finished product
  • A tutorial to follow. I can't emphasize this enough: I'm just telling a story. This is not, and will never become, a "How To"

I'm an infrastructure engineer by trade. I work with Elasticsearch, Cloud, Terraform, and Ansible Automation Platform. I'm not an ML researcher. I don't have a PhD in reinforcement learning. I do, though, have a good deal of exposure to standard ML models and model choices for various use cases, mostly via the Elasticsearch machine learning tools.


The Iceberg Metaphor

I think of this project as an iceberg: a very simple, perhaps even charming, frontend on the surface, with the large-scale technical backend of what an agentic enterprise might look like beneath.

On the surface, it appears to be a basic store. You browse products, add items to a cart, check out. Nothing special.

But underneath, I'm building the infrastructure you'd need for a real enterprise AI system:

  • Multiple specialized agents with distinct roles
  • A governance layer for tracking what each agent does and why
  • Observability pipelines for understanding agent behavior
  • Reinforcement learning loops (or at least, my attempt at them)
  • Cost attribution tied to identities
  • Prompt security controls

The store is the excuse. The infrastructure is the point.
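
To make the governance and cost-attribution items in the list above a little more concrete, here is a minimal sketch in Python. It is an illustration only, not the project's actual design: the names `AgentAction` and `GovernanceLog`, the field names, and the agent ids are all hypothetical, and a real system would persist records somewhere durable rather than in a list.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AgentAction:
    """One audit record: who acted, what they did, why, and what it cost."""
    agent_id: str   # hypothetical identity the cost is attributed to
    action: str     # what the agent did
    reason: str     # why it did it
    cost_usd: float
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class GovernanceLog:
    """In-memory log of agent actions (an assumption; not a real store)."""

    def __init__(self):
        self._entries = []

    def record(self, agent_id, action, reason, cost_usd=0.0):
        # Append an immutable record so history cannot be edited in place.
        entry = AgentAction(agent_id, action, reason, cost_usd)
        self._entries.append(entry)
        return entry

    def cost_by_agent(self):
        # Sum recorded cost per agent identity.
        totals = defaultdict(float)
        for entry in self._entries:
            totals[entry.agent_id] += entry.cost_usd
        return dict(totals)


log = GovernanceLog()
log.record("agent-a", "recommend_product", "customer viewed a product", 0.25)
log.record("agent-a", "answer_question", "customer asked about shipping", 0.5)
log.record("agent-b", "draft_email", "weekly campaign", 0.125)
print(log.cost_by_agent())
```

Keeping the "why" as a required field next to the "what" is the design choice being illustrated: an action without a recorded reason cannot be logged at all.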


Current State

The project is early. Really early. Here's what exists:

Architecture documentation — I've sketched out a 16-agent roster across Commerce, Product, Marketing, Intelligence, Infrastructure, and Engineering domains. Each agent has a botanical/hippie soap naming theme because... why not.

Infrastructure scaffolding — Terraform stacks for Cloud foundation resources. Not deployed yet. Still working through the separation between infrastructure-as-code and application code.

A learning framework — I've mapped different ML paradigms to different agents. Reinforcement learning for some, RLHF pipelines for others, adversarial multi-agent setups, contextual bandits. Whether I can actually implement any of this correctly is a different question.

A Godot 4 game — There's a cozy isometric visualization layer planned. The idea is that you can watch the agents work by observing a little soap shop. This is probably the most over-engineered and difficult part, and I'm aware of that. Hopefully Wyatt will help.


Why Transparency Matters

Every post in this series uses explicit attribution markers to show what I wrote, what Lennie-AI generated, and what emerged from collaboration.

A note on that name: I use "Lennie-AI" as a generic stand-in for whatever model I'm working with. It's a nod to Steinbeck's Of Mice and Men — a fitting reference for the relationship between human direction and AI capability. I'm not trying to advocate for any particular vendor. When model specifics matter (like comparing APIs or discussing particular features), I'll name them directly. Otherwise, it's just Lennie.

I mark attribution this way for three reasons:

  1. I want to remember what I actually understood — Months from now, I need to know which parts reflect my genuine thinking versus which parts I nodded along to without fully grasping.

  2. Readers deserve honesty — If you're learning from this, you should know when you're reading my experience versus a model's synthesis of patterns from training data.

  3. It keeps me accountable — If I can't write the introduction myself, maybe I don't understand the project well enough yet.

You'll see <div class="authored-by-human">, <div class="authored-by-model">, and <div class="co-authored"> wrappers throughout. They're not decoration. They're the point.
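
As a rough illustration of how those wrappers could be put to work, here is a minimal sketch that tallies how many words of a post sit inside each kind of attribution wrapper. It uses only Python's standard-library `html.parser`. The class names come from the wrappers above, but the `AttributionCounter` class and the `attribution_word_counts` function are hypothetical helpers, not part of the blog's actual tooling.

```python
from collections import Counter
from html.parser import HTMLParser

ATTRIBUTION_CLASSES = {"authored-by-human", "authored-by-model", "co-authored"}


class AttributionCounter(HTMLParser):
    """Counts words of text inside each attribution wrapper div."""

    def __init__(self):
        super().__init__()
        self.words = Counter()
        # One entry per open <div>: its attribution class, or None.
        self._stack = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        label = next((c for c in classes if c in ATTRIBUTION_CLASSES), None)
        self._stack.append(label)

    def handle_endtag(self, tag):
        if tag == "div" and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute text to the innermost enclosing attribution wrapper.
        label = next((l for l in reversed(self._stack) if l), None)
        if label:
            self.words[label] += len(data.split())


def attribution_word_counts(html):
    parser = AttributionCounter()
    parser.feed(html)
    parser.close()
    return dict(parser.words)


sample = (
    '<div class="authored-by-human"><p>I wrote this part.</p></div>'
    '<div class="authored-by-model">Lennie wrote this.</div>'
    "<p>untagged text</p>"
)
print(attribution_word_counts(sample))
```

Text outside any wrapper is ignored, so a count like this would also reveal how much of a post carries no attribution at all.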


What's Next

The immediate focus is getting the Cloud foundation stack deployed — the boring stuff that makes everything else possible. KMS keys, DynamoDB tables, IAM roles.

After that, I want to onboard the first real customer-facing agent and see if any of my theoretical architecture actually holds up when code starts running.

I'll document it here. The wins, the failures, the "why did I think that would work" moments.

If you're building something similar, or just curious about the mess behind enterprise AI systems, maybe you'll find something useful. Or at least entertaining.


A Note on Humility

I want to be clear: I am not an expert in this space. I am learning, just like everyone else.

The patterns I'm exploring come from reading papers, blog posts, and documentation — not from building production agentic systems at scale. I'm extrapolating from my infrastructure background and hoping the intuitions transfer.

They might not.

If you're an expert and you see something obviously wrong, please tell me. That's half the reason I'm writing in public.

One thing you won't find here is the kind of content you get from YouTube channels that encourage developers to chase shiny objects, or "Why I ditched X for Y" posts. I will ditch X for Y, and you can infer that from reading, but I'm not going to advocate for any one pattern, model, vendor, or whatever.


This is post #1 in the "Building an Agentic Enterprise" series.