Services

I help engineering teams ship faster and sleep better. Most of my work falls into four buckets. If you’re not sure which one fits, get in touch - half my engagements start as a 30-minute conversation about what’s actually broken.

1. Production readiness for distributed systems

You have a service heading to production - or already there - and someone has to answer hard questions: What’s the error budget? What wakes someone up at 3 a.m.? What happens if a region fails? I help you answer them before a customer does.

What this looks like:

  • SLO/SLI definition tied to business outcomes (not vanity uptime numbers).
  • Failure-mode review: runbooks, incident response, rollback paths.
  • Load and chaos testing where it matters; no theatre.
  • On-call structure and alert hygiene - fewer pages, more signal.
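
To make "error budget" concrete, here is a minimal sketch of the arithmetic behind an availability SLO. It is an illustration, not a tool I ship: the function names, the 30-day window, and the request-based SLI are all assumptions chosen for the example, and your own SLIs may be defined differently.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of unavailability the SLO tolerates over the window.

    slo_target is a fraction, e.g. 0.999 for "three nines" (hypothetical input).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent; negative means overspent.

    Assumes a request-based SLI: good_events out of total_events succeeded.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 1.0  # no traffic, so nothing has been spent
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad
```

Under these assumptions, a 99.9% target over 30 days allows roughly 43.2 minutes of downtime, and 500 failed requests out of a million leaves about half the budget unspent.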

Good fit if: you’re about to launch, just had an incident you don’t want to repeat, or are scaling past the point where ad-hoc operations work.

2. Internal Developer Platforms (IDPs)

If your developers are spending half their week on infrastructure, ticketing, and YAML, you’re paying twice for the same work. A well-designed platform turns those daily frictions into self-service.

What this looks like:

  • Golden paths from git push to production - built on what your team already runs (Kubernetes, Terraform, GitOps), not a greenfield rewrite.
  • Service templates, paved-road CI/CD, and a developer portal (Backstage or similar) where it adds value.
  • Clear ownership boundaries between platform team and product teams.
  • Honest measurement: lead time for changes, deployment frequency, change failure rate.
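
As a rough sketch of what that measurement can look like, the snippet below derives the three numbers from a list of deploy records. The `Deploy` record, its field names, and the choice of median for lead time are assumptions made for illustration; in practice the data would come from your CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass(frozen=True)
class Deploy:
    """Hypothetical deploy record; field names are illustrative."""
    committed_at: datetime   # when the change was committed
    deployed_at: datetime    # when it reached production
    caused_failure: bool     # whether it led to an incident or rollback


def delivery_metrics(deploys: list[Deploy], window_days: int) -> dict[str, float]:
    """Summarise lead time, deployment frequency, and change failure rate."""
    if not deploys:
        raise ValueError("need at least one deploy")
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    lead_times_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deploys
    ]
    failures = sum(1 for d in deploys if d.caused_failure)
    return {
        "median_lead_time_hours": median(lead_times_hours),
        "deploys_per_day": len(deploys) / window_days,
        "change_failure_rate": failures / len(deploys),
    }
```

The median is used here because a single slow change can distort an average; that is a design choice for the sketch, not a rule.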

Good fit if: developer productivity has plateaued, infra is a bottleneck for product delivery, or you’re trying to scale an engineering org past ~20 people.

3. Cloud architecture (GCP primary, AWS)

Design and review of cloud infrastructure for distributed event-driven systems. Multi-region, multi-account, and infrastructure as code (IaC) first.

What this looks like:

  • Greenfield architecture: networking, identity, data layer, secrets, observability - designed together, not bolted on.
  • Existing-estate review: cost, security posture, blast-radius analysis. The output is a prioritized fix list, not a 60-page PowerPoint.
  • Migration planning: lift-and-shift vs. replatform vs. refactor - and the honest economics of each.

Good fit if: you’re picking a cloud, consolidating accounts, or your bill is growing faster than your traffic.

4. Kubernetes and IaC done right

Kubernetes and Terraform are not goals. They’re tools that earn their complexity only if you actually need them. I’ll tell you when you don’t.

What this looks like:

  • Kubernetes cluster setup, hardening, and upgrade strategy (or a recommendation to use a managed service and move on).
  • Terraform module design and code review - DRY without being clever, testable, with a sane state strategy.
  • GitOps with Argo CD or Flux when continuous reconciliation actually pays off.

Good fit if: your terraform plan takes 20 minutes, your clusters have drifted from your repo, or your engineers Google “kubectl debug” daily.

What I won’t do

  • Sell you Kubernetes you don’t need. A managed PaaS or a few VMs is the right answer more often than the industry admits.
  • Write a 100-page strategy doc when a one-pager would do.
  • Stay forever. The point of platform work is to leave you better off without me.

Background

Site Reliability Engineer at AxonIQ - distributed event-driven products on GCP (primary) and AWS, infrastructure as code via Terraform. Founder of InPedana (a real production Flask/SQLite/HTMX dashboard, not a slide deck) and NutriFinder. Active in the platform-engineering community.


Ready to talk? → Work With Me