KubeCon + CloudNativeCon

Scaling GPU Clusters for ML Workloads

Best practices for managing large-scale GPU infrastructure on Kubernetes for distributed machine learning training.

Kubernetes GPU ML Infrastructure
SREcon

Achieving 99.97% Uptime at Scale

Strategies and lessons learned from maintaining high availability across multi-cloud infrastructure serving millions of users.

SRE High Availability Multi-Cloud
HashiConf

Terraform at Enterprise Scale

Managing infrastructure as code across multiple cloud providers with Terraform modules, workspaces, and automation.

Terraform IaC Enterprise
DevOps Days

GitOps for Multi-Cluster Kubernetes

Implementing GitOps workflows with ArgoCD for managing deployments across multiple Kubernetes clusters.

GitOps ArgoCD CI/CD
Cloud Summit

Cost Optimization: $8M Savings Journey

Real-world strategies for reducing cloud infrastructure costs while maintaining performance and reliability.

FinOps Cost Optimization Cloud
Platform Engineering Summit

Building Internal Developer Platforms

Creating self-service platforms that accelerate developer productivity while maintaining governance and security.

Platform Engineering Developer Experience Self-Service

View Full Resume

See my complete experience, certifications, and technical skills in detail.
