Available for Opportunities

Distinguished Cloud AI Architect & Platform Engineering Leader

Building large-scale GPU compute platforms, AI/ML infrastructure, and distributed systems at planetary scale. Expert in Slurm cluster management, GPU scheduling (NVIDIA A100/V100/DGX), Kubernetes orchestration, and multi-cloud architecture (AWS, Azure, GCP, OCI).

18+

Years Experience

$8M+

Annual Savings

10M+

Users Served

73%

Cloud Cost Reduction

200+

Microservices Managed

99.97%

GPU Cluster Uptime

50+

Daily Releases

Recommendations

What Colleagues Say

Trusted by leaders across engineering, product, and executive teams at Trackonomy

"I had the opportunity to work closely with Chundu at Trackonomy, and he was someone I consistently trusted on our most critical cloud and AI initiatives. Chundu has a rare ability to combine deep technical expertise with clear, practical decision-making. Beyond the technology, Chundu is a thoughtful leader and collaborator."

Nirali Patidar

Mission-Driven TPM Leader | AI/ML

Former Atlassian, Cisco, Oracle

"Sudhakar always focused on impact. His approach in every project was to ask how it supports revenue and enhances competitive position. He always applied tech to business, knew how to rally teams from both sides, and communicate to see our projects through. The best of the best. Learned a lot from him."

Manal Yaqub

Startups @ Databricks

Client Partner

"Sudhakar made a real difference in how we built and deployed software. He streamlined our development process with automation, improved our CI/CD pipelines, and introduced solid DevSecOps practices. Whenever there was a DevOps challenge, Sudhakar was the go-to person everyone trusted."

Kedar Rajwade

Director of Engineering

Distributed Systems | Agentic AI

"Sudhakar was instrumental in establishing our comprehensive compliance and security posture. His expertise enabled Trackonomy to successfully complete our first SOC2 Type II audit. He also led an initiative that reduced our annual hosting spend by hundreds of thousands of dollars."

Keith Abrams

General Counsel

Trackonomy Systems

"Sudhakar has been instrumental in shaping a secure, scalable foundation for our AI-driven mobile stack. He designed Android CI/CD pipelines with DevSecOps tightly integrated. His work with OAuth and IAM ensured secure, seamless authentication while protecting sensitive data."

Diwakar Reddy

Technical Lead | Mobile Engineering

Industrial IoT | AI and ML

"From day one, Sudhakar impressed me with a rare combination of deep technical knowledge and thoughtful problem-solving. He architected our CI/CD pipelines and cloud infrastructure with precision — reducing deployment time by 95% and dramatically increasing system reliability."

Kasi Viswanath

Senior DevOps Engineer

Trackonomy Systems

"Sudhakar is a certified DevOps expert and an industry veteran. He developed seamless pipelines to enhance developer velocity. His emphasis on automation and constant exploration of new technologies helped make the company cloud agnostic and reduce infrastructure cost."

Abhijeet Purkar

Software and Data | CMU

Gold Medalist OU

"Sudhakar is a highly knowledgeable engineer whom I worked with for multiple 'missions impossible', addressing fires that thanks to his quick thinking and action we were able to solve. He understood end to end the needs for the customers as well as the behavior of the product."

Raymundo Alatorre

Sr. Manager Automation & IIoT

Flex

"Sudhakar is very knowledgeable in cloud infrastructure management as well as FinOps. He was a solid partner in the business, working to ensure reliable/scalable infrastructure, while also understanding and meeting the needs of others in the business."

Troy Ford

CFO

Private Equity, M&A

"I worked with Sudhakar at Trackonomy and appreciated his strong work ethic and dedication to the DevOps initiatives he helped organize. He consistently put in significant effort and took ownership of the areas he was responsible for."

Patty Steiman

Vice President Customer Success

Trackonomy | IoT Expertise

"I've been fortunate to collaborate with Sudhakar across two companies during my career. He demonstrates a strong work ethic and unwavering dedication to the DevOps initiatives he supports. He excels when working on clearly defined tasks, approaching them with determination."

Sravani Bolla

Senior DevOps Engineer

Trackonomy Systems

View all recommendations on LinkedIn

About Me

Building and operating infrastructure at scale for global enterprises

I have 18+ years of hands-on experience building and operating large-scale GPU compute platforms, AI/ML infrastructure, and distributed systems at planetary scale, and I lead global teams of 13+ engineers across Infrastructure, Security, Networking, and DevOps.

I am currently Distinguished Cloud AI Architect at Trackonomy, where I manage $15M+ budgets and have delivered $8M+ in documented cost savings. I built the infrastructure team from 0 → 6 engineers, serving the Pharma, Airlines, Government, Manufacturing, Healthcare, and IoT sectors across 8 countries.

My expertise covers Slurm-based GPU compute platforms (65 GPUs, 99.97% uptime), serverless GPU infrastructure, Databricks/Spark/Kafka real-time pipelines, and comprehensive DevSecOps with SOC2, HIPAA, FedRAMP, and HITRUST compliance.

99.97%
GPU Cluster Uptime

$15M+
Budget Managed

Professional Experience

Building and operating infrastructure at scale for global enterprises

Distinguished Cloud AI Architect / Director of Platform Engineering

Trackonomy Systems

Oct 2023 – Present

Designed Slurm-based GPU compute platform (65 GPUs, 8 nodes) with Slinky and NVIDIA BCM; achieved 99.97% uptime serving 12+ enterprise clients for AI inference and training
Reduced cloud costs 73% ($10M→$2.7M/year) through GPU utilization optimization, fair-share scheduling, and vendor consolidation; led FinOps practice with showback/chargeback models
Built team 0 → 6 engineers | Managed $15M+ budgets | SOC2, HIPAA, FedRAMP, HITRUST compliance | Secured GenAI/LLM platform against prompt injection and data exfiltration

Senior SRE / Cloud Architect — ML Infrastructure

Wipro Technologies (OSDU Data Platform)

Feb 2020 – Oct 2023

Architected OSDU R3 data platform processing exabytes of seismic data on GPU-accelerated Kubernetes (EKS/Fargate) with Spark, Kafka, Hadoop/HDFS for ExxonMobil, Chevron, BP, Shell
Created 50+ Terraform modules; implemented GitOps (ArgoCD/FluxCD), KEDA autoscaling; reduced deployment time 80%
Built observability stack with Prometheus/Grafana/ELK and GPU metrics; implemented HA architecture achieving 55% downtime reduction via capacity planning

Multi-Cloud Architect / Senior Infrastructure Engineer

Tata Consultancy Services

May 2007 – Feb 2020

Progressive 13-year career across Fortune 500 clients in healthcare, government, telecom, and financial services.

Cloud Architect — Harvard Pilgrim Health Care Jun 2018 – Feb 2020

Led cloud modernization for HIPAA/HITRUST-regulated AI/ML applications to AWS with GPU-enabled EKS clusters
Implemented Jenkins/Ansible CI/CD pipelines | DevSecOps with HashiCorp Vault, SonarQube | Managed $6M budget

Cloud Senior Engineer — CNA Insurance May 2015 – Jun 2018

Managed Kubernetes/Helm deployments on AWS; pioneered Docker/Kubernetes adoption (2013-2014)
Led cloud migrations to AWS, Azure, OpenStack | CI/CD with Jenkins/Ansible

Solutions Architect — PwC Jan 2011 – Apr 2015

Led infrastructure deployments for 20+ facility buildouts including branch offices, call centers, and data centers
Integrated Chef/Jenkins deployment pipelines | Migrated VMware VMs to AWS | Managed $15M+ budgets

Middleware Engineer — Verizon, Owens Corning May 2007 – Jan 2011

5 years of deep Linux/Unix administration with WebSphere/WebLogic middleware, including kernel tuning and JVM optimization
Managed large-scale production systems on bare metal serving millions of users | 24x7 L3 operations

Technology Stack