👨‍💻 Site Reliability Engineer

Ensuring Reliability | Automating Ops | Scaling Systems | Monitoring

About Me

I am a Lead Site Reliability Engineer (SRE) and ITSM expert with 9+ years of experience in incident response and in building, automating, and scaling production systems. Skilled in Kubernetes, Kafka, CI/CD, monitoring, and cloud-native technologies. I am passionate about bridging the gap between development and operations to ensure uptime, performance, and reliability.

Skills

Kubernetes & Docker
Apache Kafka
CI/CD Pipelines
Cloud (AWS / GCP / Azure)
Monitoring (Dynatrace, Prometheus, Grafana)
Terraform & IaC
Linux & Scripting
Java & Selenium Automation

Projects

Kafka Streaming Platform on Kubernetes

Deployed and managed Apache Kafka clusters on Kubernetes using the Strimzi Operator, enabling real-time data pipelines with monitoring and alerting.

Platform Support

Leading the platform support team and maintaining Kafka clusters on Kubernetes: GKE upgrades, CFK upgrades, SSL key rotation, and incident and problem management.

Automation

Designed Infrastructure as Code (IaC) using Terraform to provision and manage cloud environments with monitoring and scaling. Automated many routine tasks, including the "Daily_Health_check report", using Bash scripts, Java, Selenium, and VB macros.

Contact

Connect with me via: