SRE Portfolio - Khajasaifan Mulla

About Me

I am a Lead Site Reliability Engineer (SRE) and ITSM expert with 9+ years of experience in incident response and in building, automating, and scaling production systems. I am skilled in Kubernetes, Kafka, CI/CD, monitoring, and cloud-native technologies, and I am passionate about bridging the gap between development and operations to ensure uptime, performance, and reliability.

Skills

Kubernetes & Docker

Apache Kafka

CI/CD Pipelines

Cloud (AWS / GCP / Azure)

Monitoring (Dynatrace, Prometheus, Grafana)

Terraform & IaC

Linux & Scripting

Java & Selenium Automation

Projects

Kafka Streaming Platform on Kubernetes

Deployed and managed Apache Kafka clusters on Kubernetes using the Strimzi Operator, enabling real-time data pipelines with monitoring and alerting.

Platform Support

Leading the platform support team. Managing and maintaining the Kafka Kubernetes cluster, including GKE upgrades, CFK upgrades, SSL key rotation, and incident and problem management.

Automation

Designed Infrastructure as Code (IaC) using Terraform to provision and manage cloud environments with monitoring and scaling. Automated many routine tasks, including the "Daily_Health_check report", using Bash scripts, Java with Selenium, and VB macros.
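
As an illustration of the kind of report automation described above, here is a minimal Python sketch of how a daily health check report could be assembled from individual check results. It is not the original Bash, Java, or VB implementation; the `CheckResult` type, the `build_report` function, and the check names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CheckResult:
    """Outcome of one health check (hypothetical structure for this sketch)."""
    name: str          # e.g. "kafka-brokers"; names are illustrative only
    healthy: bool
    detail: str = ""   # optional human-readable note


def build_report(results, report_date):
    """Format a list of CheckResult objects as a plain-text daily report."""
    failed = [r for r in results if not r.healthy]
    status = "OK" if not failed else "ATTENTION"
    passed = len(results) - len(failed)
    lines = [
        f"Daily health check report - {report_date.isoformat()}",
        f"Overall: {status} ({passed}/{len(results)} checks passed)",
    ]
    for r in results:
        mark = "PASS" if r.healthy else "FAIL"
        suffix = f" ({r.detail})" if r.detail else ""
        lines.append(f"[{mark}] {r.name}{suffix}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        CheckResult("kafka-brokers", True),
        CheckResult("disk-usage", False, "91% used"),
    ]
    print(build_report(sample, date.today()))
```

In this sketch the checks themselves are assumed to run elsewhere (for example in shell scripts or browser automation); the function only turns their results into a summary that can be mailed or archived.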