Professional Summary
Senior Site Reliability Engineer with 15+ years of experience driving operational excellence across Fortune 500 companies including Apple, GoDaddy, and 20th Century Fox. Expert in cloud infrastructure automation, team leadership, and large-scale system optimization. Proven track record of reducing downtime by 90%, leading successful infrastructure migrations, and mentoring high-performing engineering teams. Specialized in Python development, automation, and implementing monitoring solutions that serve millions of users.
Core Skills
- Monitoring: Prometheus, Grafana, Datadog, Icinga
- Cloud Platforms: AWS, OpenStack, CloudStack
- Development: Python, Django, Golang, PHP
- Automation: Ansible, Chef, Terraform
- Containerization: Docker, Kubernetes
- Team Leadership & Performance Management
Professional Experience
- Enhanced infrastructure monitoring for CloudStack by implementing Apple's internal monitoring tools, improving system visibility and incident response times by 40%.
- Developed comprehensive documentation and automated onboarding processes, reducing new engineer ramp-up time from 2 weeks to 3 days.
- Designed and implemented end-to-end testing framework for cloud components, achieving 99.9% deployment success rate and eliminating production incidents.
Supervisor, Site Reliability Engineering
July 2022 – June 2024
- Led migration of Domains monitoring infrastructure to Prometheus stack, improving system visibility by 60% and reducing alert fatigue.
- Supervised team of 5 SREs, conducting performance reviews and facilitating professional development with 100% retention rate.
- Managed sprint planning, daily stand-ups, and Jira board for effective project delivery, achieving 95% on-time completion rate.
Site Reliability Engineer II
July 2020 – July 2022
- Improved infrastructure performance by analyzing system metrics and implementing optimization strategies, reducing response times by 35%.
- Automated repetitive tasks for Production Engineering team using Python and Ansible, increasing team efficiency by 50%.
- Participated in 24/7 on-call rotation ensuring 99.95% uptime and rapid incident response within SLA targets.
- Developed automated testing environments for OpenStack infrastructure, improving deployment reliability by 85%.
- Participated in Agile sprint planning and conducted code reviews for high-quality software delivery.
- Enhanced server reliability through infrastructure automation and monitoring solutions, reducing downtime by 60%.
- Led migration of critical payment processing code from Python 2 to Python 3, completing it before Python 2 end of life (EOL) while maintaining 99.99% service availability for fraud detection systems.
- Upgraded fleet of Ubuntu servers to newer LTS versions, implementing security patches and enhancements while ensuring strict PCI compliance for financial transaction processing.
- Established Python best practices through technical workshops and code reviews, mentoring junior engineers and improving team code quality by standardizing development patterns.
- Managed hybrid cloud infrastructure spanning AWS and on-premises data centers, implementing automation with Chef and Ansible that reduced deployment times by 65%.
- Optimized AWS resource utilization and implemented a Reserved Instance (RI) planning strategy, reducing cloud infrastructure costs by 30% while maintaining performance.
- Established improved monitoring systems with Prometheus/Grafana, enhancing visibility across services and reducing incident response times by 40%.
- 20th Century Fox: Managed Linux server infrastructure, automated deployments with Terraform, and established CI/CD pipelines for media processing systems.
- Mirantis, HP Helion: Maintained OpenStack CI/CD pipelines, developed Python automation tools, and enhanced incident response processes, improving response times by 50%.
- HostGator: Administered thousands of Linux servers, resolved technical issues, and provided customer support to 50+ clients daily.
Military Experience
- Deployed to Iraq (2010–2011) in support of Operation Iraqi Freedom and Operation New Dawn.
- Honorably discharged with multiple commendations including Army Commendation Medal and Combat Action Badge.