REQUEST TECHNOLOGY
Site Reliability Engineer
Salary: $130k-$145k + bonus
Location: Chicago, IL or Dallas, TX (hybrid role in either location)
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Information Systems or another related field, or equivalent work experience.
- Minimum of 5-8 years of experience in Site Reliability Engineering/DevOps.
- Experience managing infrastructure in public cloud environments like AWS (preferred), Azure or GCP.
- Experience providing visibility using monitoring and alerting tools like Splunk, SignalFx, AppDynamics, Datadog, ELK, Prometheus or Grafana.
- Experience with container orchestration systems like Kubernetes, Mesos, Docker Swarm or Rancher.
- Experience with infrastructure as code and configuration management tools like Terraform, Ansible, Puppet or Chef.
- Programming/Scripting experience in languages like Java, Bash, Python or Go.
- Experience with distributed messaging systems like Kafka, RabbitMQ, or ActiveMQ.
- Experience using Continuous Integration and Continuous Delivery (CI/CD) tools like Jenkins, Travis, Harness, Spinnaker, AppVeyor, CodeBuild or CodePipeline.
Responsibilities
- Collaborate with development, operations and infrastructure teams to ensure service availability and work through implementation issues
- Develop automation to respond to incidents and prevent problem recurrence
- Create and enhance runbooks to respond to service outages or degradations
- Assess the production readiness of services
- Define and track operational metrics for production performance, reliability, scalability and availability
- Architect, develop and maintain shared services and tools to improve reliability and reduce toil across the organization
- Contribute to the team’s continuous improvement through research, retrospectives, discussion groups and code reviews
- Provide leadership within the team by guiding and mentoring junior members, and preparing stories for the sprint backlog
To apply for this job, please visit www.jobvertise.com.