Senior Site Reliability Engineer

May 8

🏡 Remote – New York

Ava Labs

Ava Labs redefines the way people create value with Web3

Internet • Cryptocurrency • Decentralized Finance • Crypto • Blockchain

51 - 200

Description

• Develop and optimize highly reliable and scalable infrastructure focused on SRE principles
• Implement and maintain monitoring, logging, and tracing tools to gain insights into service behavior and health
• Uphold SLOs (Service Level Objectives), SLIs (Service Level Indicators), and error budgets for critical systems
• Enhance the reliability and resiliency of critical systems by identifying single points of failure and implementing best practices
• Collaborate with software developers to build reliability and performance into applications from inception
• Automate and streamline incident management processes to minimize service disruption and improve response times
• Participate in on-call rotations, ensuring quick restoration of services and fostering a blameless post-mortem culture
• Foster a continuous improvement mindset by analyzing and learning from incidents and implementing preventive measures
• Leverage cloud technologies and IaC tools to ensure scalability and repeatability
• Advocate for best practices in reliability, security, and maintainability within the team

Requirements

• BS in Computer Science or related field
• 5+ years of experience as an SRE, DevOps, or Cloud Engineer
• Strong grasp of SRE principles, including error budgets, SLOs, and SLIs
• Cloud networking and orchestration with AWS (EKS, ECS, VPC, S3, ELB)
• Strong Kubernetes experience with Docker or rkt containerization
• Proficiency in Infrastructure as Code (IaC) using tools such as Terraform, Terragrunt, and Ansible
• Experience with monitoring and observability tools like Prometheus, Grafana, or ELK Stack
• Building and maintaining CI/CD pipelines with GitHub Actions (preferred), Jenkins, Travis CI, or CircleCI
• Experience with automation and configuration management using Ansible, Puppet, or Chef
• Experience with Linux-based infrastructure (Ubuntu preferred)
• Experience with scripting languages and writing scripts (Python and Go preferred)
• Working knowledge of decentralized architecture design patterns and distributed systems

Benefits

• 11 paid holidays
• Generous accrued time off increasing with years of service
• Generous paid sick time
• Annual day of service
