Staff Site Reliability Engineer (SRE)

May 2

🏡 Remote – New York


Cribl

Cribl, the Data Engine for IT and Security, empowers organizations to transform their data strategy.

501 - 1000 employees

Description

• Engage with teams to improve service delivery and reliability across the entire service lifecycle
• Measure and monitor all production systems with an eye towards availability, latency, and overall system health
• Seek out the causes of errors and instability in our production cloud services and drive teams towards better operational excellence
• Engage with product and platform teams to improve and evolve systems by lobbying for changes that improve reliability, resilience, and observability
• Help identify and drive down toil through creative innovation and automation
• Take part in on-call responsibilities

Requirements

• Extensive experience with enterprise-scale continuous delivery environments
• 8+ years of experience with a DevOps or SRE job title
• Development experience with JavaScript/Node.js/TypeScript in a Linux/Mac environment
• Experience with configuration management tools such as Terraform (preferred), Puppet, Chef, or Ansible
• Experience with sustainable incident response in a blameless environment
• Knowledge of cloud platforms (AWS preferred) and container and orchestration technologies
• Experience with APM and observability tools such as New Relic, Splunk, CloudWatch, Prometheus, Grafana/Kibana, Sentry, etc.
• Background in Linux systems engineering
• Experience with incident response tools such as PagerDuty, FireHydrant, Blameless, etc.
• Comfortable with a high level of autonomy and with working on a distributed team

Benefits

• Health, dental, vision, short-term disability, and life insurance
• Paid holidays and paid time off
• A fertility treatment benefit
• 401(k), equity, and eligibility for a discretionary company-wide bonus
