As a Site Reliability Engineer focused on database availability and reliability, you will use your skills to operate and automate the life cycle of the YugabyteDB DBaaS. You will design and build processes that spin up systems and the infrastructure that manages the databases, using secure, reliable, scalable, and highly observable methodologies. You will use, operate, and configure Kubernetes environments (GKE, EKS, AKS), Java frameworks, shell scripts, Python scripts, Terraform templates, and many other cloud technologies. You will participate in the on-call rotation, covering 12 hours a day for 7 days every 4-5 weeks, and manage incidents on the DBaaS infrastructure, coordinating support for our customers. You will learn how to diagnose problems with our database and infrastructure technology and help deliver reliable service to our customers.
We are looking for strong engineers who exemplify collaboration, teamwork, and empathy, and who like to lead by example. We enjoy working with people who are driven, who thrive in a fast-paced startup environment, and who have a strong desire to build an internet-scale, extensible control plane with a strong emphasis on simplicity and user experience.
Responsibilities
Requirements
Interview Process: Health and safety remain a top priority for all of our roles. As such, all Yugabyte interviews are held virtually, so we can all continue doing our part with social distancing and containment efforts. Although we are based in Silicon Valley, we hire exceptional folks wherever they are! Our process usually lasts 2-3 weeks and consists of a phone screen followed by Zoom interviews, including interviews with senior leaders.
Compensation and Benefits: We are committed to the principle of equal pay for equal work. The cash compensation for this role is market-competitive, ranging from $150,000 to $200,000. Additional benefits include equity options, comprehensive health plans, retirement benefits, and unlimited paid time off (PTO).