Site Reliability Engineer, Cloud

YugabyteDB

Sunnyvale, CA | Remote | Full Time

POSTED ON 5/13/2024

Job Posting for Site Reliability Engineer, Cloud at YugabyteDB

The Role

As a Site Reliability Engineer focused on database availability and reliability, you will use your skills to operate and automate the lifecycle of the YugabyteDB DBaaS. You will design and build processes that spin up systems and the infrastructure that manages the databases, using secure, reliable, scalable, and highly observable methodologies. You will use, operate, and configure Kubernetes environments (GKE, EKS, AKS), Java frameworks, shell scripts, Python scripts, Terraform templates, and many other cloud technologies. You will participate in the on-call rotation, covering 12 hours a day for 7 days every 4-5 weeks, and manage incidents on the DBaaS infrastructure while coordinating support for our customers. You will learn how to diagnose problems with our database and infrastructure technology and help deliver reliable service to our customers.

We are looking for strong engineers who exemplify collaboration, teamwork, and empathy, and who like to lead by example. We enjoy working with people who are driven, who thrive in a fast-paced startup environment, and who have a strong desire to build an internet-scale, extensible control plane with a strong emphasis on simplicity and user experience.

Responsibilities

Design, develop, test, debug, troubleshoot, and maintain components of the DBaaS cloud infrastructure
Manage operational priorities of the DBaaS infrastructure
Establish processes for handling incidents on databases or infrastructure and for leading the response to them
Automate and manage regular maintenance operations such as upgrades
Design and build DBaaS processes for encryption, security key/password management, storage management, etc.
Utilize SRE golden signals to analyze and optimize the DBaaS system's performance and reliability strategies

Requirements

Strong software design and implementation skills in building infrastructure frameworks
Experience building and operating data systems for production applications, including fault-tolerant designs, software lifecycles, and automation of critical operations
Strong track record of incident response and management in a managed service that is mission-critical for its customers
Experience with:
- Relational database systems (PostgreSQL preferred)
- Public cloud infrastructure (AWS, GCP, and/or Azure)
- Containerization tooling, theory and design (Docker, Kubernetes)
- Infrastructure as Code (Terraform preferred)
- Configuration Management Tooling (Ansible preferred)
- Automation Scripting (Python and Bash preferred)
- Monitoring systems (Prometheus preferred)
- Version control systems (Git preferred)
- CI/CD systems (GitHub Actions preferred)
Solid understanding of Linux systems operations and troubleshooting
Willingness and ability to learn new languages and concepts

Interview Process: Health and safety remain a top priority for all of our roles. As such, all Yugabyte interviews are held virtually, so we can all continue doing our part with social distancing and containment efforts. Although we are based in Silicon Valley, we hire exceptional folks wherever they are! Our process usually lasts 2-3 weeks and consists of a phone screen followed by Zoom interviews, including interviews with senior leaders.

Compensation and Benefits: We are committed to the principle of equal pay for equal work. The cash compensation for this role is market-competitive, ranging from $150,000 to $200,000. Additional benefits include equity options, comprehensive health plans, retirement benefits, and unlimited paid time off (PTO).

Equal Employment Opportunity Statement: As an equal opportunity employer, Yugabyte is committed to a diverse workforce. Employment decisions regarding recruitment and selection will be made without discrimination based on race, color, religion, national origin, gender, age, sexual orientation, physical or mental disability, genetic information or characteristic, gender identity and expression, veteran status, or other non-job related characteristics or other prohibited grounds specified in applicable federal, state and local laws.
