Site Reliability Engineer, Cloud
Yugabyte
Date: 2 hours ago
City: Toronto, ON
Contract type: Full time
Remote
At Yugabyte, we are on a mission to become the default transactional database for enterprises building cloud-native applications. YugabyteDB is our PostgreSQL-compatible distributed database for cloud-native apps. Resilient, scalable, and flexible, it runs on any cloud and enables developers to become instantly productive using well-known APIs.
We are looking for talented and driven people to join us on our ambitious mission and help us build a lasting and impactful company.
The transactional database market is estimated to grow to $64B by 2025. YugabyteDB is cloud-native by design, has on-demand horizontal scalability, and supports geographical distribution of data using built-in replication. This means that we are well-positioned to meet market demand for geo-distributed, high-scale, high-performance workloads.
Join the Database Revolution at Yugabyte.
Modern applications need a cloud-native database that eliminates tradeoffs and silos. YugabyteDB retains the power and familiarity of PostgreSQL by pairing its trusted API with a precision-engineered, distributed, cloud-native architecture. Even better, it's 100% open source. Many of the world's leading enterprises are migrating from legacy RDBMSs (like Oracle, SQL Server, and DB2) to YugabyteDB, to meet their mission-critical app demands.
Role:
As a Site Reliability Engineer focused on database availability and reliability, you will use your skills to operate and automate the life cycle of the YugabyteDB DBaaS. You will design and build processes that spin up systems and the infrastructure that manages the databases, using secure, reliable, scalable, and highly observable methodologies. You will use, operate, and configure Kubernetes environments (GKE, EKS, AKS), Java frameworks, shell scripts, Python scripts, Terraform templates, and many other cloud technologies. You will participate in the on-call rotation (12 hours a day over 7 days, every 4-5 weeks) and manage incidents on the DBaaS infrastructure, coordinating support for our customers. You will learn how to diagnose problems with our database and infrastructure technology and help deliver reliable service to our customers.
We are looking for strong engineers who exemplify collaboration, teamwork, and empathy, and who like to lead by example. We enjoy working with people who are driven, who thrive in a fast-paced startup environment, and who have a strong desire to build an internet-scale, extensible control plane with a strong emphasis on simplicity and user experience.
Responsibilities
- Design, develop, test, debug, troubleshoot, and maintain components of the DBaaS cloud infrastructure
- Manage operational priorities of the DBaaS infrastructure
- Establish processes for handling incidents on databases or infrastructure, and lead the response to them
- Automate and manage regular maintenance operations such as upgrades
- Design and build DBaaS processes for encryption, security key/password management, storage management, etc.
- Utilize SRE golden signals to analyze and optimize the DBaaS system's performance and reliability strategies
Requirements
- Strong software design and implementation skills in building infrastructure frameworks
- Experience building and operating data systems for production applications, including fault tolerant designs, software lifecycles, and automation of critical operations
- Strong track record of Incident Response and Management in a managed service which is mission critical for its customers
- Experience with:
  - Relational database systems (PostgreSQL preferred)
  - Public cloud infrastructure (AWS, GCP, and/or Azure)
  - Containerization tooling, theory, and design (Docker, Kubernetes)
  - Infrastructure as Code (Terraform preferred)
  - Configuration management tooling (Ansible preferred)
  - Automation scripting (Python and Bash preferred)
  - Monitoring systems (Prometheus preferred)
  - Version control systems (Git preferred)
  - CI/CD systems (GitHub Actions preferred)
- Solid understanding of Linux systems operations and troubleshooting
- Willingness and ability to learn new languages and concepts
- 1-6 years of relevant experience
As an equal opportunity employer, Yugabyte is committed to a diverse workforce. Employment decisions regarding recruitment and selection will be made without discrimination based on race, color, religion, national origin, gender, age, sexual orientation, physical or mental disability, genetic information or characteristic, gender identity and expression, veteran status, or other non-job related characteristics or other prohibited grounds specified in applicable federal, state and local laws.
To review Yugabyte's Privacy Policy please visit Yugabyte Privacy Notice.