Site Reliability Engineer at Catawiki (Amsterdam, Netherlands)


Posted: 7 months ago

Position type: full time
Job source: Stack Overflow

Location: We're happy to support full relocation from anywhere in the world. Remote work from the Netherlands, France, Italy, Spain, the UK, Belgium, or Germany is also possible.

What’s the job

With your expertise in software and systems engineering, you will be responsible for Catawiki Platform reliability and automation by:

  • Proactively assessing reliability aspects and addressing concerns

  • Developing Platform automation and eliminating menial tasks

  • Making sure the Platform is properly instrumented and monitored

  • Identifying and establishing Service Level Indicators and Objectives

  • Sharing on-call responsibilities

  • Guiding Engineering teams on reliability best practices and approaches

Here’s Dmitrii, our Team Lead

“Hi! I am Dmitriy! This is my third year at Catawiki, and I never stop being surprised that Catawiki constantly challenges our team to help businesses grow faster. Site Reliability is a young but solid team with open and passionate members. Together with the other developers, we’ve been on an exceptional journey scaling Catawiki Platform from just a few manually managed servers to a dozen Kubernetes clusters running in the Cloud and powering tens of microservices. Cloud-Native technologies and automation are just a couple of the things that we employ at every step to take us further with the greatest efficiency. We are proud of all the projects we have completed and look forward to new people joining our team.”

You'll move in sync with…

As part of a team of professionals (software/systems/data/test engineers and product/project managers) within a functional area, you’ll make sure scalability and reliability are built in and delivered at every step of the development lifecycle, ensuring smooth operations.

A little bit about you

You measure everything, implement gradual changes, and accept failure as normal. By sharing ownership with developers and using the same tools, you reduce organisational silos.

You leverage tooling and automation, solve problems effectively, and communicate them clearly.

Next to this, it’s likely you’ll also have:

  • Experience in software and systems engineering

  • Experience working as an SRE (observability, on-call rotation, incident management)

  • Practical knowledge of Kubernetes and other Cloud Native technologies (Docker/Helm/Terraform)

  • Good knowledge of relational databases and familiarity with NoSQL databases

  • Strong problem-solving, analytical, and troubleshooting skills

  • Good knowledge of monitoring and alerting tools (Prometheus/Alertmanager, ELK, Grafana)

  • Passion for details and a great level of pragmatism
