Senior Platform Engineer
MoonPay • United Kingdom • Engineering
What you’ll do
In the short term, we need to increase the resilience and reliability of our current PaaS solution through work such as:
- Improving the maintainability of our infrastructure as code
- Building dashboards, monitoring & alerting mechanisms with Datadog
- Load testing and performance tuning our production services
- Lifecycling and maintenance of our Kubernetes clusters
In the medium to long term you’ll get to:
- Implement new and shiny technologies on top of Kubernetes as you see fit to ensure our tech can scale with the business.
- Develop and integrate solutions with a bias for automation in order to improve and maintain reliability across the production estate and make recovery easier.
- Design and track metrics for site uptime and performance, ensuring high levels of visibility are maintained.
- Own the deployment pipelines and continuously improve our monitoring and alerting capabilities.
- Collaborate closely with all other engineering functions to provide timely feedback from our environments.
- Support Engineering on their journey to deliver better software, faster and more safely (think “It’s OK to deploy on Fridays” 😎).
About you
- You have strong systems administration skills, know the difference between a container and a virtual machine, and know your way around a Linux terminal
- You have platform engineering/SRE experience at leading startups or fast-growing tech companies
- You have either had experience with some of our tech stack or are confident you can cross-train and upskill quickly
- You have experience working in a regulated industry
- You are confident working with and guiding developers on monitoring and logging of complex systems at scale
- You have worked on complex projects
- You reflexively reach for AI agents to assist in researching and solving your problems
- You can work collaboratively with different teams, e.g. Security, Data and Engineering
- You want to forge and own MoonPay’s reliability & recovery processes
- You’ve got at least a basic understanding of complex reliability structures, theories, principles, and best practices
- You have worked with JavaScript codebases and frameworks, e.g. TypeScript, Node.js and React
Current Tech Stack
- TypeScript as our programming language of choice
- Node.js as our backend platform