(Senior) Site Reliability Engineer (m/f/d) - Platform & Agentic Operations
FULL_TIME
Software Development
## 1KOMMA5°
At **1KOMMA5°**, we pursue a clear vision: **Living on wind and sunlight forever for free**. To make this a reality, we are building the energy system of the future with Heartbeat AI. Want to be part of it? We bring together regional craftsmanship and scalable software: We don't think of solar, batteries, heat pumps, and e-mobility as isolated components, but control them as an intelligent, integrated overall system in our virtual power plant. Directly connected to the electricity market – in real time, fully automated. This way, energy is used when it is available from renewables and particularly cost-effective. By 2030, our goal is to transition 1.5 million households to renewable energy. Over 3,000 people are working towards this every day, at more than 80 locations worldwide, from Finland to Australia.
**Want to take responsibility and build solutions that truly matter? Apply now and help us shape the energy world of tomorrow.**
Learn about our Product & Tech team!
## Your Position
1KOMMA5° is building Europe’s largest virtual power plant ("Heartbeat AI"). As a Senior SRE in our Platform team, you will bridge classic infrastructure with Agentic Engineering, specifically focusing on leveraging AI agents to eliminate developer friction, optimize CI/CD pipelines, and automate the resolution of code review and deployment bottlenecks.
## Tech Stack
* Cloud & Infra: GCP (Cloud Run, GKE), Terraform, Terramate
* Agentic: Cursor
* CI/CD & DevEx: GitHub Actions, Backstage
* Languages: Python, GoLang, TypeScript
## Key Responsibilities (Including but Not Limited To)
* Implement and improve monitoring, alerting, and incident response systems and processes to ensure high reliability for our customers and meet defined SLOs
* Design, build, and maintain resilient, scalable infrastructure utilizing SRE principles and best practices
* Attend post-incident reviews, detect patterns and contribute to continuous improvement efforts
* Execute performance testing, analyze system bottlenecks, and formulate strategies for capacity planning to ensure our systems meet current and future demands effectively
* Build systems where CI/CD test failures serve as immediate, real-time context for agents, enabling them to analyze logs, trace dependencies, and suggest or apply instant code fixes
## Your Profile
* 6+ years in SRE, DevOps, or Platform Engineering
* Strong understanding and practical application of Site Reliability Engineering (SRE) principles, methodologies, and best practices
* Proficiency in programming/scripting languages such as Python, GoLang or TypeScript
* Practical understanding of integrating LLMs into automated workflows. You know how to feed live system state (like a fresh CI test failure) into an agent as actionable context.
* Prior experience in incident management, post-incident reviews, and implementing improvements to prevent future incidents
* Ability to troubleshoot complex technical issues systematically and effectively
* Hands-on experience with a public cloud provider, ideally Google Cloud Platform (GCP), and a solid understanding of its observability services
* A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
* Excellent communication skills to convey technical concepts and collaborate effectively with diverse teams
* Very good command of spoken and written English; German is a plus
* Residency in Germany
Bonus points for:
* Interest in the climate tech industry
* Prior experience with IoT applications
* Experience working in a scale-up environment at a company of similar size
## Benefits
* You are part of an international, dynamic, and highly motivated team with a proven record of making things happen
* With your work, you accelerate the "energy transition" and hence have a direct impact on our climate
* Work with and learn from super-smart colleagues
* You will enjoy direct contact with core decision-makers
* You will have an excellent opportunity to join one of Europe’s most thriving scale-ups full-time
* You work remotely (Germany-wide), with offices available in Hamburg, Berlin, and Munich
* Create a healthy balance alongside your work and enjoy all the benefits of the EGYM Wellpass
* Benefits and discounts are yours with Futurebens
* Whether city bike or e-bike: stay flexible with our job bike leasing and do something good for the environment at the same time