(Senior) Site Reliability Engineer (m/f/d) - Platform & Agentic Operations

1KOMMA5° Hamburg
FULL_TIME
Software Development
## 1KOMMA5°

At **1KOMMA5°**, we pursue a clear vision: **Living on wind and sunlight forever for free**. To make this a reality, we are building the energy system of the future with Heartbeat AI. Want to be part of it?

We bring together regional craftsmanship and scalable software: we don't think of solar, batteries, heat pumps, and e-mobility as isolated components, but control them as an intelligent, integrated overall system in our virtual power plant. The plant is directly connected to the electricity market, in real time and fully automated. This way, energy is used when it is available from renewables and particularly cost-effective.

By 2030, our goal is to transition 1.5 million households to renewable energy. Over 3,000 people are working towards this every day, at more than 80 locations worldwide, from Finland to Australia.

**Want to take responsibility and build solutions that truly matter? Apply now and help us shape the energy world of tomorrow.**

Learn about our Product & Tech team!

## Your Position

1KOMMA5° is building Europe’s largest virtual power plant ("Heartbeat AI"). As a Senior SRE in our Platform team, you will bridge classic infrastructure with Agentic Engineering, focusing on using AI agents to eliminate developer friction, optimize CI/CD pipelines, and automate the resolution of code review and deployment bottlenecks.
## Tech Stack

* Cloud & Infra: GCP (Cloud Run, GKE), Terraform, Terramate
* Agentic: Cursor
* CI/CD & DevEx: GitHub Actions, Backstage
* Languages: Python, GoLang, TypeScript

## Key Responsibilities

Including, but not limited to:

* Implement and improve monitoring, alerting, and incident response systems and processes to ensure high reliability for our customers and meet defined SLOs
* Design, build, and maintain resilient, scalable infrastructure using SRE principles and best practices
* Attend post-incident reviews, detect patterns, and contribute to continuous improvement efforts
* Execute performance testing, analyze system bottlenecks, and formulate capacity planning strategies so that our systems meet current and future demands
* Build systems where CI/CD test failures serve as immediate, real-time context for agents, enabling them to analyze logs, trace dependencies, and suggest or apply instant code fixes

## Your Profile

* 6+ years in SRE, DevOps, or Platform Engineering
* Strong understanding and practical application of Site Reliability Engineering (SRE) principles, methodologies, and best practices
* Proficiency in programming/scripting languages such as Python, GoLang, or TypeScript
* Practical understanding of integrating LLMs into automated workflows: you know how to feed live system state (like a fresh CI test failure) into an agent as actionable context
* Prior experience in incident management, post-incident reviews, and implementing improvements to prevent future incidents
* Ability to troubleshoot complex technical issues systematically and effectively
* Good experience working with a public cloud provider, ideally Google Cloud Platform (GCP), and a solid understanding of its observability services
* A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
* Excellent communication skills to convey technical concepts and collaborate effectively with diverse teams
* Very good knowledge of spoken and written English; German is a plus
* Residency in Germany

Bonus points for:

* Interest in the climate tech industry
* Prior experience with IoT applications
* Having worked in a scale-up environment at a company of similar size

## Benefits

* You are part of an international, dynamic, and highly motivated team of people with a proven record of making things happen
* With your work, you accelerate the "energy transition" and hence have a direct impact on our climate
* Work with and learn from other super-smart colleagues
* You will enjoy direct contact with core decision-makers
* You have an excellent chance of joining one of Europe’s most thriving scale-ups full-time
* You work remotely (Germany-wide), with offices in Hamburg, Berlin, or Munich
* Create a healthy balance alongside your work and enjoy all the benefits of the EGYM Wellpass
* Benefits and discounts are yours with Futurebens
* Whether city bike or e-bike: stay flexible with our job bike leasing and do something good for the environment at the same time