We are looking for a Principal Engineer – HPC Operations to lead the operational excellence of large‑scale high‑performance computing platforms supporting advanced AI and machine learning workloads.
This role combines deep technical expertise, strong operational ownership, and leadership, with a focus on reliability, performance optimisation, and automation across distributed environments.
Responsibilities: · Ensure the day‑to‑day operational stability of HPC platforms, covering compute, storage, networking, and scheduling layers. · Drive performance optimisation and capacity efficiency, maximising resource utilisation while reducing incidents and downtime. · Act as the technical owner for HPC environments, including new platform deployments and major evolutions. · Serve as the senior escalation point for complex operational incidents, leading resolution and root cause analysis. · Define and enforce scheduling, prioritisation, and workload governance policies to balance fairness, efficiency, and business needs. · Mentor and guide operations engineers, promoting best practices, automation, and continuous improvement.
Qualifications and Skills: · Strong experience operating large‑scale HPC or AI/ML platforms in production environments. · Hands‑on expertise with workload schedulers and orchestration platforms (e.g. Slurm, Kubernetes). · Solid knowledge of GPU‑based workloads, performance tuning, and resource management. · Proven experience with monitoring and observability tools to ensure system health and performance. · Advanced automation and scripting skills (e.g. Python, Bash, Infrastructure as Code). · Deep understanding of Linux systems, high‑speed networking, and distributed storage architectures.
Halian Group With over 28 years of experience, we have come to understand that innovation is the only way to provide agile, practical solutions that transform businesses and careers. Our resourcing and smart services help you to realize tomorrow’s potential. Discover the amazing things possible when you bring the right people and the right technologies together. At Halian, we recognize that diversity, equity, and inclusion (DEI) are essential to building high-performing teams for our clients. We are committed to connecting organizations with top talent from all backgrounds, ensuring that every individual feels valued, respected, and empowered to contribute their unique perspectives. We encourage applications from all qualified candidates, regardless of race, gender, disability, or any other characteristic that makes them unique. By fostering diverse and inclusive workplaces, we help our clients drive innovation, enhance collaboration, and better reflect the communities they serve.
Engineer HPC Operations in Paris, France
Bereit für den nächsten Schritt?
Melden Sie sich online an - es dauert nur 10 Minuten.