Infrastructure Operations Specialist (Remote) (m/f/d)
Job Title: Infrastructure Operations Specialist (Remote)
Duration: 6 Months
Start Date: August 3rd
Location: Remote (aligned with MENA Standard Time)
Workload: Full-time, 5 days/week (Sunday–Thursday preferred; Monday–Friday acceptable)
Language Requirement: Fluent English
Job Overview:
We are seeking an experienced Infrastructure Operations Specialist to provide remote support, maintenance, and optimization for GPU-based server environments. The ideal candidate brings hands-on experience in cluster management, Ethernet networking, and supporting Cloud Solution Provider (CSP) customers.
Responsibilities:
Administration & Operations
- Monitor, review, and manage server infrastructure
- Handle user requests and access management
- Analyze log files and produce regular reports
Cluster Management
- Maintain and operate Base Command Manager for GPU clusters
- Perform daily operational tasks
Change Management
- Support firmware/software updates and change implementations
- Document and implement IT procedures and policy changes
- Manage migrations and related reporting
Post-Implementation & Knowledge Transfer
- Plan transition activities with deployment teams
- Configure additional host and network elements post-deployment
- Transfer knowledge on tools, procedures, and best practices
- Recommend enhancements and upgrades
Problem Management
- Isolate and troubleshoot incidents
- Coordinate service incidents and open support tickets
- Contribute to root cause analyses
Service Optimization
- Recommend improvements for process efficiency
- Share best practices from similar projects
- Provide performance tuning input
Evaluation & Advisory
- Review IT processes (incident, change, performance, etc.)
- Collaborate on documentation and procedural updates
- Support upskilling of internal teams
Requirements:
- Experience maintaining XE9680L servers with B200 GPUs
- Strong background in Ethernet networking
- Experience supporting Cloud Solution Provider (CSP) environments
- Solid understanding of infrastructure operations and technical leadership
- Strong communication skills and ability to deliver structured knowledge transfer