AI Service Engineer (m/f/d)

FTE

Abu Dhabi, United Arab Emirates

06.10.2025

Job Title: Senior AI Services Engineer

Team: AI Services Team
Location: Remote
Reports To: AI Architect

Mission

Design, build, and operate production-grade AI microservices that power ADEO’s AI-as-a-Service (AIaaS) catalogue—enabling secure, scalable, and Arabic-first intelligence across the enterprise. Every service you ship will carry a unique Service ID, adhere to strict SLAs/SLOs, and integrate into our unified AI Gateway, registries, and observability stack. You don’t just deploy models—you deliver governed, measurable, and reusable AI capabilities.

Key Responsibilities

Architect & implement AI microservices (e.g., SRV-LLM, SRV-RAG, SRV-EMB) as Kubernetes-native, GPU-optimized workloads using C# (.NET) and Python, with schema-bound APIs (OpenAPI/JSON), rigorous testing, and zero-downtime deployment patterns (blue/green, canary).
Operationalize the full AI lifecycle: from data ingestion (PII-aware redaction, chunking) → model training/fine-tuning (PEFT/LoRA) → prompt engineering → RAG orchestration → serving → monitoring → drift-aware upgrades.
Deploy and manage hybrid AI inference infrastructure, including on-prem NVIDIA NIM–style microservices (vLLM/Triton on H100/A6000) with policy-based routing to Azure or vendor backends via the ADEO AI Gateway.
Enforce Arabic-first & RAG-first principles: grounded generation, citation enforcement, bilingual evaluation parity, and morphology-aware retrieval.
Integrate with platform registries: version and promote models (MLflow), prompts, and datasets with approval workflows, canary rollouts, and one-click rollback.
Embed safety & compliance by design: data-class–aware controls (RBAC, mTLS, redaction), content filters, audit logging, and runtime policy gates aligned to ADEO’s risk tiers (L0–L3).
Drive performance & cost efficiency: benchmark latency (TTFT/TBT), optimize token usage, implement caching, quantization (AWQ/GPTQ), and FinOps telemetry per Service ID.
Collaborate across the gated lifecycle (G0→G4): contribute to Service Design Docs, NFR validation, UAT sign-offs, and Operate/Improve reviews with evidence-backed KPIs (grounding ≥80%, RAGAS, WER, etc.).
Support hybrid architectures: compose GenAI with classical ML (scikit-learn/XGBoost), rules engines, and external tools (SQL, GIS, KG) via MCP or function calling.

Required Qualifications

7+ years in software engineering, with 4+ years in AI/ML engineering or MLOps in production environments.
Expert in C# (.NET Minimal APIs) and Python (FastAPI, LangChain, LlamaIndex, PyTorch).
Proven experience building AI microservices on Kubernetes with GPU scheduling, service mesh (Istio/Linkerd), and infrastructure-as-code (Helm/Terraform).
Hands-on deployment of on-prem LLM inference stacks (vLLM, TGI, Triton) and replication of cloud/NVIDIA NIM patterns in private data centers.
Strong grasp of MLOps: model/prompt/dataset versioning, CI/CD for AI, shadow/canary deployments, drift detection, and SLO-driven alerting.
Experience with RAG systems: embedding models (Qwen, Jina), vector DBs (Qdrant, FAISS), rerankers (ColBERT), and retrieval evaluation (hit-rate, RAGAS).
Familiarity with evaluation-by-design: automated scoring, human-in-the-loop validation, and promotion gates.
Understanding of security & compliance: zero-trust networking, PII handling, auditability, and risk-tiered governance.
Excellent algorithmic thinking, system design skills, and a performance-first mindset.

Preferred Qualifications

Experience fine-tuning Arabic LLMs (AraBERT, Qwen-Arabic) or working with Gulf dialects.
Contributions to open-source AI tooling or MLOps platforms.
Knowledge of .NET for AI scenarios (ML.NET, gRPC interop with Python services).
Exposure to agent frameworks (LangGraph) and multi-step agentic workflows.
Experience in government, defense, or highly regulated sectors with strict data sovereignty requirements.

You’ll Thrive If You…

Believe services > projects and reuse > reinvention.
Care deeply about production reliability, user safety, and measurable impact.
Enjoy working in a platform-led, catalogue-driven environment with clear Service IDs, roadmap phases, and governance gates.
Are energized by Arabic-first AI, hybrid cloud/on-prem complexity, and building the rails others ship on.

Technology Context (ADEO Stack)

Languages: C#, Python, TypeScript
AI Serving: vLLM, Triton, NVIDIA NIM, Azure ML Endpoints
Orchestration: LangChain, LlamaIndex, LangGraph
Vector & Search: Qdrant, FAISS, Elasticsearch
Infra: Kubernetes, Istio, Helm, Terraform, NVIDIA GPU Operators
Registries: MLflow, Git-backed prompt/dataset registries
Observability: Prometheus/Grafana, OpenTelemetry, Jaeger
Security: mTLS, Vault, RBAC by Service ID, AI Gateway policy routing

Impact

Your work will directly enable Cognitive ADEO, Dossier, and dozens of enterprise use cases—through a single, governed, and scalable AI service catalogue. You’ll help turn AI from experiments into institutional capability.

Halian Group: With over 28 years of experience, we have come to understand that innovation is the only way to provide agile, practical solutions that transform businesses and careers. Our resourcing and smart services help you to realize tomorrow’s potential. Discover the amazing things possible when you bring the right people and the right technologies together.

At Halian, we recognize that diversity, equity, and inclusion (DEI) are essential to building high-performing teams for our clients. We are committed to connecting organizations with top talent from all backgrounds, ensuring that every individual feels valued, respected, and empowered to contribute their unique perspectives. We encourage applications from all qualified candidates, regardless of race, gender, disability, or any other characteristic that makes them unique. By fostering diverse and inclusive workplaces, we help our clients drive innovation, enhance collaboration, and better reflect the communities they serve.
