Responsibilities:
- Co‑create the AI vision, KPI tree, and prioritized use‑case portfolio with business leaders.
- Translate strategy into a delivery roadmap and budget, with explicit risk and dependency plans.
- Delivery leadership
  - Lead cross‑functional pods (data, platform, app, safety, SRE) from discovery through production.
  - Design A/B and canary rollout strategies; enforce guardrails and incident playbooks.
- AI system engineering
  - Architect and guide implementation of LLM/RAG/agent solutions, including retrieval quality, prompt/policy engineering, guardrails, and evaluation harnesses.
  - Drive observability (tracing, safety counters, cost telemetry) and SLO compliance.
- Governance and security
  - Stand up policy‑as‑code, model/prompt versioning, access controls, data residency, and vendor risk assessments.
  - Chair or collaborate with the governance board; run review gates.
- Stakeholder engagement
  - Communicate simply and often; convert ambiguity into decisions; manage expectations.
  - Run demos, drive evidence‑based decisions, and lead post‑incident reviews.
- Talent enablement
  - Mentor teams on AIOps/SRE practices; cultivate champions; reduce burden through automation.
Must‑have skills:
- Leadership and ownership
  - Operates with high autonomy, bias to action, and accountability for outcomes.
  - Proven ability to align executives and guide cross‑functional teams without formal authority.
- Communication and influence
  - Excellent written and verbal communication and meeting facilitation skills.
  - Translates technical topics (LLMs, safety, SLOs) into business terms and tradeoffs.
- Product and delivery thinking
  - Evidence‑driven decisions; comfort with A/B testing, canary rollouts, and ROI models.
- Governance and security mindset
  - Practical understanding of data governance, privacy, and AI safety guardrails; policy‑as‑code.
- Hands‑on AI systems integration
  - Experience integrating GenAI (LLMs/RAG/KG/agents) into real products with telemetry, guardrails, and rollback.
- AIOps and reliability fundamentals
  - SLI/SLO design, error budgets, incident management, observability, CI/CD for prompts/policies/indexes.
- Manufacturing/OT/IoT/Edge AI experience; familiarity with device data and shop‑floor constraints.
- Microsoft Azure: Azure OpenAI, Cognitive Search, API Management, App Service/AKS, Functions, Event Hubs, Key Vault, Monitor; Azure ML or equivalent.
Nice‑to‑have skills:
- SAP ecosystem awareness (SAP S/4HANA processes and integration points for AI governance).
- Data and knowledge systems
  - Knowledge graphs/ontologies, hybrid retrieval (vector + keyword), embeddings, data contracts.
- Safety and compliance
  - Red‑teaming methods, PII/PHI handling, content moderation pipelines, audit trails.
- Cost and performance engineering
  - Token/call cost controls.
Experience profile:
- 7–12+ years in software/AI product delivery with 3+ years leading cross‑functional initiatives.
- Track record of shipping AI or data‑intensive systems to production in enterprise settings.
- Demonstrated practice of Site Reliability Engineering (SRE)/AIOps concepts (SLOs, incident response, observability).
