Incident Controller
Gruve
About Gruve
Gruve is an innovative software services startup dedicated to transforming enterprises into AI powerhouses. We specialize in cybersecurity, customer experience, cloud infrastructure, and advanced technologies such as Large Language Models (LLMs). Our mission is to help our customers use their data to make more intelligent business decisions. As a well-funded early-stage startup, Gruve offers a dynamic environment with strong customer and partner networks.
About the Role
Global operational authority for Major Incident & Problem Management across enterprise IT. Leads critical outage response, governs ITIL processes, drives infrastructure stability, and mentors regional Incident Controllers across APAC, EMEA, and NLAM. Acts as the single point of operational control, ensuring rapid restoration, technical coordination, executive communication, and elimination of root causes.
Key Responsibilities
Global Major Incident Leadership
- Command Sev1/Sev2 enterprise-wide outages and multi-region technical bridge calls.
- Make real-time, risk-based decisions, provide executive updates, and escalate to vendors as needed.
- Chair PMIRs within 24 hours and ensure corrective/preventive actions are followed through.
Incident & Problem Management Governance
- Standardize global incident classification, escalation, and priority frameworks.
- Audit records, enforce ITIL compliance, and integrate Incident, Change, and Problem processes.
- Define communication protocols and ensure SLA/OLA adherence.
Technical Oversight & Problem Management
- Guide cross-domain resolution: network, security, cloud, virtualization, storage, databases, middleware, SaaS.
- Validate root causes, challenge temporary fixes, and sponsor automation/resilience initiatives.
- Lead RCAs, approve reports, maintain Known Error Database, and track remediation programs.
Operational Command & Team Leadership
- Lead regional Incident Controllers and NOC teams; establish 24x7 coverage, RACI models, and handover best practices.
- Mentor teams and serve as the escalation point for unresolved P1/P2 incidents.
Executive Communication & Service Resilience
- Deliver concise updates on impact, risk, recovery, ETA, and preventive actions.
- Interface with Service Owners, Business Relationship Managers, and senior IT leadership.
- Own KPIs (MTTR, incident frequency, SLA compliance, vendor performance) and drive stability improvement initiatives.
Basic Qualifications
- Bachelor’s degree in Information Technology, Computer Science, or a related field.
- ITIL Foundation (minimum); ITIL Intermediate/Expert preferred.
- 8+ years of experience in Incident and/or Problem Management in a global enterprise environment.
- Experience managing major incidents across multi-region infrastructure environments.
- Strong understanding of infrastructure domains (Network, Cloud, Server, Database, End User, Applications).
- Experience working with third-party managed service providers.
- Proven ability to operate in a 24x7, high-pressure environment.
- Comfortable working in a contract-based role.
Preferred Qualifications
- Strong leadership in high-pressure and major incident situations.
- Excellent stakeholder communication at technical and executive levels.
- Solid analytical and root-cause analysis skills.
- Effective decision-making, prioritization, and governance discipline.
- Ability to drive accountability across globally distributed teams.
- Experience collaborating across APAC, EMEA, and NLAM time zones.
- Willingness to work in a 24x7 global support environment (Strictly Night Shift / General PST Shift).
- Availability to join major incident calls outside standard business hours as required.
Why Gruve
At Gruve, we foster a culture of innovation, collaboration, and continuous learning. We are committed to building a diverse and inclusive workplace where everyone can thrive and contribute their best work. If you’re passionate about technology and eager to make an impact, we’d love to hear from you.
Gruve is an equal opportunity employer. We welcome applicants from all backgrounds and thank all who apply; however, only those selected for an interview will be contacted.