Cloud Infrastructure and Service Orchestrator Architect
Gruve
About Gruve
Gruve is an innovative software services startup dedicated to transforming enterprises into AI powerhouses. We specialize in cybersecurity, customer experience, cloud infrastructure, and advanced technologies such as Large Language Models (LLMs). Our mission is to help our customers use their data to make more intelligent business decisions. As a well-funded early-stage startup, Gruve offers a dynamic environment with strong customer and partner networks.
About the Role
We are seeking an expert Data Center & Cloud Infrastructure and Service Orchestrator Architect to design and implement the service orchestration layer that will deploy and manage diverse workloads on top of our multi-region cloud infrastructure. This role focuses on creating the intelligent orchestration systems that automate the deployment, scaling, and management of applications, databases, AI/ML services, and other cloud services. This position is being hired to support a Gruve customer and may require on-site work at the customer’s location.
Key Responsibilities
Service Orchestration Platform Design
- Design comprehensive service orchestration platforms for automated workload deployment and management
- Architect API-driven service provisioning systems with self-service capabilities
- Design multi-tenant service isolation and resource allocation frameworks
- Create service lifecycle management systems including deployment, scaling, updates, and decommissioning
Workload Orchestration Architecture
- Design orchestration systems for diverse workload types:
  - Virtual machine provisioning and management
  - Container orchestration using Kubernetes
  - Database service deployment (SQL, NoSQL, distributed databases)
  - Message queue services (Kafka, RabbitMQ, Apache Pulsar)
  - GPU-accelerated AI/ML services and model inference platforms
  - Large Language Model (LLM) fine-tuning and inference services, similar to AWS Bedrock
AI/ML Service Orchestration
- Architect AI/ML pipeline orchestration for model training, validation, and deployment
- Design GPU resource scheduling and allocation systems for distributed training
- Create model serving infrastructure with auto-scaling and load balancing
- Design MLOps platforms for continuous integration and deployment of ML models
- Architect LLM inference services with dynamic scaling and cost optimization
Service Discovery and Integration
- Design service mesh architectures for microservices communication
- Architect API gateway and service proxy solutions
- Create service discovery, configuration management, and secrets management systems
- Design inter-service communication patterns and protocols
Automation and DevOps Integration
- Design CI/CD pipelines integrated with service orchestration platforms
- Architect GitOps workflows for declarative service management
- Create policy-based governance and compliance automation
- Design cost management and resource optimization automation
Basic Qualifications
- 12+ years of experience building and managing distributed systems and service orchestration architectures
- 8+ years of hands-on expertise in Kubernetes and container orchestration
- Experience designing and deploying AI/ML infrastructure using platforms like Kubeflow and model serving tools such as NVIDIA Triton or TorchServe
- Proficiency in Go and Python, including automation scripting in Bash or Python
- Bachelor's degree in Computer Science or a related field, with strong system design and architecture capabilities
Preferred Qualifications
- You have experience building PaaS or IaaS offerings for internal or external developer platforms.
- You are familiar with edge computing paradigms and serverless technologies, including function-as-a-service frameworks.
- You hold certifications such as CKA, CKAD, CKS, or equivalent credentials from major cloud providers.
- You’ve worked on cloud cost optimization initiatives and have hands-on experience with FinOps practices.
- You hold a Master’s degree in distributed systems, computer science, or a closely related discipline.
Why Gruve
At Gruve, we foster a culture of innovation, collaboration, and continuous learning. We are committed to building a diverse and inclusive workplace where everyone can thrive and contribute their best work. If you’re passionate about technology and eager to make an impact, we’d love to hear from you.
Gruve is an equal opportunity employer. We welcome applicants from all backgrounds and thank all who apply; however, only those selected for an interview will be contacted.