What you will do
- Lead technical discovery for AI / GPU prospects: workload type, dataset size, latency targets, parallelism strategy.
- Recommend the right hardware mix — H100 / H200 / L40S / RTX 6000 Ada / A100 — with capacity, power, and budget tradeoffs.
- Recommend the right software stack across open-source (PyTorch DDP / FSDP, DeepSpeed, vLLM, Triton) and licensed (NVIDIA AI Enterprise).
- Write SoWs and architecture diagrams in lockstep with B2B Sales and the AI / GPU Infrastructure Engineer.
- Run pre-sales POCs: spin up a sample training run, benchmark inference throughput, share signed results.
- Stay current on the LLM / AI-infra landscape; brief the team monthly on what changed and what it means for our pricing.
What we need from you
- 3+ years working with ML / GPU infrastructure as an engineer or solutions architect.
- Comfort across modern AI stacks: PyTorch, the Hugging Face ecosystem, vLLM, and Triton, plus working awareness of JAX.
- Solid grounding in distributed training (data / tensor / pipeline parallel) and inference patterns (batching, KV cache, quantization).
- Strong written and spoken communication: you can explain "H100 SXM vs L40S PCIe" to a CIO without losing them.
- Bahasa Indonesia and English; English-first is acceptable if you anchor regional / Singapore (SG) accounts.
Nice to have
- LLM fine-tuning hands-on (LoRA / QLoRA, full FT, RLHF / DPO).
- Cost-modeling for AI workloads (training $/token, inference $/1k tokens).
- Indonesian regulated AI use cases (financial services, government, healthcare).
What success looks like in 90 days
- Three AI / GPU customer discovery cycles run.
- One signed POC, one more in late stage.
- Reference architecture diagrams published for the top three AI use cases (training, fine-tuning, inference / RAG).
How to apply
Send your CV plus a short note (English or Bahasa Indonesia) telling us which two responsibilities you would tackle first and why. We read every application and reply within 7 days.
Apply → [email protected]