What you will do

Lead technical discovery for AI / GPU prospects: workload type, dataset size, latency targets, parallelism strategy.
Recommend the right hardware mix — H100 / H200 / L40S / RTX 6000 Ada / A100 — with capacity, power, and budget tradeoffs.
Recommend the right software stack across open-source (PyTorch DDP / FSDP, DeepSpeed, vLLM, Triton) and licensed (NVIDIA AI Enterprise).
Write SoWs and architecture diagrams in lockstep with B2B Sales and the AI / GPU Infrastructure Engineer.
Run pre-sales POCs: spin up a sample training run, benchmark inference throughput, share signed results.
Stay current on the LLM / AI-infra landscape; brief the team monthly on what changed and what it means for our pricing.

What we need from you

3+ years working with ML / GPU infrastructure as an engineer or solutions architect.
Comfort across modern AI stacks: PyTorch, the Hugging Face ecosystem, vLLM, and Triton, plus a working awareness of JAX.
Solid grounding in distributed training (data / tensor / pipeline parallel) and inference patterns (batching, KV cache, quantization).
Strong written and spoken communication: you can explain "H100 SXM vs L40S PCIe" to a CIO without losing them.
Bahasa Indonesia and English; English-first is acceptable if you anchor regional / Singapore (SG) accounts.

Nice to have

Hands-on LLM fine-tuning (LoRA / QLoRA, full fine-tuning, RLHF / DPO).
Cost modeling for AI workloads (training $/token, inference $/1k tokens).
Experience with regulated AI use cases in Indonesia (financial services, government, healthcare).

What success looks like in 90 days

Three AI / GPU customer discovery cycles completed.
One signed POC, one more in late stage.
Reference architecture diagrams published for the top three AI use cases (training, fine-tuning, inference / RAG).

How to apply

Send your CV plus a short note (English or Bahasa Indonesia) telling us which two responsibilities you would tackle first and why. We read every application and reply within 7 days.

Apply → [email protected]