Locations: London, Prague, Amsterdam | Remote or Hybrid | Full-time
We are an AI R&D team focused on cutting-edge applied research and building AI-driven products. Our recent work includes:
Leveraging test-time guided search for enhanced agent performance
Scaling task data collection for reinforcement learning in software engineering agents
Optimizing LLM training efficiency using agentic trajectories
One of our flagship projects is an AI inference and fine-tuning platform that supports scalable, fast, and cost-effective deployment of AI models.
We are looking for Senior and Staff-level Machine Learning Engineers with deep expertise in high-performance computing and distributed systems to build and optimize robust, scalable training and inference pipelines for large AI models.
Responsibilities:
Design and implement large-scale training and inference pipelines (data, tensor, context, expert, and pipeline parallelism)
Optimize inference performance using advanced techniques such as speculative decoding (Medusa, EAGLE, etc.), CUDA Graphs, and compile-based methods
Build custom CUDA/Triton kernels for performance-critical operations
Collaborate closely with researchers and infrastructure teams to ensure scalability and performance of AI products
Requirements:
Strong theoretical background in Machine Learning
Deep understanding of training/inference performance optimization for large neural networks (parallelism, attention, batching, offloading, etc.)
Expertise in at least one of the following:
Writing high-performance custom CUDA/Triton GPU kernels
Distributed training and parallelism at scale
Inference optimization (paged attention, continuous batching, speculative decoding, etc.)
Solid software engineering skills (Python-centric stack)
Experience with modern ML frameworks (PyTorch, JAX)
Familiarity with CI/CD, version control, and testing best practices
Strong communication skills and ability to work independently
Nice to have:
Experience with modern LLM inference stacks (vLLM, SGLang, TensorRT-LLM, Dynamo)
Understanding of LLM core concepts (Flash Attention, MoE, RoPE, ZeRO, quantization, etc.)
Master’s/PhD in CS, AI, Data Science, or related field
Previous experience delivering production-grade products in startup-like environments
Experience with complex engineering systems or distributed data pipelines
Contributions to open-source projects