Vacancy: ML Engineer (Large Language Model)

Locations: London, Prague, Amsterdam | Remote or Hybrid | Full-time

We are an AI R&D team focused on cutting-edge applied research and building AI-driven products. Our recent work includes:

Leveraging test-time guided search for enhanced agent performance
Scaling task data collection for reinforcement learning in software engineering agents
Optimizing LLM training efficiency using agentic trajectories

One of our flagship projects is an AI inference and fine-tuning platform that supports scalable, fast, and cost-effective deployment of AI models.

We are looking for Senior and Staff-level Machine Learning Engineers with deep expertise in high-performance computing and distributed systems to build and optimize robust, scalable training and inference pipelines for large AI models.

Key Responsibilities:

Design and implement large-scale training and inference pipelines (data, tensor, context, expert, pipeline parallelism)
Optimize inference performance using techniques such as speculative decoding (Medusa, EAGLE, etc.), CUDA Graphs, and compilation-based methods
Build custom CUDA/Triton kernels for performance-critical operations
Collaborate closely with researchers and infrastructure teams to ensure scalability and performance of AI products

Requirements:

Strong theoretical background in Machine Learning
Deep understanding of training/inference performance optimization for large neural networks (parallelism, attention, batching, offloading, etc.)
Expertise in at least one of the following:
- Writing high-performance custom CUDA/Triton GPU kernels
- Distributed training and parallelism at scale
- Inference optimization (paged attention, continuous batching, speculative decoding, etc.)
Solid software engineering skills (Python-centric stack)
Experience with modern ML frameworks (PyTorch, JAX)
Familiarity with CI/CD, version control, and testing best practices
Strong communication skills and ability to work independently

Nice to Have:

Experience with modern LLM inference stacks (vLLM, SGLang, TensorRT-LLM, Dynamo)
Understanding of LLM core concepts (Flash Attention, MoE, RoPE, ZeRO, quantization, etc.)
Master’s/PhD in CS, AI, Data Science, or related field
Previous experience delivering production-grade products in startup-like environments
Experience with complex engineering systems or distributed data pipelines
Contributions to open-source projects

