DataJobs.io

Job Description

Machine Learning Engineer for AWS Neuron Inference on Annapurna ML, onsite in Seattle, WA, focused on building blocks for Neuron distributed inference on Trn2/Trn3 and optimizing LLM inference performance.

Responsibilities

  • Develop, enable, and performance-tune core components across major ML model families, including Llama3, GPT OSS, Qwen3, DeepSeek, and more.
  • Build optimized building blocks for the Neuron distributed inference library, calibrating them to maximize performance on Trn2 and Trn3 servers.
  • Define metrics, implement automation, and drive improvements while identifying and resolving root causes of software defects.
  • Engage in design discussions, conduct code reviews, and communicate with internal and external stakeholders.
  • Collaborate cross-functionally with teams across Neuron in a fast-paced, startup-like development environment that adapts to evolving AI priorities.

Requirements

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience in design or architecture of new or existing systems (design patterns, reliability and scaling)
  • Experience programming in at least one software programming language
  • 3+ years of full software development lifecycle experience, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent

Technologies

  • Python
  • PyTorch
  • JAX
  • AWS Neuron

Benefits

  • Health insurance
  • Adoption and surrogacy reimbursement coverage
  • 401(k) matching
  • Paid time off
  • Parental leave

Description

AWS Neuron provides the complete software stack for AWS Inferentia and Trainium cloud-scale ML accelerators, including the Trn2 and upcoming Trn3 servers that leverage them.

This role is for a software engineer on the Machine Learning Applications (ML Apps) team for AWS Neuron.

The engineer in this position develops, enables, and performance-tunes building blocks for all key ML model families, including Llama3, GPT OSS, Qwen3, DeepSeek, and beyond.

About the Team

  • Our team is dedicated to supporting new members.
  • We maintain a broad mix of experience levels and tenure, fostering an environment that emphasizes knowledge sharing and mentorship.
  • Senior members participate in one-on-one mentoring and thorough, constructive code reviews.
  • Career growth is a priority, with projects assigned to help engineers expand expertise and tackle increasingly complex tasks.
