DataJobs.io
Capital One

Lead Machine Learning Engineer (MLOps, KServe + building Kubernetes Clusters, PyTorch, TensorFlow on AWS)

Richmond, VA | $179k - $205k/yr | Full time | Posted 30d ago

Job Description

Capital One is seeking a Lead Machine Learning Engineer to scale ML solutions across the organization, focusing on architecture, model development, deployment, and operations within an Agile environment. This role centers on productionizing ML applications and systems, building robust data pipelines, and delivering ML capabilities at scale.

Location

Richmond, VA (onsite)

Compensation

Salary: USD 179,400 - 204,700 per year

Responsibilities

  • Design, build, and deliver ML models and components that address real-world business problems, partnering with Product and Data Science teams.
  • Inform ML infrastructure decisions using knowledge of modeling techniques, including model choice, data and feature selection, training, hyperparameter tuning, dimensionality, bias/variance, and validation.
  • Solve complex problems by writing and testing application code, developing and validating ML models, and automating tests and deployment.
  • Collaborate within a cross-functional Agile team to create and improve software for state-of-the-art big data and ML applications.
  • Retrain, maintain, and monitor models in production.
  • Leverage or build cloud-based architectures and platforms to deliver optimized ML models at scale.
  • Construct optimized data pipelines to feed ML models.
  • Apply continuous integration and continuous deployment best practices, including test automation and monitoring, to ensure successful deployment of ML models and code.
  • Ensure code quality and governance, manage risk for models, and adhere to Responsible and Explainable AI practices.
  • Use programming languages such as Python, Scala, or Java.

Requirements

  • Bachelor’s degree
  • At least 6 years of experience designing and building data-intensive solutions using distributed computing (internship experience does not apply)
  • At least 4 years of experience programming with Python, Scala, or Java
  • At least 2 years of experience building, scaling, and optimizing ML systems

Technologies

  • KServe, Kubernetes, PyTorch, TensorFlow
  • AWS, Python, Scala, Java
  • scikit-learn, Dask, Spark

Benefits

  • This role is eligible for performance-based incentive compensation, which may include cash bonuses and/or long-term incentives (LTI). Incentives can be discretionary or non-discretionary depending on the plan.
  • Health, financial, and other benefits that support total well-being.

Team Description

The Intelligent Foundations and Experiences (IFX) team is at the center of bringing our vision for AI at Capital One to life. We work hand-in-hand with our partners across the company to advance the state of the art in science and AI engineering, and we build and deploy proprietary solutions that are central to our business and deliver value to millions of customers. Our AI models and platforms empower teams across Capital One to enhance their products with the transformative power of AI, in res
