Lead Machine Learning Engineer
Job Description
Lead Machine Learning Engineer role at Capital One, onsite in McLean, VA, focused on productionizing ML applications and systems at scale.
Responsibilities
- Design, develop, and deliver ML models and components to address real-world business needs, collaborating with Product and Data Science teams
- Guide ML infrastructure choices using modeling considerations such as model type, data, feature engineering, training, hyperparameters, dimensionality, bias and variance, and validation
- Tackle complex problems by writing and testing application code, building and validating ML models, and automating tests and deployment
- Work within a cross-functional Agile team to create software that enables advanced big data and ML applications
- Retrain, monitor, and maintain models in production environments
- Leverage or build cloud-based architectures to deploy optimized ML models at scale
- Construct efficient data pipelines to feed ML models
- Apply continuous integration and deployment best practices, including test automation and monitoring, to ensure successful deployment of ML models and code
- Ensure code quality, governance, and risk controls, and promote Responsible and Explainable AI practices
- Program proficiently in Python, Scala, or Java
Requirements
- Bachelor's degree
- Minimum 6 years designing and building data-intensive solutions using distributed computing (internship experience does not apply)
- At least 4 years programming with Python, Scala, or Java
- Minimum 2 years building, scaling, and optimizing ML systems
Technologies
- Python
- Scala
- Java
- scikit-learn
- PyTorch
- Dask
- Spark
- TensorFlow
- AWS
- Azure
- Google Cloud Platform
Benefits
- Health benefits
- Financial benefits
- Other benefits
- Performance-based incentive compensation (cash bonuses and/or long-term incentives)
Preferred Qualifications
- Master's or Doctoral degree in computer science, electrical engineering, mathematics, or a related field
- 3+ years building production-ready data pipelines that feed ML models
- 3+ years hands-on experience with an industry-recognized ML framework such as scikit-learn, PyTorch, Dask, Spark, or TensorFlow
- 2+ years writing performant, resilient, and maintainable code
- 2+ years gathering and preparing data for ML models
- 2+ years people leadership experience
- 1+ years leading teams developing ML solutions using best practices, patterns, and automation
- Experience developing and deploying ML solutions in AWS, Azure, or Google Cloud Platform
- Experience designing, implementing, and scaling complex data pipelines for ML models and evaluating their performance
- Demonstrated ML industry impact through conference talks, papers, blog posts, open-source contributions, or patents
- Experience using interactive AI tooling to accelerate productivity beyond basic code completion