Machine Learning Engineer
Job Description
Robert Half is seeking a Machine Learning Engineer to work onsite in Los Angeles, deploying and operating scalable ML infrastructure and GenAI/LLM systems. The role centers on MLOps platforms, feature stores, vector search, RAG, and CI/CD automation on Databricks. The position offers an annual salary of USD 200,000 to 260,000, a comprehensive benefits package, and opportunities for professional development.
Benefits
- Medical insurance
- Vision insurance
- Dental insurance
- Life insurance
- Disability insurance
- 401(k) plan
- Free online training
Responsibilities
- Lead the design, implementation, and ongoing maintenance of scalable ML infrastructure on Databricks, including MLflow for experiment tracking, a model registry, and model serving endpoints.
- Oversee the development of the MLOps platform and automated pipelines for deploying, monitoring, and maintaining models in production environments.
- Implement robust solutions for model versioning, systematic retraining, and artifact management using Databricks Unity Catalog for ML governance.
- Design and manage Databricks Feature Store to ensure consistent feature engineering across training and inference pipelines.
- Architect and implement Retrieval-Augmented Generation (RAG) systems for document Q&A, enabling business teams to query fund documents, investor letters, and market research.
- Design, deploy, and manage vector database solutions (Databricks Vector Search, Pinecone, or similar) for semantic search and retrieval across enterprise documents.
- Lead LLM fine-tuning and customization initiatives, training models such as Claude or open-source alternatives on CIM proprietary data while ensuring data privacy and compliance.
- Develop and optimize document processing pipelines, including PDF parsing, chunking strategies, and embedding generation for RAG applications.
- Implement prompt engineering best practices and LLM evaluation frameworks to ensure output quality, relevance, and factual accuracy.
- Build guardrails and safety measures for GenAI applications, including hallucination detection, output validation, and source attribution.
- Design and implement extensive automation across the ML workflow, covering model training, testing, validation, and deployment using Databricks Workflows and Asset Bundles.
- Set up robust CI/CD pipelines for both traditional ML models and GenAI applications, leveraging GitHub Actions, Azure DevOps, or similar tools.
- Automate complex data and model workflows utilizing orchestration tools such as Airflow, Prefect, or Databricks Workflows.
Technologies
- Databricks
- MLflow
- Databricks Unity Catalog
- Databricks Feature Store
- Databricks Vector Search
- Pinecone
- Claude
- GitHub Actions
- Azure DevOps
- Airflow
- Prefect
- Databricks Workflows
- Databricks Asset Bundles
- Python
- TensorFlow