Arcadia.io

Lead Analytics Engineer - Data Modeling & Quality

Remote | $160k - $185k/yr | Full time | Posted 9d ago

Job Description

Arcadia seeks a Lead Analytics Engineer specializing in data modeling and data quality to own the DBT and SQL layer of our healthcare analytics platform. The role transforms clinical and claims data into trusted datasets while championing data quality ownership and cross-functional collaboration.

Responsibilities

  • Develop, review, and maintain DBT models implemented on Spark with Hudi across the data journey from ingestion to bronze and silver layers.
  • Guide clients through their data model, assumptions, and limitations via intentional validation.
  • Diagnose and resolve issues, then implement DBT tests to prevent regressions.
  • Enhance SQL performance for slow-running jobs.
  • Collaborate with Data Engineering on Hudi table design, partition strategies, and incremental patterns.
  • Triage data quality alerts and classify issues as source-level or transformation-level.
  • Create and maintain volume monitors and data quality monitors, including null rate, distribution, and future-date checks.
  • Define and apply clinical data quality rules (entity volume, field coverage, LOINC coverage, referential integrity) and claims validation rules across silver and gold layers.
  • Perform quality reviews for connector promotions, assessing silver entity coverage, validation rule pass rates, and bronze-to-silver transformation accuracy.
  • Own the ticket queue for data quality, attribution, hierarchy, and customer-specific data quality issues, delivering clear customer-facing findings.
  • Lead data quality reviews during connector installation and promotions (UAT/PRD), including claims validation playbooks and null analysis.
  • Collaborate with Data Engineering on root-cause triage for errors, ingress anomalies, and silver table issues surfaced by data quality monitoring.
  • Coordinate with the Measure Implementation Team when data quality issues affect quality measure scores.
  • Contribute to and enforce data modeling standards across teams.

Tech Stack

  • Data modeling: DBT-Spark, SQL, Claude.
  • Warehousing: Amazon Redshift, Apache Hudi, AWS Athena.
  • Data quality: volume and DQ monitors, DBT tests.
  • Orchestration: Argo Workflows, Airflow.
  • Source control: Git, GitHub, PR-based review workflows.
  • Observability: Grafana, Loki, Jira.
  • Healthcare data domains: Claims (plan, professional, pharmacy), EHR entities, and MPI.

Requirements

  • Bachelor's or master's degree in computer science, statistics, business, economics, or a related field.
  • Advanced SQL skills, including window functions, complex CTEs, aggregation patterns, and performance tuning on columnar databases.
  • Hands-on DBT experience authoring models, tests, macros, and YAML documentation; familiarity with incremental strategies.
  • Working knowledge of healthcare data, including claims data (professional, institutional, pharmacy), clinical data (EHR entities), and quality dimensions (member months, coverage rates, null patterns).
  • Data quality mindset with the ability to differentiate source data issues from transform issues, design systematic validation checks, and clearly communicate findings.
  • Clear communication skills to translate technical findings for clients and non-technical stakeholders.
  • Strong analytical judgment for interpreting distributions and spotting anomalies.
  • Ability to manage multiple projects simultaneously, leveraging AI tooling to stay organized and efficient.
  • Genuine interest in learning and applying AI tools to improve operational efficiency.

Technologies

  • DBT-Spark
  • SQL
  • Claude
  • Amazon Redshift
  • Apache Hudi
  • AWS Athena
  • Argo Workflows
  • Airflow
  • Git
  • GitHub
  • Grafana
  • Loki
  • Jira
  • Healthcare data concepts and standards

Benefits

  • Collaborate with a talented team on complex healthcare data challenges.
  • Flexible, fully remote work environment with resources and support to perform at your best.
  • Exposure to senior leaders within the organization.
  • Be on the front lines of AI adoption, using cutting-edge tools to accelerate work and shape team operations in an AI-first environment.
  • Impact healthcare data operations by improving data quality, reliability, and trust for patient care decisions.
  • Join a mission-driven company transforming the healthcare industry.
  • Be part of the Arcadian Community, a diverse and energized group.

About Arcadia

Arcadia.io helps providers and payers across the country transform healthcare by aggregating large volumes of data, applying analytics to identify opportunities to improve patient care, and delivering actionable insights to clinicians at the point of care in near real time. The company is growing rapidly and positioned as a leader in the healthcare data space.

Protect Yourself

If you have concerns about the authenticity of a job offer or recruitment communications claiming to be from Arcadia, verify by calling (781) 202-3600 and selecting option 3. For more information, visit our website. This position requires adherence to all security policies to protect PHI and Arcadia intellectual property.

Would Love For You To Have

  • Experience with Spark SQL and the Hudi table format.
  • Familiarity with data quality monitoring tools.
  • Comfort operating in an AI-first environment using Claude to build and verify workflows.
  • Exposure to population health analytics concepts such as HEDIS measures, risk adjustment, and value-based care metrics.
  • Python scripting for data investigation and automation.
  • Experience with Argo Workflows or similar orchestration platforms.
  • Knowledge of healthcare data standards including ICD-10, CPT, NDC, LOINC, and NPI.
