DataJobs.io

Job Description

Join Lawrence Berkeley National Laboratory’s Joint Genome Institute in the San Francisco Bay Area with hybrid work options, a supportive culture, and a competitive compensation package. This role offers a salary range of USD 158,808 to 267,996 per year, strong health and retirement benefits, and a commitment to work-life balance and professional growth.

We are seeking a Lead Scientific Data Engineer to provide senior technical leadership for core genomic data systems, data management, job orchestration, and platform integration that drive AI-enabled scientific discovery.

Responsibilities

  • Provide senior technical leadership for JGI's core scientific data and compute platforms by crafting implementation roadmaps, data system architectures, and long-term strategy.
  • Design and implement automated production systems, APIs, and workflows that enable genomic data movement, metadata management, job orchestration, data access, and large-scale scientific computing.
  • Improve reliability, scalability, observability, interoperability, and maintainability of shared production data systems while supporting sustainable operations and delivery.
  • Collaborate with product managers, scientists, and users to drive cross-team alignment and integration decisions that address complex technical challenges and shared priorities.

Requirements

  • Bachelor’s degree in Computer Science or a related field and a minimum of 12 years of professional experience with large-scale scientific data and compute infrastructures, or an equivalent combination of education and experience.
  • Proven experience leading the design, development, integration, and operation of production software and data systems that support metadata management, workflow orchestration, data lifecycle operations, and broad user data access.
  • Advanced knowledge of data and software engineering fundamentals relevant to data-intensive distributed systems, including system design, concurrency, performance, and testing.
  • Broad experience with databases and data storage technologies, including relational databases, object storage, and systems managing structured, semi-structured, and large-scale data.
  • Experience with data engineering and event-driven technologies such as Airflow or Kafka.
  • Strong experience using AI coding agents such as Claude Code, Codex, or Cursor, with the ability to review and validate generated software for production quality, security, and maintainability.
  • Proficiency in Python and experience with one or more additional programming languages.
  • Excellent communication skills, with the ability to present complex technical information to diverse audiences.
  • Demonstrated ability to lead through influence in interdisciplinary environments, aligning users, stakeholders, and engineering teams around shared requirements and implementation plans.

Technologies

  • Airflow
  • Kafka
  • Claude Code
  • Codex
  • Cursor
  • Python
  • WDL
  • Nextflow

Benefits

  • Exceptional health and retirement benefits, including pension or 401(k)-style plans
  • A culture of belonging with strong investment in team wellbeing and growth
  • Vacation and sick time plus a Winter Holiday Shutdown each year
  • Parental bonding leave for both mothers and fathers
  • Pet insurance
