DataJobs.io

Job Description

A Principal Data Engineer leads safety analytics data engineering for Global Medical Safety, building scalable tools with AI, ML, and GenAI on Google Cloud Platform to support pharmacovigilance efforts.

Responsibilities

  • Design and maintain production data pipelines and curated datasets that enable pharmacovigilance activities, including safety monitoring, analytics, and regulatory reporting.
  • Ensure outputs are reproducible, explainable, and auditable to support safety decision making and inspection readiness.
  • Enable AI, ML, and GenAI workflows for safety analytics, covering feature engineering and feature stores, embeddings and semantic retrieval, and Retrieval-Augmented Generation patterns.
  • Own the end-to-end data lifecycle for safety analytics from source system intake through transformation, serving, and downstream consumption, ensuring continuity, traceability, and data integrity.
  • Lead architectural decisions across ingestion, transformation, storage, and serving layers on GCP, using tools such as BigQuery, Dataform, and object storage.
  • Design, implement, and automate scalable, reusable data pipelines and architectures to address evolving safety analytics needs.
  • Establish and enforce data quality, validation, lineage, and observability standards for safety analytics datasets.
  • Define and implement data governance practices, including data contracts, schema versioning, access control, stewardship, and lifecycle management.
  • Ensure safety analytics data and systems meet Global Medical Safety requirements for reliability, auditability, and regulatory use.
  • Apply GxP validation expertise to data pipelines, analytics services, and supporting infrastructure.
  • Collaborate with quality and compliance teams to implement CSV/CSA-aligned controls, audit trails, documentation, and organizational change.
  • Balance rapid delivery with the rigor required for regulated pharmacovigilance systems.
  • Design and build APIs and microservices to operationalize safety analytics and ML capabilities, including feature serving, retrieval services, and analytics backends.
  • Deploy and operate services on GCP (Cloud Run, GKE) with emphasis on security, scalability, and observability.
  • Enforce contract-first integration patterns between producing and consuming systems to ensure reliability and safe evolution.
  • Provision and manage cloud infrastructure using Terraform (Infrastructure as Code) on GCP.
  • Build and maintain CI/CD pipelines (e.g., Jenkins) for data pipelines, analytics services, feature pipelines, and ML data assets.
  • Continuously optimize performance and cost efficiency of data and analytics infrastructure while maintaining compliance and reliability.
  • Serve as the technical authority and data engineering leader for Safety Analytics within Global Medical Safety.
  • Review and influence designs across pipelines, services, feature stores, and AI/ML integrations to uphold a high technical standard.
  • Collaborate with safety scientists, epidemiologists, biostatisticians, analytics teams, IT, and platform partners.
  • Communicate complex technical concepts and tradeoffs clearly to both technical and non-technical stakeholders.
  • Mentor and upskill teams through guidance and knowledge sharing on modern data, cloud, and AI technologies.

Requirements

  • Master’s degree in Computer Science, Engineering, or a related field, or equivalent experience, is required.
  • Minimum of 5 years of data engineering experience.

Technologies

  • Python
  • SQL
  • Google Cloud Platform (GCP)
  • BigQuery
  • Dataform
  • Terraform
  • Jenkins
  • Cloud Run
  • GKE

Compensation

  • Base annual pay range: USD 102,000 – 177,100

Location and work arrangement

Location: Horsham, PA, onsite. Preferred location: Horsham, PA or Titusville, NJ. Remote work considered on a case-by-case basis.

Benefits

  • Consolidated retirement plan (pension)
  • Savings plan (401(k))
  • Vacation: 120 hours per calendar year
  • Sick time: 40 hours per calendar year; 48 hours (Colorado residents); 56 hours (Washington residents)
  • Holiday pay, including floating holidays: 13 days per calendar year
  • Work, personal and family time: up to 40 hours per calendar year
  • Parental leave: 480 hours within one year of birth/adoption/foster care
  • Bereavement leave: 240 hours for immediate family; 40 hours for extended family per year
  • Caregiver leave: 80 hours in a 52-week rolling period
  • Volunteer leave: 32 hours per calendar year
  • Military spouse time-off: 80 hours per calendar year
