DataJobs.io

Job Description

The Data Scientist / Data Analytics Engineer role at Transflo is a remote position focused on designing, building, and operationalizing advanced analytics across transportation and logistics, delivering both predictive and point-in-time insights on AWS.

Responsibilities

  • Design, train, validate, and deploy predictive models spanning regression, classification, time-series forecasting, survival analysis, clustering, anomaly detection, and gradient-boosted or deep learning approaches as appropriate to the problem.
  • Lead model selection, hyperparameter tuning, cross-validation, and rigorous performance evaluation using business-aligned metrics such as precision/recall trade-offs, MAPE, RMSE, lift, and calibration.
  • Develop data products in transportation domains including operational metrics, fraud signals, pricing analytics, and industry trends.
  • Establish model monitoring, drift detection, retraining cadence, and explainability practices (SHAP, feature importance, partial dependence) to keep production models trustworthy.
  • Produce point-in-time analytics, KPI scorecards, and exception reporting to support daily decisions across dispatch, fleet, customer success, finance, and product teams.
  • Partner with business stakeholders to translate questions into well-scoped analyses and deliver clear, defensible insights with documented assumptions and data lineage.
  • Build and maintain reusable analytical datasets, semantic layers, and certified metrics to ensure a consistent source of truth.
  • Build and maintain data pipelines (batch and streaming) on AWS using Redshift, S3, Glue, Lambda, Step Functions, Kinesis, MSK, EMR, Athena, and SageMaker.
  • Implement medallion architecture to progressively refine raw operational data into analytics-ready and ML-ready datasets.
  • Apply STARR dimensional modeling to construct performant data models in Redshift and the warehouse layer.
  • Drive data selection, curation, profiling, and quality enforcement; define source-of-truth datasets, document lineage, and codify data contracts and validation tests.
  • Collaborate with data engineering and platform teams on CI/CD for data and ML assets, infrastructure as code, and cost-aware AWS design.
  • Turn customer-facing analytics ideas into shipped capabilities through partnerships with product management, design, and engineering.
  • Contribute to product discovery through customer interviews, opportunity sizing, prototyping, and rapid iteration of analytical concepts.
  • Own the analytical correctness of customer-facing metrics, models, and visualizations, including definitions, edge cases, and explanations for non-technical users.
  • Define and measure success metrics for analytics features and drive iterative improvements post-launch.
  • Translate complex analyses into clear narratives and visuals for technical and non-technical audiences, including executives and customers.
  • Partner cross-functionally with product, engineering, operations, and commercial teams to embed analytics into workflows and customer-facing products.
  • Mentor analysts and engineers on statistical rigor, modeling best practices, and modern data architecture.

Requirements

  • Bachelor's degree in Statistics, Mathematics, or Supply Chain Management; Computer Science is acceptable. Master's degree preferred but not required.
  • Professional experience in transportation, trucking, freight, logistics, or related supply chain fields, with working knowledge of operational data (loads, stops, shipments, ELD/telematics, TMS, dispatch, billing, etc.).
  • Proven track record launching customer-facing analytics products from idea through production, including discovery, scoping, model and metric design, collaboration with product/engineering, and production support with real customers. An end-to-end example is expected.
  • Strong experience building end-to-end analytics models, including problem framing, data curation, feature engineering, model training and validation, and deployment.
  • Hands-on experience with AWS PaaS and analytics tooling, including Redshift and related services (S3, Glue, Lambda, Step Functions, Athena, Kinesis, EMR, SageMaker).
  • Proficiency in SQL (advanced window functions, performance tuning on Redshift or similar) and at least one analytics programming language such as Python, with libraries like pandas, scikit-learn, statsmodels, XGBoost/LightGBM, and PyTorch or TensorFlow as appropriate.
  • Experience designing and operating production data pipelines with attention to orchestration, idempotency, observability, and data quality.
  • Solid grounding in statistical methods including hypothesis testing, experimental design, regression, time-series analysis, and uncertainty quantification.

Technologies

  • AWS, Redshift, S3, Glue, Lambda, Step Functions, Kinesis, MSK, EMR, Athena, SageMaker
  • Python, pandas, scikit-learn, statsmodels, XGBoost, LightGBM, PyTorch, TensorFlow
  • Jupyter, SQL
  • BI / Visualization tools: QuickSight, Power BI, Looker
  • Orchestration / DevOps: Airflow, Git, CI/CD, Terraform, CloudFormation
  • Medallion architecture, STARR modeling

Preferred Qualifications

  • Master's degree in Statistics, Mathematics, Operations Research, Supply Chain, Computer Science, or a closely related quantitative field.
  • Experience implementing medallion architecture in a cloud data lakehouse or warehouse environment.
  • Experience designing STARR or star-schema dimensional models for analytics consumption.
  • Experience with streaming or event-driven data (Kinesis, Kafka/MSK) for near real-time analytics in transportation contexts.
  • Experience deploying and monitoring ML models in production using SageMaker, MLflow, or equivalent MLOps tooling.
  • Familiarity with BI visualization tools and semantic layer concepts.
  • Exposure to optimization or operations research techniques applied to transportation problems.
  • Experience with ELD/HOS data, telematics feeds, geospatial data, or TMS/dispatch data and transportation back-office operations.

Core Competencies

  • Analytical rigor and the ability to defend methodology, assumptions, and uncertainty.
  • Business pragmatism and the capacity to ship value quickly with practical models.
  • Product mindset for customer-facing analytics and willingness to iterate with product and engineering partners.
  • Engineering discipline, reproducibility, and data lineage awareness.
  • Stakeholder partnership and clear communication of trade-offs.
  • Curiosity and ownership in identifying data quality issues and driving root-cause resolution.

Representative Tech Environment

  • Cloud and Data Platform: AWS stack including Redshift, S3, Glue, Lambda, Step Functions, Athena, Kinesis, EMR, SageMaker
  • Modeling and Analysis: Python with pandas, scikit-learn, statsmodels, XGBoost/LightGBM, PyTorch/TensorFlow; SQL; Jupyter
  • Data Architecture: Medallion approach; STARR models; data contracts and lineage tooling
  • Orchestration and DevOps: Airflow, Step Functions, Git, CI/CD, Terraform or CloudFormation
  • Visualization: QuickSight, Power BI, Looker
