DataJobs.io

Job Description

The Senior PySpark Data Engineer will design, develop, and maintain data solutions within a Big Data environment, using PySpark and Python to build scalable pipelines, ensure data quality, and migrate data from on-premises systems to cloud deployments. This onsite role is based in Irving, Texas, with Tata Consultancy Services.

Responsibilities

  • Architect and maintain robust, scalable data pipelines using PySpark to support high-performance data processing.
  • Develop data pipelines, ensure data quality, and implement ETL processes to migrate and deploy data across systems.
  • Translate Ab Initio ETL applications into PySpark-based data pipelines.
  • Migrate on-premises workloads to cloud environments (AWS, Databricks, Snowflake) according to use case requirements.
  • Collaborate with cross-functional teams to identify and resolve data-related issues.
  • Stay informed of the latest advancements in data engineering and integrate innovative approaches to maintain a competitive edge.

Requirements

  • 8+ years of professional experience in Hadoop and PySpark/Python development.
  • Proven expertise in PySpark with experience handling large volumes of data.
  • 3+ years of hands-on experience with AWS, Databricks/Snowflake, and Airflow.
  • Familiarity with CI/CD pipelines and version control systems such as Git.
  • Strong debugging and problem-solving skills.
  • Excellent communication and collaboration abilities.

Technologies

  • PySpark
  • Python
  • Hadoop
  • Ab Initio
  • AWS
  • Databricks
  • Snowflake
  • Airflow
  • Git
  • Docker
  • AWS EKS

Location

Irving, TX (onsite)

Job Function

Technology

Role

Engineer

Job ID

411835

Salary

USD 120,000 - 140,000 per year

Desired Skills

  • Hadoop

Desired Candidate Profile

  • Bachelor's degree in Computer Science
