Data Engineer
Job Description
The Data Engineer is responsible for designing, developing, and maintaining scalable data pipelines and architectures that enable data-driven decision making. Candidates must be able to obtain a security clearance.
Responsibilities
- Design, implement, and optimize end-to-end ETL pipelines to move data between systems, leveraging Informatica, Talend, and shell scripting for automation.
- Build and maintain scalable data warehouses and data lakes on platforms such as Azure Data Lake and Hadoop ecosystems to support large-scale analytics.
- Develop and tune complex SQL databases (Microsoft SQL Server, Oracle, and related systems) to ensure performance, security, and reliability.
- Collaborate with cross-functional teams to capture data requirements and translate them into technical solutions using Python, Java, Bash, and RESTful APIs.
- Utilize big data frameworks like Hive, Spark, and Hadoop to process large datasets while preserving data quality and consistency.
- Integrate diverse data sources to enrich datasets for analysis; use Looker for visualization and reporting.
- Support model training and analysis by delivering clean, structured datasets, and contribute to data model improvements through iterative testing.
- Participate in Agile development cycles and document processes for ongoing maintenance.
Requirements
- Ability to obtain a security clearance.
- Experience designing and implementing large-scale data pipelines on AWS using services such as AWS Glue and S3; familiarity with Azure Data Lake is a plus.
- Strong programming skills in one or more of Python, Java, and VBA, along with Bash (Unix shell) scripting for automation.
- Extensive knowledge of SQL databases, including Microsoft SQL Server and Oracle; solid understanding of data warehouse concepts.
- Hands-on experience with big data frameworks in the Hadoop ecosystem (HDFS, Hive) and Spark.
- Proficiency with ETL tools such as Talend or Informatica; familiarity with RESTful API integration for data exchange.
- Experience using analytics platforms like Looker to create dashboards and reports that drive business insights.
- Ability to design efficient database schemas and optimize query performance; strong data analysis skills.
- Knowledge of model training techniques for predictive analytics; experience in Agile environments preferred.
Technologies
- Informatica
- Talend
- Shell Scripting
- Azure Data Lake
- Hadoop
- HDFS
- Hive
- Spark
- Microsoft SQL Server
- Oracle
- Python
- Java
- VBA
- Bash (Unix shell)
- RESTful APIs
- Looker
- AWS Glue
- S3
- AWS
Benefits
- 401(k)
- Dental insurance
- Health insurance
- Paid time off
- Tuition reimbursement
- Vision insurance
Work location
- Onsite in Burlington, Vermont
Compensation
- USD 114,880.73 to 138,350.98 per year