Lead Data Engineer - Data Transformation (Modeling and Architecture)
Job Description
Capital One is seeking a Lead Data Engineer focused on data transformation, modeling, and architecture. The role centers on designing scalable data models, enforcing governance, and delivering AI-ready architectures across Data Lake and Data Warehouse environments on cloud platforms.
Responsibilities
- Develop and maintain comprehensive data models across conceptual, logical, and physical layers to support scalable architectures and strong data integrity across enterprise systems.
- Lead the design of the organization's data landscape by applying consumer-driven design principles, ensuring data structures reflect business realities and evolving needs.
- Architect and implement robust data ecosystem solutions, including Data Lake and Data Warehouse patterns, to meet diverse analytical and operational requirements.
- Support high-performance data pipelines and complex transformations using SQL, Spark, and Python to process large-scale datasets efficiently.
- Define and enforce rigorous data governance standards while managing metadata frameworks to ensure data compliance and discoverability.
- Translate complex technical concepts into actionable business insights, and lead initiatives independently while collaborating with stakeholders to achieve goals.
- Contribute to the evolution of the data ecosystem by designing AI-ready architectures and scalable solutions.
- Collaborate with Agile teams to design, develop, implement, and support technical solutions across the organization.
- Work with a team of developers experienced in machine learning, AI, distributed microservices, and full-stack systems.
- Share expertise by staying current with technology trends, experimenting with new tools, participating in internal and external communities, and mentoring other engineers.
- Partner with digital product managers to deliver robust cloud-based solutions that enable powerful experiences for millions of Americans seeking financial empowerment.
Technologies
- SQL, Spark, Python, Scala, Java
- MapReduce, Hadoop, Hive, EMR
- Kafka, Gurobi, MySQL
- MongoDB, Cassandra
- Redshift, Snowflake
- AWS, Microsoft Azure, Google Cloud
- UNIX/Linux, shell scripting
- Data Lake, Data Warehouse
Benefits
- Health benefits
- Financial benefits
- Performance-based incentives (cash bonuses and/or long-term incentives)
Basic Qualifications
- Bachelor’s Degree
- At least 4 years of experience in application development (internship experience does not apply)
- At least 2 years of experience with big data technologies
- At least 1 year of cloud computing experience with AWS, Microsoft Azure, or Google Cloud
Preferred Qualifications
- 4+ years of experience in Data Architecture / Data Modeling
- 7+ years of experience in application development including Python, SQL, Scala, or Java
- 4+ years of experience with a public cloud (AWS, Microsoft Azure, Google Cloud)
- 4+ years of experience with distributed data / computing tools (MapReduce, Hadoop, Hive, EMR, Kafka, Spark, Gurobi, or MySQL)
- 4+ years of experience working on real-time data and streaming applications
- 4+ years of experience with NoSQL implementations (MongoDB, Cassandra)
- 4+ years of data warehousing experience (Redshift or Snowflake)
- 4+ years of experience with UNIX/Linux including basic commands and shell scripting
- 2+ years of experience with Agile engineering practices
- Experience leveraging interactive AI tooling to accelerate productivity beyond basic code completion