You are a data engineer who thrives in a highly collaborative environment, partnering with product, analytics, and engineering teams to deliver high-quality, trusted data. You’re motivated by building scalable data systems and shaping how data is modeled, governed, and consumed across a modern cloud platform. You bring strong hands-on experience with data pipelines and are as comfortable improving existing systems as you are introducing better patterns, tools, and standards. You’re excited to work with modern data technologies and contribute to the evolution of a Databricks-based Lakehouse architecture. You enjoy translating complex product and user behavior into well-structured, reliable datasets that power analytics, experimentation, and decision-making.
What You Will Do:
- Partner with product analytics stakeholders to translate business-defined KPIs and data requirements into scalable, production-grade datasets
- Own the design, build, and operation of scalable data pipelines end-to-end (ingestion → transformation → serving)
- Own and evolve key components of our Lakehouse architecture, improving structure, scalability, and consistency across datasets
- Build and maintain production-grade, well-modeled datasets (Gold layer) that power analytics and AI use cases
- Develop and enforce reusable data engineering patterns, frameworks, and standards that reduce duplication and improve scalability
- Own data quality and reliability for production datasets, including validation, monitoring, SLAs, and incident resolution
- Productionize and scale prototype datasets and logic developed by analytics partners into reliable, maintainable data pipelines
- Build governed, purpose-built datasets to support AI/ML use cases while enforcing controlled and secure data access patterns
- Provide technical guidance to less experienced staff
- Participate in special projects and perform other duties as assigned
What You Bring With You:
- Strong experience building and operating data pipelines using SQL and Python in a modern cloud environment
- Deep expertise in SQL, including query optimization and performance tuning at scale
- Experience with Databricks, Spark, or similar distributed data processing frameworks
- Strong understanding of modern data architecture patterns, including Lakehouse architecture, ELT, and layered data models (bronze/silver/gold)
- Proven experience designing data models for analytics, including dimensional or domain-oriented approaches
- Experience driving database and data engineering best practices, including schema design, migrations, and performance optimization
- Ability to take ownership of technical solutions and drive implementation in environments with limited structure or support
- Experience building reusable data frameworks, enforcing standards, and improving engineering leverage across datasets and pipelines
- Experience owning production data systems, including monitoring, debugging, and resolving data pipeline failures
- Experience working closely with business stakeholders or analysts to translate ambiguous requirements into scalable data solutions
- Experience evaluating and adopting new data technologies and tools to improve scalability, reliability, and productivity
- Strong record of project execution and completion, including experience with agile development practices
- Experience developing source-controlled pipelines in a CI/CD environment
- Experience working with product analytics data (event tracking, user behavior, experimentation) is a plus
- Strong communication skills and ability to collaborate effectively across technical and non-technical teams
Education and Experience:
- Bachelor’s degree in Computer Science or a related discipline, or equivalent practical experience; Master’s degree preferred
- 8+ years of experience in data engineering or related fields, with a strong track record of owning and operating production data systems
- Hands-on experience building and maintaining scalable data pipelines and datasets in cloud-based environments (e.g., Databricks, AWS, GCP)
- Proven ability to design and evolve data models and data pipelines that support analytics at scale, including performance optimization and reliability improvements
- Experience taking data solutions from concept to production, including monitoring, debugging, and ongoing maintenance
- Demonstrated ability to operate independently, make technical decisions, and improve data platform standards in environments with limited structure