Data Engineering Architect

Bloomberg Industry Group • Full-time • Arlington, TX, US • $120k - $150k / year

You are a data engineer who thrives in a highly collaborative environment, partnering with product, analytics, and engineering teams to deliver high-quality, trusted data. You’re motivated by building scalable data systems and shaping how data is modeled, governed, and consumed across a modern cloud platform. You bring strong hands-on experience with data pipelines and are as comfortable improving existing systems as you are introducing better patterns, tools, and standards. You’re excited to work with modern data technologies and contribute to the evolution of a Databricks-based Lakehouse architecture. You enjoy translating complex product and user behavior into well-structured, reliable datasets that power analytics, experimentation, and decision-making.

What You Will Do:

Partner with product analytics stakeholders to translate business-defined KPIs and data requirements into scalable, production-grade datasets
Own the design, build, and operation of scalable data pipelines end-to-end (ingestion → transformation → serving)
Own and evolve key components of our Lakehouse architecture, improving structure, scalability, and consistency across datasets
Build and maintain production-grade, well-modeled datasets (Gold layer) that power analytics and AI use cases
Develop and enforce reusable data engineering patterns, frameworks, and standards that reduce duplication and improve scalability
Own data quality and reliability for production datasets, including validation, monitoring, SLAs, and incident resolution
Productionize and scale prototype datasets and logic developed by analytics partners into reliable, maintainable data pipelines
Build governed, purpose-built datasets to support AI/ML use cases while enforcing controlled and secure data access patterns
Provide technical guidance to less experienced staff
Participate in special projects and perform other duties as assigned

What You Bring With You:

Strong experience building and operating data pipelines using SQL and Python in a modern cloud environment
Deep expertise in SQL, including query optimization and performance tuning at scale
Experience with Databricks, Spark, or similar distributed data processing frameworks
Strong understanding of modern data architecture patterns, including Lakehouse architecture, ELT, and layered data models (bronze/silver/gold)
Proven experience designing data models for analytics, including dimensional or domain-oriented approaches
Experience driving database and data engineering best practices, including schema design, migrations, and performance optimization
Ability to take ownership of technical solutions and drive implementation in environments with limited structure or support
Experience building reusable data frameworks, enforcing standards, and improving engineering leverage across datasets and pipelines
Experience owning production data systems, including monitoring, debugging, and resolving data pipeline failures
Experience working closely with business stakeholders or analysts to translate ambiguous requirements into scalable data solutions
Experience evaluating and adopting new data technologies and tools to improve scalability, reliability, and productivity
Strong record of project execution and completion, with experience in agile development practices
Experience developing source-controlled pipelines in a CI/CD environment
Experience working with product analytics data (event tracking, user behavior, experimentation) is a plus
Strong communication skills and ability to collaborate effectively across technical and non-technical teams

Education and Experience:

Bachelor’s degree in Computer Science or a related discipline, or equivalent practical experience; Master’s degree preferred
8+ years of experience in data engineering or related fields, with a strong track record of owning and operating production data systems
Hands-on experience building and maintaining scalable data pipelines and datasets in cloud-based environments (e.g., Databricks, AWS, GCP)
Proven ability to design and evolve data models and data pipelines that support analytics at scale, including performance optimization and reliability improvements
Experience taking data solutions from concept to production, including monitoring, debugging, and ongoing maintenance
Demonstrated ability to operate independently, make technical decisions, and improve data platform standards in environments with limited structure