JOB DESCRIPTION
We are seeking an experienced and visionary Senior Full-Stack Data Engineer to lead the architecture, development, and optimization of a next-generation data platform. This is a critical role for an individual with over 15 years of deep data engineering expertise, capable of driving technical direction, mentoring team members, and delivering high-impact solutions in a fast-paced project environment.
JOB RESPONSIBILITIES
Key Responsibilities:
- Platform Strategy & Leadership
- Technical Direction: Define and champion the architectural roadmap and best practices for our end-to-end data pipelines, ensuring scalability, reliability, and security across the platform.
- Team Mentorship & Project Velocity: Act as a primary technical mentor, guiding a team of engineers, conducting code reviews, and aggressively driving the project timeline to ensure rapid delivery of data products.
- Stakeholder Collaboration: Partner with Data Scientists, Analysts, and business stakeholders to translate complex requirements into robust, production-ready data solutions.
- Collaboration with Data Scientists and ML Engineers: Ensure data accessibility, support model development, and uphold data quality assurance.
- Data Pipeline Development & Management
- Ingestion & Transformation: Design, build, and optimize high-volume data ingestion and transformation jobs using tools like dbt Core, AWS Glue, or Flexter, ensuring data quality and integrity.
- Workflow Orchestration: Develop and maintain sophisticated data pipelines using orchestrators such as Dagster or Talend, focusing on modularity and reusability.
- Streaming & Real-time Integration: Implement and manage real-time data flows utilizing Confluent platforms or native AWS streaming services (e.g., Kinesis) for immediate data availability.
- Data Security and Privacy: Implement data anonymization and ensure compliance with applicable regulations.
- Be well versed in DataOps and DevOps fundamentals.
- Data Ecosystem Management & Monitoring
- Open Table Formats & Management: Implement and maintain the Iceberg open table format, utilizing tools like Upsolver (Talend Open Lakehouse) for efficient schema evolution and data management.
- Compute Engine Optimization: Optimize query performance and cost efficiency across our primary compute engines: Snowflake, Amazon Redshift, and AWS Athena.
- Observability & Monitoring: Integrate comprehensive monitoring and observability into all pipelines using Splunk to ensure high availability, rapidly identify bottlenecks, and troubleshoot production issues.
JOB QUALIFICATIONS
- 15+ years of hands-on, progressive experience in Data Engineering, Data Architecture, or a closely related full-stack data role
- Deep conceptual understanding of core data engineering principles, including data modeling (e.g., Dimensional, Data Vault), ETL/ELT patterns, and metadata management
- Proven track record of building and managing petabyte-scale data infrastructure in a cloud-native environment
- Insurance industry experience preferred but not mandatory
- Tools:
- Cloud Environment: AWS (S3, IAM, VPC, etc.)
- Experience with Talend, dbt Core, Iceberg, AWS Glue Catalog, Snowflake, Redshift, Athena, Splunk, AWS streaming services, Git
- Strong SQL, PySpark, and Python skills