Senior Data Engineer

Location: United States
Rate, USD: Not specified
Work schedule: Full Time
Language skills: English
Available for Hire: Yes

About me

I am a Senior Data Engineer with 12 years of experience designing scalable data platforms, machine learning pipelines, and advanced analytics solutions across diverse industries including healthcare, e-commerce, advertising, and gaming. Throughout my career, I have led the development of complex systems such as an HR analytics platform that integrates predictive workforce modeling and executive dashboards, helping enterprise clients improve retention and decision-making.

My expertise spans data engineering and ETL/ELT processes, machine learning and AI, cloud and big data platforms, programming, analytics, and visualization. I have hands-on experience with tools like Apache Airflow, dbt, Apache Spark, Kafka, TensorFlow, PyTorch, Google Cloud Platform, AWS, Azure, and many others. I am skilled in building data pipelines, data modeling, data warehousing, and implementing data governance frameworks.

I have a strong background in machine learning, including predictive modeling, recommender systems, anomaly detection, and natural language processing. I have engineered ML pipelines for large-scale applications such as Google Ads optimization and have contributed to productionizing deep learning models that improve click-through rates and ad relevance.

My cloud expertise includes working extensively with Google Cloud Platform, AWS, and Azure, leveraging services like BigQuery, Dataflow, Redshift, Synapse, and Databricks. I am proficient in containerized deployments and Kubernetes, enabling scalable and reliable data infrastructure.

I am also experienced in programming with Python, SQL, R, Scala, and shell scripting, and I have developed CI/CD pipelines and API integrations to support data workflows. I am adept at creating executive dashboards and visualizations using Tableau, Power BI, and Looker, and I apply statistical analysis and experiment design to drive business intelligence strategies.

In addition to my technical skills, I bring strong soft skills, including cross-functional collaboration, client-facing consulting, technical mentorship, documentation, and Agile/Scrum practices. I am passionate about data-driven decision-making and strive to deliver analytics solutions that support business goals.

I am eager to bring my extensive experience and leadership in data engineering and machine learning to new challenges where I can contribute to innovative projects and help organizations leverage data for strategic advantage.

Professional area

Developer & Engineer, Management & Operations, Web Analyst

Skills

AWS, Azure, Git, Kafka, Kubernetes, MySQL, Oracle, PostgreSQL, PySpark - Apache, Python, Redshift, SQL, Tableau Software, TensorFlow

Education

2009 – 2013 Bachelor of Computer Science @ The University of Texas at Austin

Experience

Dec 2021 – Jun 2025 Senior Data Engineer @ Dutech:
- Designed ETL/ELT pipelines with Airflow, dbt, and AWS Glue to unify HRIS, payroll, and engagement data, building a secure analytics platform for enterprise clients.
- Partnered with HR and data science teams to support predictive modeling for attrition and workforce planning, enabling leaders to reduce turnover and optimize staffing strategies.
- Delivered executive dashboards in Tableau and Power BI using AWS Redshift/Snowflake datasets, improving visibility into hiring funnels, diversity metrics, and workforce performance.
- Established data governance and access controls through AWS IAM and compliance frameworks, ensuring secure handling of sensitive workforce data.
- Built real-time streaming pipelines with Kafka and Spark on GCP (Pub/Sub, Dataflow, Dataproc, BigQuery), using Scala/PySpark for high-volume processing, powering instant analytics and personalization engines.
- Developed fraud and anomaly detection pipelines leveraging BigQuery and streaming frameworks, improving platform integrity and preventing financial losses.
- Designed segmentation workflows and personalization logic that improved targeting of promotions, increasing customer lifetime value by 22%.
- Led migration of legacy iGaming systems into Google BigQuery and Databricks, reducing infrastructure costs by 30% while scaling to high-volume workloads.
- Delivered self-serve analytics tools that reduced ad-hoc reporting requests by 40% and accelerated decision-making cycles across HR and gaming clients.
Mar 2020 – Oct 2021 Senior ML & Data Engineer @ Google:
- Engineered ML pipelines for Google Ads optimization, leveraging TensorFlow Extended (TFX), Kubeflow, and Apache Beam on Google Cloud to improve ad targeting and bidding.
- Built a real-time feature store capturing clickstream, keyword, and campaign data, cutting feature duplication and reducing model delivery time by 40%.
- Designed low-latency Dataflow and BigQuery pipelines to process billions of impressions daily, enabling timely signals for ad-serving models.
- Collaborated on early LLM-based ad relevance and query understanding prototypes using Transformer architectures and Vertex AI, enhancing semantic matching for search and advertising content.
- Partnered with research teams to productionize deep learning models that improved click-through rate (CTR) predictions and ad relevance.
- Implemented monitoring and drift detection systems that safeguarded predictive accuracy across diverse ad markets and geographies.
- Optimized distributed TPU training workflows, cutting training time by 30% and reducing costs by $1M annually.
- Developed A/B testing frameworks for ad-ranking models, enabling safe rollouts and rapid rollback of underperforming models.
- Migrated critical ML workloads to Vertex AI, accelerating deployment velocity and improving standardization across Ads teams.
- Authored design docs and best practices that unified ML engineering approaches across multiple Google Ads teams.
Dec 2018 – Dec 2019 Senior Data Scientist @ BigCommerce:
- Built and deployed recommendation algorithms for upsell and cross-sell strategies, increasing average order value by 10%.
- Designed churn prediction models using gradient boosting and logistic regression, driving retention campaigns that saved millions in recurring revenue.
- Developed fraud detection models with anomaly detection, reducing false positives in flagged transactions by 18%.
- Partnered with product teams on A/B and multivariate experiments, applying statistical methods in R to validate outcomes.
- Created customer segmentation models with clustering methods, enabling personalized marketing that boosted engagement rates.
- Optimized model training pipelines in Python and Spark on AWS EMR and S3, with components in Scala to improve batch performance and reduce experimentation cycles by 40%.
- Presented executive-level dashboards and data stories linking predictive insights to business outcomes, shaping roadmap decisions.
Jan 2015 – Oct 2018 Data Analyst @ Nomi Health:
- Designed and maintained SQL reporting pipelines for claims, billing, and patient records, improving data quality and operational reporting accuracy.
- Built Tableau/Power BI dashboards to monitor provider performance, patient throughput, and healthcare KPIs, with pipelines supported by AWS Redshift.
- Partnered with finance teams to detect anomalies in claims data, identifying $2M+ in cost savings and strengthening fraud prevention.
- Automated recurring reporting workflows in Python (pandas, NumPy), cutting manual preparation by 60% and enabling faster analysis cycles.
- Developed early predictive models in Python/R to forecast patient demand and optimize clinic staffing.
- Applied classification techniques on patient outcomes to identify high-risk populations and improve proactive care strategies.
- Used basic NLP methods to extract insights from clinical notes, informing adherence and treatment effectiveness studies.
- Supported migration to AWS Redshift, improving query performance and enabling large-scale healthcare analytics adoption.
Nov 2013 – Oct 2014 Junior Backend Developer @ SailPoint Technologies:
- Contributed to designing and maintaining relational schemas for identity and access management systems, ensuring normalized structures and consistent business rules.
- Wrote and debugged SQL queries, stored procedures, and triggers that supported reporting and authentication services.
- Optimized indexing and query execution plans, improving query performance by ~25% in production workloads.
- Assisted in database migration from Oracle to PostgreSQL, achieving zero data loss and <2 hours downtime.
- Partnered with senior developers to integrate backend modules with the database layer, reinforcing data consistency and integrity.
- Reduced recurring reporting errors by 15% by introducing data validation checks and process documentation.
