I am a senior data engineer with 7+ years of experience building scalable data platforms, cloud-native pipelines, and analytics solutions across healthcare, pharmaceutical, supply chain, and automotive domains.
I specialize in designing enterprise data lakes, lakehouse architectures, ETL/ELT frameworks, and dimensional data models using Python, PySpark, SQL, Azure Databricks, Snowflake, Azure Data Factory, and AWS data services.
I have worked extensively with large-scale structured and semi-structured datasets, focusing on data quality, performance optimization, governance, and reliable delivery of trusted analytics data.
In my recent roles, I have supported healthcare and supply chain operations by developing high-volume pipelines, curated datasets, and reporting layers that power business intelligence, operational analytics, and regulatory reporting.
I also have experience enabling machine learning workflows by building feature engineering pipelines, preparing model-ready datasets, and supporting experiment tracking with MLflow.
I work closely with cross-functional teams in Agile environments and take pride in improving pipeline reliability, reducing latency, and building secure, scalable, and maintainable data solutions.
My background includes both modern cloud data engineering and earlier SQL/ETL development, giving me a strong foundation across the full data lifecycle from ingestion and transformation to reporting and production support.
Columbus, Ohio
Guntur, India
Developed and maintained enterprise data pipelines for pharmacy operations, healthcare supply chain analytics, inventory management, and business reporting. Designed scalable lakehouse solutions using Azure Databricks, ADLS Gen2, Delta Lake, and ADF. Built ingestion and transformation frameworks with Python, PySpark, SQL, and ADF. Implemented ETL/ELT pipelines, medallion architecture, dimensional models, dbt frameworks, data quality controls, feature engineering pipelines, MLflow support, governance controls, and CI/CD automation.
Built scalable healthcare data pipelines using Python, SQL, PySpark, ADF, and Azure Databricks. Processed millions of healthcare records daily for claims analytics, member analytics, population health, and regulatory reporting. Developed reusable ETL/ELT pipelines, data ingestion frameworks, data quality controls, dimensional models, and reporting integrations with Power BI. Worked in HIPAA-compliant environments with RBAC, managed identities, and encryption.
Developed data processing pipelines using Python, PySpark, and SQL on AWS EMR. Built batch ingestion and transformation frameworks, Amazon S3 data lakes, Redshift models, Hive tables, and Power BI dashboards. Implemented ETL pipelines, data quality controls, and automation scripts, and contributed to production support and CI/CD activities.
Developed and optimized T-SQL objects, SSIS-based ETL solutions, SSRS reports, and SQL Server Agent jobs. Built data warehouse workflows, fact and dimension tables, SCD implementations, validation and reconciliation processes, and technical documentation. Supported testing, troubleshooting, deployments, and production maintenance.