I am a Data Scientist with expertise in machine learning, statistical analysis, and scalable analytics for large-scale data. Currently, I serve as the Lead Data Scientist at NX Prenatal, where I built a first-of-its-kind disease prediction system and helped raise $150K in external funding. Previously, I was a Senior Data Scientist and Researcher at the University of South Florida, where I contributed to scientific publications, developed machine learning and data analytics pipelines used at 23 sites internationally, and played a key role in raising $6M in external funding. As a green card holder, I am open to relocation for new opportunities. My passion lies in leveraging data to drive impactful decisions and innovations in healthcare and beyond.
Developed a scalable statistical pipeline to accelerate R&D using large-scale data, securing $150K in external funding from Johnson & Johnson. Developed and deployed a first-of-its-kind ML model for disease risk forecasting to support clinical decision-making, achieving 80% accuracy. Designed ETL pipelines and engineered a SQL database to streamline data access for research and analytics teams, achieving ~$50K in cost savings. Led teams and partnered with product managers to translate business needs into machine learning solutions (XGBoost), boosting prediction accuracy by 13% through iterative development and A/B testing. Presented findings, data insights, and KPIs weekly to technical and business teams, framing business results in measurable terms.
Designed and implemented end-to-end deep learning and statistical pipelines for time-series forecasting, improving model accuracy by 25%. Developed a data pipeline and machine learning model using large-scale data for disease prediction and risk forecasting with 92% accuracy. Collaborated on designing large-scale data analysis workflows in R for 990K+ data samples to identify risk factors and support decision-making. Collaborated with global, cross-functional teams to analyze 5M+ healthcare records for biomarker discovery. Delivered data insights via dashboards, visualizations, and presentations to stakeholders, driving a 30% increase in data-informed decisions.
Developed an efficient and scalable data analysis pipeline (R and Spark) for large-scale datasets of ~250 GB, reducing processing time by ~30%. Developed statistical pipelines, performed code reviews, and managed versioning for large-scale data analytics; deployed across 23 locations internationally. Mentored junior data scientists in machine learning, model development, and best practices using scikit-learn, boosting model accuracy by 15%. Led and contributed to project planning, data acquisition, data preprocessing, quality control, and analysis, helping secure $6M in external funding. Designed, tested, and documented interactive RShiny software, improving data analysis and operational decision-making productivity 7-fold. Managed complex, cross-functional projects using Excel, PowerPoint, and JIRA, ensuring timely delivery and alignment with business goals.