I am an AI/ML systems engineer with over 10 years of experience designing and shipping production-grade machine learning infrastructure, primarily across financial services and edge-AI product environments. Throughout my career, I have successfully delivered AI-powered products to market, including those on the Microsoft App Store, while strictly adhering to regulatory, security, and auditability standards. My expertise lies in low-latency large language model (LLM) inference, retrieval-augmented generation (RAG) architectures, and real-time data pipelines.
I combine quantitative finance knowledge, including SVaR and risk modeling, with hands-on systems engineering skills in C++, accelerated computing, and distributed streaming. I connect complex technical execution to customer outcomes, ensuring that engineering solutions align with business goals. I have a proven track record of leading global engineering teams and delivering high-throughput platforms in both Wall Street and startup environments.
In my recent role at Konnek LLC, I architected and led the development of a local-first AI platform enabling sub-second LLM inference on consumer hardware, eliminating reliance on external cloud services and enhancing privacy. I designed autonomous agentic AI ecosystems and built modular, cross-platform AI frameworks in C++ and Flutter.
Previously, as Vice President and Quant Developer at Morgan Stanley, I built and scaled real-time financial analytics platforms that improved system latency and throughput by 20–30%. I led distributed teams and drove architecture improvements for critical trading and risk decision systems.
My background also includes modernizing legacy financial transaction platforms into cloud-ready microservices, designing hardware-software co-optimization systems for heterogeneous compute environments, and developing regulatory risk models supporting compliance reporting. I hold a Master of Science in Mathematics and a Bachelor of Science in Actuarial Science, complemented by professional certifications in energy risk and risk management.
Built and commercialized a local-first AI platform enabling sub-second LLM inference on consumer hardware, eliminating dependence on external cloud AI services and enabling privacy-preserving deployment. Architected and led end-to-end development of AI4All, a multi-tier production AI platform supporting local LLM inference, speech processing, semantic retrieval, and agent-based workflows. Delivered a privacy-preserving, on-device AI platform compliant with strict security standards. Shipped the AI4All family of products to the Microsoft App Store, allowing one-click installation. Architected an autonomous agentic AI ecosystem with specialized agents for multi-step workflows. Designed high-performance RAG and semantic search systems with low-latency query execution. Engineered optimized inference pipelines delivering sub-second response times on commodity hardware. Built a modular, cross-platform AI framework in C++ and Flutter, enabling deterministic, reproducible inference and extensible system composition.
Built and scaled a real-time financial analytics platform processing high-volume trading data, improving system latency and throughput by 20–30%. Designed and delivered a real-time streaming analytics platform using Kafka and Spark. Engineered low-latency data pipelines critical to live trading and risk decision systems. Led distributed engineering teams across New York, London, and India. Drove architecture improvements increasing throughput and system reliability under peak market hours.
Led modernization of legacy financial transaction platforms into cloud-ready microservices architecture, improving scalability and reducing system fragility. Architected migration of monolithic platforms to Spring Boot microservices. Designed and implemented high-throughput API integrations and backend services. Drove system stability improvements through refactoring and enhanced service orchestration. Mentored junior engineers and established coding standards.
Designed hardware-software co-optimization systems for heterogeneous compute environments. Developed performance-critical solutions combining hardware design (FPGA/Verilog) and low-level software (C++). Built custom compute architectures applicable to GPU/accelerator ML inference workloads.
Developed and maintained regulatory risk models (SVaR) supporting compliance reporting. Built and maintained Stressed Value-at-Risk models and automated reporting pipelines. Coordinated cross-functional delivery across risk, IT, and trading teams. Provided analytical support for model validation, governance, and regulatory review. Collaborated with global teams to align infrastructure with risk management priorities.