I am a Senior AI/ML Engineer with over 5 years of experience delivering production AI systems, including more than 2 years leading cross-functional teams of up to 15 members in client-facing environments. I specialize in building and scaling large language model (LLM) and retrieval-augmented generation (RAG) platforms, and have operated systems handling 2,000 requests per minute and over 20,000 daily messages. My work has reduced onboarding costs by 96%, demonstrating my ability to deliver cost-efficient AI solutions.
My expertise includes developing high-performance NLP, OCR, and computer vision pipelines, achieving less than 50 ms OCR latency and 99.97% accuracy on biometric workloads involving 49 million images. I excel at translating complex business requirements into deployable AI solutions with measurable impact. I am fluent in English and open to relocating to Belgium to further my career.
Technically, I am proficient in Python and SQL, and skilled in machine learning frameworks such as PyTorch, TensorFlow, and Scikit-learn. I have extensive experience with transformers, LLM inference, RAG architectures, named entity recognition, OCR, and computer vision. I am also adept at data platforms including PostgreSQL, MongoDB, Elasticsearch, ChromaDB, and Milvus.
In terms of infrastructure and MLOps, I have architected Infrastructure as Code workflows using GitLab CI/CD and Ansible, automated provisioning of GPU clusters, and migrated monolithic systems to containerized microservices on AWS EC2 with SQS task queuing. I have implemented observability solutions using Prometheus, Grafana, and CloudWatch to monitor model drift and latency.
I have led the development of GenAI pipelines that process multimodal inputs to automate issue assignment, reducing diagnosis time from 8 hours to 20 minutes. Additionally, I have optimized open-source LLM deployments on NVIDIA RTX 5090 GPUs and engineered graph-based retrieval pipelines that handle knowledge bases of up to 10 million characters with zero hallucination. My work on biometric AI systems includes fine-tuning face recognition models on large datasets and optimizing deep learning models for real-time edge deployment.
My freelance experience includes backend engineering for drone photogrammetry pipelines and developing automotive in-cabin AI systems with gesture and driver activity recognition. I am passionate about leveraging AI to solve complex real-world problems and continuously improving engineering productivity and system scalability.
GPA: 18.04 / 20 — Tehran, Iran
GPA: 15.8 / 20 — Mashhad, Iran
Led MLOps and infrastructure automation, architecting Infrastructure as Code workflows with GitLab CI/CD and Ansible to automate GPU cluster provisioning and reducing setup time from days to hours. Migrated a monolith to containerized microservices on AWS EC2 with SQS task queuing, ensuring 99.9% uptime. Established monitoring with Prometheus, Grafana, and CloudWatch for model drift and latency. Built a GenAI pipeline that processes multimodal inputs to auto-assign Jira issues, reducing diagnosis time from 8 hours to 20 minutes. Scaled the ParsChat support ecosystem to handle 20,000+ daily messages for e-commerce and customer support automation. Optimized deployment of Qwen models using vLLM on NVIDIA RTX 5090 infrastructure, achieving a throughput of 60 messages per minute. Re-engineered the retrieval pipeline into a graph-based architecture with intent-based routing to process up to 10 million characters with zero hallucination. Reduced chatbot delivery time from 1 week to 5 minutes and cut setup costs from $2.50 to $0.10 per user. Fine-tuned face recognition models on 49M images using NVIDIA A6000 GPUs, achieving 99.97% accuracy. Optimized deep learning models for a CPU-based edge architecture, achieving real-time processing. Deployed services using FastAPI, gRPC, Celery, and Redis for scalable task management.
As Backend Engineer for DIPAL (Sep 2025 – Mar 2026), engineered scalable AWS backend services automating structure-from-motion and multi-view-stereo pipelines for a photogrammetry toolkit, with REST API integration for GIS workflows. As ML/MLOps Engineer for SensiGesture (Feb 2022 – Jul 2022), curated a proprietary in-cabin dataset of 350,000+ annotated RGB frames and trained MobileNetV3 models recognizing 22 gestures and 20 driver activity classes for real-time inference. Built an end-to-end pipeline for data versioning, automated evaluation, and on-device deployment on automotive-grade edge hardware.