I am a senior AI/ML engineer with more than 10 years of experience building scalable systems and AI-driven products. My background centers on Java, Python, distributed systems, and delivering production-grade machine learning solutions that support high-volume, mission-critical environments.
I have designed and shipped LLM-powered chatbots, real-time data pipelines, and microservices that handle millions of daily transactions. My work has focused on combining backend engineering, machine learning, and MLOps to create reliable systems with strong performance, observability, and scalability.
In my recent roles, I architected production chatbot backends, built end-to-end RAG pipelines, and developed fraud detection platforms using Kafka, Spark, PyTorch, Kubernetes, and cloud services. I have also worked extensively on model optimization, inference acceleration, and automated deployment workflows.
I enjoy solving complex technical challenges across the full stack of AI systems, from data ingestion and feature generation to model deployment and monitoring. I have experience with Spring Boot, FastAPI, LangChain, Hugging Face, vector databases, and modern cloud-native tooling.
I also bring strong leadership and collaboration skills. I have worked closely with product, data engineering, and algorithm teams to translate business needs into technical solutions, and I have mentored junior engineers on backend architecture, code quality, and MLOps best practices.
My career has included roles at ArisGlobal, TaskUs, and Deloitte, where I consistently delivered improvements in latency, throughput, reliability, and operational efficiency. I am based in Zagreb, Croatia, and I speak Croatian and English.
Architected and deployed a production LLM-powered chatbot backend using Java Spring Boot and a Python integration layer, serving millions of daily user interactions with sub-500ms latency. Built an end-to-end RAG pipeline for intelligent search and Q&A using LangChain, Hugging Face embeddings, and vector databases. Designed a high-throughput real-time fraud detection platform leveraging Kafka streams, Apache Spark, and PyTorch models. Architected MLOps infrastructure using Kubeflow, MLflow, Airflow, Docker, and Kubernetes. Optimized inference performance using TensorRT, ONNX conversion, quantization, and pruning. Built distributed data engineering pipelines with Apache Spark and Kafka. Delivered an LLMOps platform with standardized deployment templates and automated evaluation dashboards. Engineered microservices using Spring Boot and FastAPI with event-driven architecture and Redis caching. Implemented an observability stack with Prometheus, Grafana, Elasticsearch, and drift detection. Led cross-functional collaboration and mentored junior engineers.
Developed LLM-powered chatbot backend services using LangChain, Hugging Face models, and vector search integration. Built a hybrid recommendation engine combining PyTorch embeddings, ML ranking models, and content-based filtering. Designed real-time data pipelines using Kafka and Apache Spark. Implemented LSTM-based anomaly detection models and deployed them with MLflow, Docker, and Kubernetes. Fine-tuned domain-specific LLMs using LoRA/QLoRA and optimized inference latency through ONNX and TensorRT. Deployed ML services to AWS SageMaker and GCP Vertex AI and implemented drift detection dashboards. Collaborated with backend teams to integrate ML inference outputs into microservices.
Developed and optimized machine learning models using Python, scikit-learn, PyTorch, and TensorFlow. Built data preprocessing and feature engineering pipelines with Pandas, NumPy, and Spark. Created inference services using FastAPI and Docker. Performed model evaluation, hyperparameter tuning, and performance tracking using MLflow. Collaborated with data engineers, backend developers, and product teams to design end-to-end ML solutions.