I am an ML/DevOps engineer with over 5 years of experience specializing in development and infrastructure. My expertise lies in Kubernetes, PyTorch, GitOps, Docker, Ansible, Terraform, and GitLab CI/CD. I focus on building robust AI/ML services and developer-friendly infrastructure to support scalable and efficient workflows.
Throughout my career, I have been responsible for automated deployment, product localization for internal use, and creating clear user documentation. I have a strong background in managing complex infrastructure and deploying AI/ML applications on cloud platforms such as Yandex Cloud.
In my recent role as Lead ML/DevOps Engineer at Magnit Tech, I deployed monitoring systems for high-performance LLM inference servers and optimized Kubernetes clusters using Helm and observability tools like Prometheus, Grafana, and Loki. I also built automated infrastructure workflows using Infrastructure as Code and managed secure authentication and configuration encryption.
Previously, I worked at Eureka where I developed NVIDIA-based solutions for production workloads, improved neural network training efficiency, and automated developer environment provisioning. I also authored comprehensive documentation to support ML/DevOps workflows.
Earlier in my career, I served as a Python Developer / DevOps Engineer at KuPi LLC, where I designed and implemented infrastructure on Yandex Cloud, optimized frontend-backend communication, and developed interactive web applications with secure admin access.
I am fluent in Russian and have intermediate proficiency in English. I am passionate about leveraging technology to create scalable, efficient, and secure AI/ML infrastructure solutions.
Magnit Tech is the IT division of Magnit, responsible for maintaining the complex infrastructure that supports 30,000+ stores across Russia and abroad.
Tech Stack: PyTorch, TensorFlow, Docker, Terraform, GitLab CI/CD, Kubernetes, Helm, Python, Bash, Linux, Nginx, Prometheus, Grafana, Loki, Yandex Cloud
Responsibilities:
• Deployed monitoring systems for high-performance LLM inference servers using vLLM and Triton Inference Server.
• Optimized Kubernetes clusters using Helm and observability tools (Prometheus, Grafana, Loki) for real-time metrics, logs, and alerting.
• Built automated infrastructure workflows using Infrastructure as Code (IaC), Makefiles, and CI/CD pipelines.
• Managed AI/ML application deployments in Kubernetes on Yandex Cloud.
• Implemented secure authentication via LDAP and encrypted sensitive configurations using SOPS.
• Set up and maintained a ClearML lab for AI/ML experiment tracking, model versioning, and workflow automation.
Key Achievements:
• Deployed comprehensive monitoring alongside cluster setup.
• Fully rewrote the RAG Helm chart to meet internal standards.
• Developed ClearML sessions for virtual machines used in ML workflows.
Eureka is a company focused on research, development, implementation, and support of information systems.
Tech Stack: PyTorch, TensorFlow, Docker, Terraform, GitLab CI/CD, Kubernetes, Helm, Python, Bash, Linux, Nginx, NVIDIA, Ansible, MLPerf, OpenNebula
Responsibilities:
• Selected, developed, and deployed NVIDIA-based solutions (e.g., OpenNebula) for production workloads.
• Improved neural network training efficiency using MIG, Docker Compose, and multithreading.
• Enhanced MLPerf deep learning benchmarks; adapted HPL/NVIDIA test suites via Bash/Python scripts; localized tooling.
• Set up local GitLab and PyPI mirrors to reduce bandwidth costs for large model downloads.
• Automated developer environment provisioning via GitLab CI/CD using Ansible, Terraform, and Python templates.
• Introduced Deep Learning benchmark tests to help developers optimize training configurations.
• Authored comprehensive documentation for ML/DevOps workflows.
KuPi LLC is a company specializing in security and construction automation.
Tech Stack: AWS Cloud, Docker, Terraform, Python, Bash, Linux, Nginx, Flask, BeautifulSoup4, HTML, CSS, SocketIO
Project: Interactive map website for restaurants in Saint Petersburg (hosted on Flage).
Responsibilities:
• Designed and implemented IaC-based infrastructure on Yandex Cloud using Terraform and Ansible.
• Optimized frontend-backend communication for better performance and data consistency.
• Managed domain registration and hosting setup for successful project launches.
• Built a GitLab CI/CD pipeline for reliable, automated deployments.
• Developed a data parser with BeautifulSoup4 to extract and store data in CSV format.
• Implemented Flask-Admin authentication for secure admin access.
• Added interactive map search functionality to enhance user experience.
Key Achievements:
• Enabled third-party login via GitHub and Facebook.
• Delivered a fully functional interactive map for end users.
• Provided content management capabilities for site administrators.