I am a Tech Lead and AI R&D Specialist with more than 5 years of experience in software engineering and machine learning. My work focuses on designing, benchmarking, and deploying production-grade Generative AI systems that create measurable business impact.
I specialize in low-latency multi-agent voice systems, Speech-to-Speech architectures, Retrieval-Augmented Generation (RAG), and high-throughput model hosting. I have hands-on experience with leading tools and frameworks such as vLLM, LangChain, PyTorch, TensorFlow, OpenWebUI, and faster-whisper.
In my recent roles at Claro Brasil, I led the development of real-time AI systems using GPT Real-Time APIs and multi-cloud deployments across AWS Bedrock, Azure AI Foundry, and GCP Vertex AI. I also worked on state management, interruption handling, multi-agent orchestration, and multi-GPU inference pipelines.
As an AI R&D Specialist, I designed automated benchmarking pipelines for frontier open-weight and proprietary models, comparing latency, throughput, safety, and cost. I also built deterministic agentic workflows and enterprise RAG solutions that improved retrieval quality and reduced hallucinations.
Earlier in my career, I worked as a Data Scientist and Software Engineer at AXONDATA, where I developed end-to-end data pipelines and machine learning models for text classification, NER, sentiment analysis, and scalable data processing. This gave me a strong foundation in applied AI and production engineering.
I also have a strong academic background, including a Master’s degree in Electrical and Computer Engineering and research in NLP and topic detection. My thesis work was cited in an IBM US patent, reflecting the practical relevance of my research contributions.
Exchange Program
Spearheaded the development of a production-ready, stateful multi-agent system powered by GPT Real-Time streaming APIs. Orchestrated production GenAI pipelines using Amazon Bedrock and Azure AI Foundry, developed state-management mechanisms for multi-agent handoffs, and guided the development of Speech-to-Speech RAG systems with multi-GPU orchestration. Mentored engineers and promoted testing and continuous deployment best practices.
Designed automated benchmarking pipelines using vLLM for frontier open-weight models and evaluated proprietary models across Azure AI Foundry and AWS Bedrock. Built deterministic agentic workflows with function calling, developed enterprise chatbot and retrieval systems with AnythingLLM and OpenWebUI, and helped define the generative AI technology roadmap.
Developed end-to-end data pipelines and machine learning models in Python using PyTorch, TensorFlow, and Scikit-learn. Built solutions for text classification, named-entity recognition, and sentiment analysis, and orchestrated scalable data applications with Apache Spark and MongoDB.