Sandadi Vithin Reddy
Building intelligent systems at the intersection of large language models, retrieval-augmented generation, and production-grade ML infrastructure.
Building the Future with AI
I'm Sandadi Vithin Reddy, an AI/ML Engineer based in Dallas, TX, passionate about transforming raw research into production-ready intelligent systems.
At Accenture, I architected and deployed LLM-powered applications that reduced manual processing time by over 60%, integrating retrieval-augmented generation pipelines for enterprise knowledge management.
My focus areas include LLM fine-tuning, vector database design, agentic workflows, and ML system architecture. I thrive in environments where research meets scale.
LLM Expertise
Deep hands-on experience with GPT-4, Claude, LLaMA, and fine-tuning workflows.
RAG Systems
End-to-end RAG pipelines with vector stores, hybrid search, and re-ranking.
Production ML
Shipping models to production with MLflow, FastAPI, Docker, and cloud platforms.
Collaboration
Cross-functional work with data engineers, product teams, and enterprise clients.
Skills & Expertise
AI / Machine Learning
RAG & Vector Systems
Engineering & Infra
Data & Databases
Also proficient in
Work Experience
AI/ML Engineer
- Architected and deployed an enterprise RAG pipeline for internal knowledge management, reducing document retrieval time by 65% and increasing answer accuracy to 91%.
- Fine-tuned LLaMA-2 and Mistral models using QLoRA on domain-specific datasets, achieving a 40% improvement over baseline on client benchmarks.
- Built multi-agent orchestration workflows using LangChain and AutoGen for automated report generation, saving 200+ analyst hours per month.
- Designed a real-time ML inference API using FastAPI + Docker that handles 5k+ requests/min with 99.9% uptime on AWS ECS.
- Led end-to-end MLOps modernization, implementing MLflow experiment tracking and CI/CD pipelines that cut model deployment time from 3 days to 4 hours.
- Mentored junior engineers and delivered internal LLM workshops to 50+ consultants, accelerating AI adoption across practice areas.
ML Research Assistant
- Researched transformer architectures for NLP tasks including text classification, summarization, and question answering.
- Implemented and evaluated BERT, RoBERTa, and T5 variants on benchmark datasets, publishing findings to the department repository.
- Built data collection and preprocessing pipelines for large-scale text corpora using Python and Spark.
- Collaborated with professors on a grant proposal for an NSF-funded NLP research initiative.
Featured Projects
Enterprise RAG System
Production-Grade Retrieval-Augmented Generation
A full-stack RAG pipeline designed for enterprise knowledge bases. Ingests PDFs, Word docs, and web pages; chunks and embeds with custom strategies; retrieves via hybrid search (BM25 + dense vectors) and re-ranks with cross-encoders before generating grounded answers.
- Hybrid retrieval: BM25 sparse + dense vector search with late fusion
- Context-aware chunking with metadata-aware overlap strategy
- LLM re-ranking pipeline using cross-encoder models for precision boost
- Streaming response API with source citation and confidence scores
- Evaluation harness using RAGAS (faithfulness, relevancy, correctness)
- 91% answer accuracy on internal enterprise benchmark dataset
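The late-fusion step in the hybrid retrieval feature can be illustrated with reciprocal rank fusion (RRF). This is only a sketch under assumptions: the project description does not say which fusion method is used, and the function name, the document IDs, and the constant `k=60` are illustrative choices, not details of the actual system.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a BM25 retriever and a dense-vector retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, and no score normalization between BM25 and cosine similarity is needed, which is one reason rank-based fusion is a common starting point before a cross-encoder re-ranks the fused candidates.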
LLM Fine-Tuning Pipeline
QLoRA Fine-tuning Framework
Modular fine-tuning framework for instruction-following LLMs using QLoRA on consumer and cloud GPUs. Includes data prep, training, evaluation, and GGUF export for local deployment.
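To show why QLoRA fits on consumer GPUs, the arithmetic below compares the size of a frozen linear layer with the size of its low-rank adapter. It is a back-of-the-envelope sketch, not code from the framework: the helper name, the layer dimensions, and the rank are assumed for illustration. QLoRA additionally keeps the frozen base weights in 4-bit precision, which this count does not model.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (full, adapter) parameter counts for one linear layer.

    LoRA freezes the d_out x d_in weight W and trains two small matrices,
    B (d_out x rank) and A (rank x d_in), so the update is W + B @ A.
    """
    full = d_in * d_out
    adapter = rank * (d_in + d_out)
    return full, adapter


# Illustrative values (assumptions): a 4096 x 4096 projection and rank 16.
full, adapter = lora_param_counts(d_in=4096, d_out=4096, rank=16)
print(f"frozen weights: {full:,}")
print(f"trainable adapter weights: {adapter:,}")
print(f"trainable fraction: {adapter / full:.2%}")
```

With these assumed sizes, the adapter holds under 1% of the layer's parameters, so gradients and optimizer state are needed only for that small fraction.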
AI Document Intelligence
Intelligent Document Processing
Automated document processing pipeline that extracts structured data from unstructured PDFs using vision models and NLP. Handles invoices, contracts, and research papers with high accuracy.
Multi-Agent Chatbot
Agentic AI Assistant
Conversational multi-agent system with tool use, memory, and routing. Uses AutoGen to orchestrate specialized agents for research, analysis, and action planning with persistent conversation history.
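The routing and memory ideas can be sketched in plain Python. This is a simplified illustration and deliberately does not use AutoGen's API: the `Agent` and `Router` classes, the keyword lists, and the keyword-matching rule are all hypothetical stand-ins for what would be LLM-driven agent selection in a real system.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    keywords: tuple

    def handle(self, message):
        # Placeholder for a real LLM call or tool invocation.
        return f"[{self.name}] handling: {message}"


@dataclass
class Router:
    agents: list
    default: Agent
    history: list = field(default_factory=list)  # conversation memory

    def route(self, message):
        text = message.lower()
        # Pick the first agent whose keywords appear in the message;
        # fall back to the default agent when nothing matches.
        agent = next(
            (a for a in self.agents if any(k in text for k in a.keywords)),
            self.default,
        )
        reply = agent.handle(message)
        self.history.append((message, agent.name))
        return reply


research = Agent("research", ("find", "search", "source"))
analysis = Agent("analysis", ("compare", "analyze", "trend"))
planner = Agent("planner", ("plan", "schedule", "next step"))

router = Router(agents=[research, analysis], default=planner)
print(router.route("Find sources on vector databases"))
print(router.route("Compare the two retrieval strategies"))
print(router.route("What should we do first?"))
```

The `history` list here lives only in memory; persistent conversation history as described above would write the same records to a database or file.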
Let's Connect
Open to AI/ML engineering roles, research collaborations, and consulting opportunities.
Get in touch
Currently open to full-time AI/ML engineering roles, research positions, and select consulting projects.