Ciao, I'm Swathi.
Applied AI engineer building agentic systems and autonomous tooling.
Research Fellow @ Microsoft Research

Experience
Agentic system combining AST-based static analysis with LLM-driven classification for Rust code repair. 80-95% recall across 150+ crates and 300k+ LoC.
Pointer ownership inference for C→Rust using call graphs, topological propagation, and RL-based translation with backtracking. Reduced manual annotation ~80%.
Implemented on Model Context Protocol (MCP), integrated with GitHub Copilot and Claude Code.
Featured in Rust@M365 newsletter.
Invoice understanding with PaddleOCR + LLM extraction (92%+ accuracy). SVM email classifier at 99.6% accuracy.
Hybrid RAG architecture with SQL reasoning: 95%+ answer accuracy for e-commerce social media platform.
Built recommendation engine for commerce-focused social platform; improved engagement by 40% and CTR by 25%.
Django backend serving 2M+ company records for finance chatbot. Deployed ML pipelines via Docker + AWS.
Vue.js + Flask application with 30% scheduling accuracy improvement, saving 15+ hours per week.
Streamlined API testing with Postman, reducing system errors by 40% and increasing uptime 20%.
Projects
Autonomous Code Repair
Multi-agent system that detects and fixes coding guideline violations in Rust codebases. Combines AST-based static analysis to extract structural code features with LLM-driven classification.
Designed a multi-agent architecture with an orchestrator delegating to specialized subagents for violation detection, classification, and fix generation, operating in an autonomous loop with build/test validation. Achieved 80–95% recall (vs. 10–50% with vanilla LLM) and <10% missed violations; deployed across 150+ Rust crates and 300k+ LoC.
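A minimal sketch of the detect, classify, fix, validate loop described above. It is illustrative only: the `Violation` type, the `no-unwrap` rule, and all four stub functions are hypothetical stand-ins (plain string checks in place of AST analysis, LLM calls, and a real build/test step), not the project's actual code.

```python
from dataclasses import dataclass


@dataclass
class Violation:
    line: int  # index into the list of source lines
    rule: str  # hypothetical guideline identifier


def detect(source):
    # Stand-in for AST-based detection: flag every line containing .unwrap().
    return [Violation(i, "no-unwrap") for i, text in enumerate(source) if ".unwrap()" in text]


def classify(source, violation):
    # Stand-in for LLM-driven classification: treat hits in comment lines as false positives.
    return not source[violation.line].lstrip().startswith("//")


def fix(source, violation):
    # Stand-in for fix generation: rewrite .unwrap() to the ? operator on that line.
    patched = list(source)
    patched[violation.line] = patched[violation.line].replace(".unwrap()", "?")
    return patched


def validate(source):
    # Stand-in for build/test validation: only checks that parentheses balance per line.
    return all(text.count("(") == text.count(")") for text in source)


def run_repair_loop(source, max_rounds=5):
    """Orchestrator: repeat detect -> classify -> fix -> validate until clean or out of budget."""
    for _ in range(max_rounds):
        confirmed = [v for v in detect(source) if classify(source, v)]
        if not confirmed:
            return source, True  # nothing left to fix
        patched = fix(source, confirmed[0])
        if validate(patched):
            source = patched  # keep only fixes that pass validation
    return source, False  # budget exhausted with violations remaining


code = [
    "let f = open(path).unwrap();",
    "// old: x.unwrap()",
    "let n = parse(s).unwrap();",
]
repaired, clean = run_repair_loop(code)
```

Bounding the loop with `max_rounds` keeps a fix that never validates from spinning forever, and accepting a patch only after validation mirrors the build/test gate mentioned above.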
C → Rust Ownership Inference
Pointer ownership inference system analyzing semantics (singleton vs. array vs. null-terminated) by constructing call graphs and propagating annotations in reverse topological order.
Built type summarization pipeline extracting field-level ownership patterns from C structs via usage-site analysis. Implemented RL-based translation with backtracking: exploration stack of competing approaches, LLM-as-judge scoring, and reward-based approach selection. Reduced manual type annotation effort ~80%.
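A minimal sketch of propagating annotations over a call graph in reverse topological order, so every callee is summarized before its callers. The function names, the `owned`/`borrowed` labels, and the toy propagation rule are assumptions for illustration, not the system's actual analysis; `graphlib` requires Python 3.9+.

```python
from graphlib import TopologicalSorter

# Hypothetical call graph: each function maps to the set of functions it calls.
CALL_GRAPH = {
    "main": {"load", "release"},
    "load": {"alloc_buf"},
    "release": set(),
    "alloc_buf": set(),
}

# Hypothetical seed annotations for leaf functions.
SEED = {"alloc_buf": "owned", "release": "borrowed"}


def propagate(call_graph, seed):
    """Visit callees before callers and lift their annotations upward."""
    # TopologicalSorter expects node -> predecessors. Passing callees as the
    # predecessors makes static_order() yield callees first, which is the
    # reverse topological order of the caller -> callee graph.
    order = list(TopologicalSorter(call_graph).static_order())
    summary = dict(seed)
    for fn in order:
        if fn in summary:
            continue  # already annotated by a seed
        callee_kinds = {summary[c] for c in call_graph[fn] if c in summary}
        # Toy rule (an assumption): a function that calls anything "owned" is "owned".
        summary[fn] = "owned" if "owned" in callee_kinds else "borrowed"
    return order, summary


order, summary = propagate(CALL_GRAPH, SEED)
```

This simple version assumes an acyclic call graph: `static_order()` raises `graphlib.CycleError` on recursion, so recursive functions would need separate handling, for example by collapsing strongly connected components first.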
Invoice Understanding System
Production ML pipeline for automated invoice understanding using OCR and LLM-based structured extraction.
Built an end-to-end invoice processing system combining PaddleOCR with LLM-based field extraction, achieving 92%+ extraction accuracy. Trained and optimized an SVM (RBF kernel) classifier for invoice email detection with 99.6% accuracy and 99%+ precision/recall. Designed scalable ML inference pipelines and deployed them via Docker and AWS for production use.
Financial Intelligence Backend
Scalable backend serving financial intelligence queries across millions of company records.
Architected and deployed a Django-based backend serving financial data across 2M+ company records. Implemented optimized database querying with PostgreSQL and integrated secure authentication using Auth0. Designed APIs enabling LLM-powered financial query responses over structured databases.
LLM GMeet Assistant
Chrome extension that captures live meeting transcripts and generates summaries and action items.
Built a browser extension using TypeScript, React, and Chrome APIs to capture live meeting transcripts and process them with LLMs for summarization and action-item extraction. Implemented scenario-aware prompting to generate context-specific outputs for meetings.
Distributed File Sharding
Client-server architecture with consistent hashing for fault-tolerant file partitioning and retrieval.
Achieved 98% sharding efficiency by evenly distributing files across 10 servers using MD5 hash-based sharding. Reduced data retrieval time by 75% compared to a non-sharded system, with an average retrieval time of 3ms per file.
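A minimal sketch of MD5-based consistent hashing across 10 servers, in the spirit of the design above. The `HashRing` class, the server and file names, and the use of 100 virtual nodes per server are assumptions for illustration rather than the project's implementation.

```python
import bisect
import hashlib


def md5_int(key: str) -> int:
    """Map a string to a point on the ring using its MD5 digest."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, vnodes=100):
        # Each server owns several points (virtual nodes) to even out the load.
        self._ring = sorted(
            (md5_int(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, filename: str) -> str:
        # The first ring point clockwise from the file's hash owns the file;
        # the modulo wraps past the largest point back to the start of the ring.
        idx = bisect.bisect_right(self._points, md5_int(filename)) % len(self._points)
        return self._ring[idx][1]


servers = [f"server-{n}" for n in range(10)]
ring = HashRing(servers)

# Count how many of 10,000 hypothetical files land on each server.
counts = {}
for n in range(10_000):
    owner = ring.server_for(f"file-{n}.bin")
    counts[owner] = counts.get(owner, 0) + 1
```

Compared with plain `hash % num_servers` sharding, removing a server from the ring only reassigns the files that server held, which is what makes the scheme fault-tolerant.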
Skills
Languages
ML & AI
Frameworks
Infrastructure
Extracurriculars
Education
Manipal Institute of Technology
B.Tech CS & Engineering (Minor: Data Science) · CGPA 9.26 · 2020–2024
Computer Networks, Database Management, Distributed Systems, Data Science, Compiler Design, Operating System, Computer Vision, Data Structures and Algorithms