Ciao, I'm Swathi.

Applied AI engineer building agentic systems and autonomous tooling.

Research Fellow @ Microsoft Research

Experience

Agentic system combining AST-based static analysis with LLM-driven classification for Rust code repair. 80-95% recall across 150+ crates and 300k+ LoC.

Pointer ownership inference for C→Rust using call graphs, topological propagation, and RL-based translation with backtracking. Reduced manual annotation ~80%.

Implemented on Model Context Protocol (MCP), integrated with GitHub Copilot and Claude Code.

Featured in Rust@M365 newsletter.

Invoice understanding with PaddleOCR + LLM extraction (92%+ accuracy). SVM email classifier at 99.6% accuracy.

Hybrid RAG architecture with SQL reasoning: 95%+ answer accuracy for an e-commerce social media platform.

Built recommendation engine for commerce-focused social platform; improved engagement by 40% and CTR by 25%.

Django backend serving 2M+ company records for finance chatbot. Deployed ML pipelines via Docker + AWS.

Vue.js + Flask application with 30% scheduling accuracy improvement, saving 15+ hours per week.

Streamlined API testing with Postman, reducing system errors by 40% and increasing uptime by 20%.

Projects

Autonomous Code Repair

Multi-agent system that detects and fixes coding guideline violations in Rust codebases. Combines AST-based static analysis to extract structural code features with LLM-driven classification.

Designed a multi-agent architecture in which an orchestrator delegates to specialized subagents for violation detection, classification, and fix generation, operating in an autonomous loop with build/test validation. Achieved 80-95% recall (vs. 10-50% with a vanilla LLM) and <10% missed violations; deployed across 150+ Rust crates and 300k+ LoC.

Agentic AI · Rust · MCP · Multi-Agent · AST

C → Rust Ownership Inference

Pointer ownership inference system analyzing semantics (singleton vs. array vs. null-terminated) by constructing call graphs and propagating annotations in reverse topological order.

Built type summarization pipeline extracting field-level ownership patterns from C structs via usage-site analysis. Implemented RL-based translation with backtracking: exploration stack of competing approaches, LLM-as-judge scoring, and reward-based approach selection. Reduced manual type annotation effort ~80%.

Type Systems · Graph Algorithms · RL · AST
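The reverse-topological propagation described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the one-pointer-per-function simplification, and the rule that conflicting callee kinds fall back to "unknown" are all assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

UNKNOWN = "unknown"
SINGLETON = "singleton"
ARRAY = "array"
NULL_TERMINATED = "null_terminated"


def propagate_kinds(call_graph, seeds):
    """Propagate pointer kinds from callees up to callers.

    call_graph maps each caller to the set of callees it forwards its
    pointer argument to; seeds maps functions to kinds already known
    from local analysis. Both shapes are assumptions for this sketch.
    """
    kinds = dict(seeds)
    # TopologicalSorter treats each mapped value as a set of dependencies,
    # so feeding it caller -> callees yields callees before their callers,
    # i.e. reverse topological order of the call graph.
    # Recursive (cyclic) call graphs raise graphlib.CycleError here.
    for fn in TopologicalSorter(call_graph).static_order():
        if fn in kinds:
            continue  # keep seeded annotations as they are
        found = {kinds[c] for c in call_graph.get(fn, ()) if kinds[c] != UNKNOWN}
        # Exactly one distinct callee kind: inherit it. Otherwise stay unknown.
        kinds[fn] = found.pop() if len(found) == 1 else UNKNOWN
    return kinds


# Hypothetical call graph and seed annotations.
call_graph = {
    "print_name": {"strlen_wrapper"},
    "strlen_wrapper": {"c_strlen"},
    "sum_all": {"sum_range"},
    "mixed": {"c_strlen", "sum_range"},
}
seeds = {"c_strlen": NULL_TERMINATED, "sum_range": ARRAY}
kinds = propagate_kinds(call_graph, seeds)
```

Visiting callees first means every caller is summarized once, after all the functions it depends on already carry an annotation.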

Invoice Understanding System

Production ML pipeline for automated invoice understanding using OCR and LLM-based structured extraction.

Built an end-to-end invoice processing system combining PaddleOCR with LLM-based field extraction, achieving 92%+ extraction accuracy. Trained and optimized an SVM (RBF kernel) classifier for invoice email detection with 99.6% accuracy and 99%+ precision/recall. Designed scalable ML inference pipelines and deployed them via Docker and AWS for production use.

PaddleOCR · Scikit-learn · LLMs · AWS · Docker

Financial Intelligence Backend

Scalable backend serving financial intelligence queries across millions of company records.

Architected and deployed a Django-based backend serving financial data across 2M+ company records. Implemented optimized database querying with PostgreSQL and integrated secure authentication using Auth0. Designed APIs enabling LLM-powered financial query responses over structured databases.

Django · PostgreSQL · Auth0 · APIs · AWS

LLM GMeet Assistant

Chrome extension that captures live meeting transcripts and generates summaries and action items.

Built a browser extension using TypeScript, React, and Chrome APIs to capture live meeting transcripts and process them with LLMs for summarization and action-item extraction. Implemented scenario-aware prompting to generate context-specific outputs for meetings.

TypeScript · React · Chrome APIs · LLMs

Distributed File Sharding

Client-server architecture with consistent hashing for fault-tolerant file partitioning and retrieval.

Achieved 98% sharding efficiency by evenly distributing files across 10 servers using MD5 hash-based sharding. Reduced data retrieval time by 75% compared to a non-sharded system, with an average retrieval time of 3 ms per file.

Distributed Systems · Consistent Hashing · Python
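As a rough illustration of MD5-based consistent hashing across a fixed set of servers, here is a minimal Python sketch. It is not taken from the project: the `HashRing` class, the virtual-node count, and the server and file names are hypothetical.

```python
import bisect
import hashlib


def md5_int(key: str) -> int:
    """Map a string to a point on the ring using its MD5 digest."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, servers, vnodes=100):
        if not servers:
            raise ValueError("at least one server is required")
        # Each server owns `vnodes` points on the ring, which smooths out
        # the share of files assigned to each server.
        self._ring = sorted(
            (md5_int(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, filename: str) -> str:
        """Return the server owning the first ring point after the file's hash."""
        idx = bisect.bisect_right(self._points, md5_int(filename))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring


servers = [f"server-{i}" for i in range(10)]
ring = HashRing(servers)
owner = ring.server_for("report.pdf")
```

With this layout, removing a server only reassigns the files that lived on it; files on the other servers keep their placement, which is what makes the scheme fault-tolerant.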

Skills

Languages

Python

Rust

C/C++

TypeScript

ML & AI

LLMs

Agentic AI

Multi-Agent Systems

RAG

AST Analysis

Frameworks

LangChain

FAISS

PaddleOCR

Django

FastAPI

Infrastructure

Model Context Protocol

PostgreSQL

Docker

AWS

Extracurriculars

Running: signing up for marathons to remember pain is temporary

Public speaking & debates

Netflix: competitive long-form content consumption

Beaches

True crime documentaries

Learning to surf

Learning to tuft, including the slightly unhinged duck rug

Education

Manipal Institute of Technology

B.Tech CS & Engineering (Minor: Data Science) · CGPA 9.26 · 2020–2024

Computer Networks, Database Management, Distributed Systems, Data Science, Compiler Design, Operating Systems, Computer Vision, Data Structures and Algorithms

Experience

Research Fellow · Microsoft Research

Machine Learning Intern · FischerJordan

Software Development Intern · SAP Labs India
