The Art of Translation
What if your model could just look where it needed to? Exploring attention mechanisms in sequence-to-sequence learning and how they revolutionize machine translation.
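The idea of "looking where it needs to" can be shown in a toy form: a decoder query scores each encoder position, the scores become softmax weights, and the context vector is the weighted sum. A minimal NumPy sketch (the vectors here are made-up stand-ins for encoder/decoder hidden states, not code from the post):

```python
import numpy as np

def attention(query, keys, values):
    """Dot-product attention: score each source position against the
    query, softmax the scores into weights, and return the weighted
    sum of the value vectors."""
    scores = keys @ query                    # one score per source position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax -> attention weights
    return weights @ values, weights         # context vector, weights

# Toy setup: 3 source positions with orthogonal 4-dim hidden states,
# and a decoder query identical to source position 1.
keys = values = np.eye(3, 4)
context, weights = attention(keys[1], keys, values)
print(weights.argmax())   # 1 -- the model "looks" at position 1
```

Because the weights are a softmax, they always sum to one; the model can spread its gaze or concentrate it on a single source token.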
Hello! I'm a Master's student at Georgia Tech, advised by Prof. Bo Dai. My research currently focuses on Long Context Modeling and Reinforcement Learning.
Computer Science and Mathematics at Georgia Tech, focused on Machine Learning and Mathematical Modeling.
Most recently I led the software stack at Swish Robotics. Before that, I worked on fraud detection models and infrastructure at Credit Karma and on Point of Sale (PoS) systems at NCR. Indian National Mathematics Olympiad finalist and a top-300 rank on the Putnam Mathematical Competition.
Benchmarking Flash Attention v1 and v2 in Triton against a naive PyTorch implementation of Scaled Dot-Product Attention and Multi-Head Attention.
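For context, the naive baseline in such benchmarks materializes the full (seq, seq) score matrix, which is exactly the memory traffic Flash Attention avoids. A hedged NumPy sketch of scaled dot-product and multi-head attention (shapes and names are illustrative, not the benchmarked Triton/PyTorch code):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sdpa(Q, K, V):
    """Naive scaled dot-product attention: builds the full (seq, seq)
    score matrix in memory before the softmax and value mix."""
    d = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d)   # (..., seq, seq)
    return softmax(scores) @ V

def multi_head(Q, K, V, n_heads):
    """Split the model dim into heads, attend per head, re-concatenate."""
    seq, d_model = Q.shape
    d_head = d_model // n_heads
    split = lambda X: X.reshape(seq, n_heads, d_head).swapaxes(0, 1)
    out = sdpa(split(Q), split(K), split(V))        # (heads, seq, d_head)
    return out.swapaxes(0, 1).reshape(seq, d_model)

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 16)) for _ in range(3))
out = multi_head(Q, K, V, n_heads=4)
print(out.shape)   # (8, 16)
```

A sanity check on the sketch: with `n_heads=1`, multi-head attention reduces to plain scaled dot-product attention on the full model dimension.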
The Exploration vs. Exploitation Dilemma and a brief introduction to Reinforcement Learning
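The dilemma has a classic concrete form in the multi-armed bandit: keep pulling the arm that looks best, or spend pulls learning whether another arm is better. A small epsilon-greedy sketch (the arm means and hyperparameters are made up for illustration):

```python
import random

def epsilon_greedy(true_means, eps=0.1, steps=2000, seed=0):
    """Epsilon-greedy on a Gaussian multi-armed bandit: with probability
    eps pull a random arm (explore), otherwise pull the arm with the
    best reward estimate so far (exploit)."""
    rng = random.Random(seed)
    n = len(true_means)
    counts, estimates = [0] * n, [0.0] * n
    for _ in range(steps):
        if rng.random() < eps:
            arm = rng.randrange(n)                          # explore
        else:
            arm = max(range(n), key=estimates.__getitem__)  # exploit
        reward = true_means[arm] + rng.gauss(0, 1)          # noisy reward
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean
    return estimates, counts

estimates, counts = epsilon_greedy([0.0, 1.0])
print(counts)   # with enough steps, the better arm dominates the pull counts
```

Setting `eps=0` recovers pure exploitation, which can lock onto a bad arm after unlucky early rewards; that failure mode is the heart of the dilemma.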
Working on long-context models and memory agents
Worked with Prof. Clio Andris on geographic visualizations and spatial information theory.
A Design Space of Node Placement Methods for Geospatial Network Visualizations.
Implemented LoRA adapters on a 7B Mistral model and distilled it into a smaller student model.
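As a rough picture of what a LoRA adapter does (a NumPy sketch, not the actual Mistral implementation): the frozen weight W gains a trainable low-rank update scaled by alpha/r, with B zero-initialized so training starts exactly at the base model.

```python
import numpy as np

class LoRALinear:
    """Linear layer with a LoRA adapter: the frozen weight W is augmented
    by a low-rank update (alpha / r) * B @ A; only A and B are trained."""
    def __init__(self, W, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = W.shape
        self.W = W                                    # frozen base weight
        self.A = rng.normal(0, 0.01, size=(r, d_in))  # trainable, small init
        self.B = np.zeros((d_out, r))                 # trainable, zero init
        self.scale = alpha / r

    def __call__(self, x):
        return x @ (self.W + self.scale * self.B @ self.A).T

rng = np.random.default_rng(1)
W = rng.normal(size=(6, 5))
layer = LoRALinear(W)
x = rng.normal(size=(2, 5))
# With B zero-initialized, the adapted layer matches the base layer exactly.
print(np.allclose(layer(x), x @ W.T))   # True
```

The appeal is the parameter count: for this 6x5 weight, full fine-tuning updates 30 values, while the rank-4 adapter trains A and B instead and W stays frozen.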
Trained a sequence-to-sequence model, with and without the attention mechanism, to translate natural language into Python snippets.