The Art of Translation
What if your model could just look where it needed to? Exploring attention mechanisms in sequence-to-sequence learning and how they revolutionize machine translation.
Hello! I'm a Master's student at Georgia Tech advised by Prof. Bo Dai. My research currently focuses on Long Context Modeling and Reinforcement Learning. For the foreseeable future, I hope to work on RL post-training and/or AI for Science.
Computer Science and Mathematics at Georgia Tech. Focused on Machine Learning and Mathematical Modeling.
Most recently I led the software stack at Swish Robotics. Before that, I worked on Fraud Detection Models and Infrastructure at Credit Karma. I was an Indian National Math Olympiad finalist and earned a top-300 rank on the Putnam Math Contest.
Benchmarking Flash Attention v1 and v2 in Triton against a naive PyTorch implementation of Scaled Dot-Product Attention and Multi-Head Attention.
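For reference, the naive baseline looks roughly like this: a minimal PyTorch sketch of scaled dot-product attention that materializes the full attention matrix, which is exactly the memory traffic Flash Attention is designed to avoid. The function name and shapes are illustrative, not the benchmark's actual code.

```python
import math
import torch

def naive_sdpa(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, head_dim)
    # Materializes the full (seq_len x seq_len) score matrix in HBM,
    # so memory grows quadratically with sequence length.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```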
The Exploration vs. Exploitation Dilemma and a brief introduction to Reinforcement Learning
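The textbook setting for this dilemma is the multi-armed bandit. A minimal epsilon-greedy sketch, where the agent mostly exploits its current best estimate but occasionally explores a random arm (all names and parameters here are illustrative, not taken from the post):

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=10_000):
    """Epsilon-greedy on a Gaussian multi-armed bandit."""
    n_arms = len(true_means)
    counts = [0] * n_arms        # pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    for _ in range(steps):
        if random.random() < epsilon:  # explore: random arm
            arm = random.randrange(n_arms)
        else:                          # exploit: current best estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = random.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # incremental update of the running mean
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts
```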
Optimizer Routing Can Benefit Scientific Discovery
Building RL training infrastructure and environments for Machine Learning Engineering tasks on top of MLE-Dojo. Running evaluations to test whether benchmark failures are actually bottlenecked by long context.
Worked with Prof. Clio Andris on geographic visualizations and spatial information theory.
A Design Space of Node Placement Methods for Geospatial Network Visualizations.
This is a small study of ablations of the Muon optimizer on FAIR Chem's GemNet-OC architecture, using the OMat24 dataset. The goal was to see whether routing Muon to specific blocks helps this architecture.
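Routing here means splitting parameters across optimizers. A minimal sketch of one way to do it, assuming a torch.optim-style Muon implementation; the `muon` import, the name-based block filter, and the learning rates are all assumptions for illustration:

```python
import torch
from muon import Muon  # assumes a torch.optim-style Muon implementation

def build_routed_optimizers(model, muon_lr=0.02, adamw_lr=3e-4):
    # Route: 2-D weight matrices in the targeted blocks go to Muon;
    # everything else (embeddings, norms, biases) goes to AdamW.
    muon_params, adamw_params = [], []
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        if p.ndim >= 2 and "interaction" in name:  # illustrative block filter
            muon_params.append(p)
        else:
            adamw_params.append(p)
    return [
        Muon(muon_params, lr=muon_lr),
        torch.optim.AdamW(adamw_params, lr=adamw_lr),
    ]
```

Each training step then does a single backward pass and calls `step()` on both optimizers.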
Implemented LoRA adapters on a 7B Mistral model and distilled it into a smaller student model.
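With Hugging Face's peft library, attaching LoRA adapters takes only a few lines. A sketch under assumed hyperparameters; the checkpoint, rank, alpha, and target modules are illustrative, not necessarily the ones used in the project:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights train
```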
Trained a sequence-to-sequence model with and without the attention mechanism to translate natural language to Python snippets.
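At each decoding step, attention lets the decoder compute a context vector as a softmax-weighted sum of encoder states, so the model can "look where it needs to" in the source sequence. A minimal sketch of Luong-style dot-product attention; the project's actual scoring function may differ:

```python
import torch
import torch.nn.functional as F

def luong_attention(decoder_hidden, encoder_outputs):
    # decoder_hidden: (batch, hidden)        -- current decoder state
    # encoder_outputs: (batch, src_len, hidden)
    # Dot-product score of the decoder state against every source position.
    scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(2)).squeeze(2)
    weights = F.softmax(scores, dim=1)  # (batch, src_len)
    # Context vector: weighted sum of the encoder states being attended to.
    context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)
    return context, weights
```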