Projects

Research projects, open-source implementations, and production systems I've built.

LLM Alignment Lab

From-scratch PyTorch implementation of a complete LLM alignment pipeline that combines QLoRA (NF4 4-bit quantization with double quantization) with DPO (Direct Preference Optimization). The pipeline has two stages: supervised fine-tuning, then preference alignment. A 34-test validation suite checks quantization fidelity and loss correctness. No dependency on Hugging Face PEFT, TRL, or bitsandbytes.

PyTorch · QLoRA · DPO · LLM Alignment · NF4 Quantization
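
To make the preference-alignment stage concrete, here is a minimal plain-Python sketch of the standard DPO loss for a single preference pair. It is an illustration only, not the project's PyTorch code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions, and the inputs are assumed to be summed sequence log-probabilities under the trained policy and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities.

    Hypothetical helper for illustration; names and default beta are assumptions.
    """
    # Implicit rewards: how much the policy raises each response's
    # log-probability relative to the frozen reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # -log(sigmoid(margin)) written as softplus(-margin) for numerical stability.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# When the policy matches the reference, the margin is 0 and the loss is log(2).
baseline = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# Raising the chosen response's log-probability lowers the loss.
improved = dpo_loss(-10.0, -15.0, -12.0, -15.0)
```

The loss depends only on the difference between the two implicit rewards, so it falls as the policy prefers the chosen response more strongly than the reference does.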

DoRA: Weight-Decomposed Low-Rank Adaptation


From-scratch PyTorch implementation of DoRA (ICML 2024 Oral, NVIDIA), which decomposes pretrained weights into independent magnitude and direction components for fine-tuning. Supports LLaMA, Mistral, Gemma, GPT-2, OPT, and Phi architectures. Adds only 0.01% more trainable parameters than LoRA while consistently improving performance. Includes weight merging for zero-overhead inference.

PyTorch · DoRA · LoRA · Fine-tuning · ICML 2024
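
The magnitude/direction decomposition and the merge step can be sketched in a few lines. The example below uses plain Python lists rather than PyTorch tensors, so it illustrates the published DoRA formula W' = m · (W0 + BA) / ‖W0 + BA‖ (column-wise norm) and is not the project's code. The helper names `matmul`, `column_norms`, and `dora_weight`, and the tiny example matrices, are assumptions for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def column_norms(w):
    """L2 norm of each column of w."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]


def dora_weight(w0, lora_b, lora_a, magnitude):
    """Merged DoRA weight: magnitude[j] * (W0 + B A)[:, j] / ||(W0 + B A)[:, j]||."""
    delta = matmul(lora_b, lora_a)  # low-rank update to the direction
    v = [[w0[i][j] + delta[i][j] for j in range(len(w0[0]))]
         for i in range(len(w0))]
    norms = column_norms(v)
    return [[magnitude[j] * v[i][j] / norms[j] for j in range(len(v[0]))]
            for i in range(len(v))]


# Hypothetical 2x2 pretrained weight with a rank-1 adapter.
w0 = [[3.0, 0.0],
      [4.0, 2.0]]
magnitude = column_norms(w0)        # initialised from the pretrained weight
lora_a = [[0.5, -0.5]]              # shape (r, k) with r = 1
lora_b_init = [[0.0], [0.0]]        # shape (d, r), zero at initialisation

# With B = 0 the merged weight reproduces the pretrained weight.
merged_init = dora_weight(w0, lora_b_init, lora_a, magnitude)

# After an update to B, the direction changes but each column's norm
# is still set by the magnitude vector.
lora_b_trained = [[1.0], [0.0]]
merged_trained = dora_weight(w0, lora_b_trained, lora_a, magnitude)
```

Because the result is an ordinary dense matrix of the same shape as the pretrained weight, it can replace that weight directly, which is why merging adds no inference overhead. The only parameters beyond LoRA's `A` and `B` are the entries of the magnitude vector, one per column.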