Sanjay Srinivasa

Member of Technical Staff

I build and fine-tune ML systems — from diffusion models and RAG pipelines to production-scale NLP. Currently at Corsair Gaming, working on generative AI and information retrieval. Previously at Optum, building healthcare AI across NLP, computer vision, and large-scale data pipelines.

Passionate about machine learning and artificial intelligence — always exploring what's next in the space.

Currently

Data Scientist at Corsair Gaming — Milpitas, CA

Fine-tuning Qwen diffusion models via LoRA with split training architectures. Building enterprise RAG systems over 50K+ documents with sub-second latency. Deploying on-prem LLM inference at scale with vLLM and Docker.

Experience

Corsair Gaming Inc.

Milpitas, California

Data Scientist

Dec 2025 — Present
  • Fine-tuning Qwen diffusion models via LoRA with a split training architecture, caching VAE and text encodings to train only the denoiser on precomputed latents.
  • Enabled fine-tuning within per-GPU VRAM limits via FP8 quantization with BF16 compute and gradient checkpointing, reducing memory footprint by 50%.

Data Science Intern

Jun 2025 — Dec 2025
  • Built an enterprise RAG system over 50K+ documents using transformer-based embeddings and metadata-driven retrieval, achieving 95% Recall@5 and sub-second latency.
  • Deployed a scalable on-prem LLM inference system (vLLM, Docker) over a 2M+ vector index with benchmarked throughput and latency.

Optum (UnitedHealth Group)

India

Associate AI/ML Engineer

Jul 2020 — Aug 2024
  • Built ensemble models (XGBoost, Random Forest, Logistic Regression) and validated lift through holdout experiments, improving debt recovery by 70%.
  • Developed a multi-stage NLP pipeline using GPT-3.5, Vicuna, and DistilBERT for intent classification, reducing repeat calls and increasing NPS by 82%.
  • Engineered a PySpark + SQL pipeline on Databricks to process 100K healthcare documents with automated PHI de-identification at 98.6% precision.

Data Scientist

  • Designed DBSCAN-based clustering to detect anomalies in patient claims at 98.5% precision.
  • Fine-tuned summarization models on call transcripts, automating post-call documentation and reducing handling time by 67% across 10M+ transcripts.

Software Engineer

  • Fine-tuned YOLOv5 and built an end-to-end OCR pipeline deployed via ONNX and Triton Inference Server, achieving 93% mAP@0.5.
  • Deployed a Dockerized audio-to-text transcription pipeline on Azure using Speech Services and Jenkins CI/CD, reducing latency by 35%.

Education

University of California, Riverside

Master of Science, Computer Science

Coursework: ML, NLP & DL; GPA: 3.71

Sep 2024 — Dec 2025

Riverside, USA

R.V. College of Engineering (RVCE)

Bachelor of Engineering, Computer Science

Aug 2016 — Jun 2020

Bangalore, India

Skills

AI & Machine Learning

PyTorch, Deep Learning, NLP, RAG, Diffusion Models, Information Retrieval, Search Relevance & Ranking, A/B Testing

Data & Infrastructure

PySpark, Databricks, Snowflake, SQL, AWS S3, Docker, vLLM, Vector Databases

Cloud & Tools

Azure (Compute, AzureML), Jenkins, ONNX, Triton Inference Server

Languages

Python, SQL, C, C++