Resume
General Information
| Full Name | Manish Nagaraj |
| Email | mnagara@purdue.edu |
| Phone | +1-765-701-7970 |
| LinkedIn | http://linkedin.com/in/m-nagaraj |
| GitHub | https://github.com/manishnagaraj |
| Website | https://manishnagaraj.github.io |
| Location | West Lafayette, IN, USA |
Education
- May 2026 (expected): PhD in Electrical & Computer Engineering
  Purdue University, West Lafayette, IN
  - Research Areas: Data Efficiency, Efficient Fine-tuning, Federated Learning, Large Language Models
  - Research Advisor: Prof. Kaushik Roy
- May 2019: MS in Electrical & Computer Engineering
  Purdue University, West Lafayette, IN
  - Thesis: Energy Efficient Byzantine Agreement Protocols for Cyber-Physical Resilience
- May 2017: Bachelor of Engineering in Electronics and Communications
  PES Institute of Technology and Science, Bangalore, India
Research Projects
- 2025: TRIM - Token-wise Attention-Derived Saliency for Data-Efficient Instruction Tuning
  Ph.D. Dissertation, Purdue University
  - Proposed a forward-only, gradient-free data selection method that builds attention-based token “fingerprints” to score instruction-tuning examples (a rough sketch follows this entry).
  - Selected instruction examples that most improved performance on reasoning and commonsense benchmarks for open-source LLMs under small data budgets.
  - TRIM-selected coresets outperformed state-of-the-art coreset baselines by up to ~9% on downstream tasks and, in some settings, matched or surpassed full-data fine-tuning.
  - Demonstrated computational scalability: in a shared coreset selection setup, TRIM achieved higher downstream accuracy while running ~2.6× faster than the best gradient-based baseline.
  - Under review.
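A minimal sketch of the flavor of this approach, not the published method: the pooling scheme (head-averaged last-layer attention), the entropy scoring rule, and the gpt2 placeholder model are all assumptions for illustration.

```python
# Illustrative sketch only: attention-derived token saliency for data scoring.
# The pooling choice, the entropy score, and the gpt2 placeholder are
# assumptions, not the method as published.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")                  # placeholder LLM
model = AutoModelForCausalLM.from_pretrained("gpt2", output_attentions=True)
model.eval()

@torch.no_grad()
def token_fingerprint(text: str) -> torch.Tensor:
    """Forward-only and gradient-free: the attention mass each token receives,
    averaged over heads of the last layer, normalized to a distribution."""
    ids = tok(text, return_tensors="pt", truncation=True)
    att = model(**ids).attentions[-1]          # (1, heads, seq, seq)
    saliency = att.mean(dim=1)[0].sum(dim=0)   # total attention onto each token
    return saliency / saliency.sum()

def select_coreset(texts, budget):
    """Rank examples by fingerprint entropy (one plausible scoring rule)
    and keep the top `budget` examples."""
    def entropy(p):
        return -(p * (p + 1e-9).log()).sum().item()
    scores = [entropy(token_fingerprint(t)) for t in texts]
    order = sorted(range(len(texts)), key=scores.__getitem__, reverse=True)
    return [texts[i] for i in order[:budget]]
```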
- 2025: Coresets from Trajectories - Selecting Data via Correlation of Loss Differences
  Ph.D. Dissertation, Purdue University
  - Introduced Correlation of Loss Differences (CLD), a gradient-free coreset metric that ranks training points by how closely their loss-change trajectories correlate with those of a small validation set (sketched after this entry).
  - Established a convergence guarantee: training on CLD-selected coresets tracks full-data optimization up to a provable error bound.
  - On CIFAR-100 and ImageNet-1k, achieved state-of-the-art coreset selection performance, typically within ~1% of more computationally expensive methods across subset sizes.
  - Demonstrated cross-architecture transfer: coresets selected with small proxy CNNs generalized to larger CNN and vision-transformer models with <1% accuracy drop.
  - Accepted for publication at TMLR 2025.
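A rough sketch of the CLD scoring rule as described above, assuming per-example losses have already been logged at a series of training checkpoints; the plain Pearson estimator and top-k selection below are simplifications.

```python
# Sketch of Correlation of Loss Differences (CLD) scoring. Assumes losses
# were logged per example at T checkpoints; the Pearson correlation and
# top-k selection are illustrative simplifications.
import numpy as np

def cld_scores(train_losses, val_losses):
    """train_losses: (T, N) per-checkpoint losses for N training points.
    val_losses: (T, M) per-checkpoint losses for M validation points.
    Returns one CLD score per training point."""
    d_tr = np.diff(np.asarray(train_losses), axis=0)             # (T-1, N)
    d_va = np.diff(np.asarray(val_losses), axis=0).mean(axis=1)  # (T-1,)
    d_tr = d_tr - d_tr.mean(axis=0)                 # center both trajectories
    d_va = d_va - d_va.mean()
    num = (d_tr * d_va[:, None]).sum(axis=0)
    den = np.linalg.norm(d_tr, axis=0) * np.linalg.norm(d_va) + 1e-12
    return num / den                                # Pearson correlation per point

def select_coreset(train_losses, val_losses, budget):
    """Keep the `budget` training points whose loss-difference trajectories
    correlate most strongly with the mean validation trajectory."""
    return np.argsort(-cld_scores(train_losses, val_losses))[:budget]
```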
- 2024: TOFU - Federated Learning with Data and Communication Efficiency
  Ph.D. Research, Purdue University
  - Proposed a federated learning scheme that encodes each client's model update as gradients on a small synthetic dataset, transmitting these proxies instead of full weight updates (sketched after this entry).
  - On MNIST and CIFAR-10, achieved up to ~4x and ~6.6x lower communication cost than a standard federated learning baseline, while maintaining comparable final accuracy.
  - Enhanced privacy against gradient inversion attacks: proxy data and reconstructed inputs resemble noise and fail to reveal meaningful client information.
  - Studied accuracy-communication-privacy trade-offs by varying proxy size and update frequency, yielding practical guidelines for bandwidth-constrained federated deployments.
  - Published in IEEE Access 2024.
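A hedged sketch of the client-side encoding step: the proxy count, MNIST-sized shapes, KL objective, cosine-matching loss, and optimizer settings are all illustrative assumptions rather than the published configuration.

```python
# Sketch of a TOFU-style encoding step: learn a few synthetic inputs/labels
# whose gradients approximate a client's weight update, then transmit the
# proxies. Proxy count, shapes, KL loss, and cosine matching are assumptions.
import torch
import torch.nn.functional as F

def encode_update(model, target_update, n_proxy=10, steps=200, lr=0.1):
    """target_update: per-parameter delta tensors to encode as proxy data."""
    x = torch.randn(n_proxy, 1, 28, 28, requires_grad=True)  # synthetic inputs
    y = torch.randn(n_proxy, 10, requires_grad=True)         # soft labels
    opt = torch.optim.Adam([x, y], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]
    for _ in range(steps):
        opt.zero_grad()
        loss = F.kl_div(F.log_softmax(model(x), dim=1),
                        F.softmax(y, dim=1), reduction="batchmean")
        grads = torch.autograd.grad(loss, params, create_graph=True)
        # push grad(proxy loss) toward the true update, layer by layer
        mismatch = sum(1 - F.cosine_similarity(g.flatten(), t.flatten(), dim=0)
                       for g, t in zip(grads, target_update))
        mismatch.backward()
        opt.step()
    return x.detach(), y.detach()  # sent to the server instead of weight deltas

# Hypothetical usage: encode a stand-in update for a tiny MNIST classifier.
net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
proxy_x, proxy_y = encode_update(net, [torch.randn_like(p) for p in net.parameters()])
```

On the server side, replaying the proxies through its copy of the model recovers an approximation of the client's update, which is where the communication savings come from.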
- 2023: DOTIE - Energy-Efficient Object Detection Using Event Cameras
  Ph.D. Research, Purdue University
  - Developed an event-camera object detection pipeline that uses a lightweight spiking layer plus density-based clustering to isolate moving objects without frame reconstruction (sketched after this entry).
  - On the MVSEC outdoor driving dataset, more than doubled the mean IoU of prior event-based methods and achieved near-perfect foreground detection.
  - Achieved roughly six orders of magnitude lower energy consumption and significantly reduced latency compared to a YOLO CNN baseline on the same data.
  - Maintained performance across diverse scenes with minimal retuning, enabling deployment in low-power autonomous navigation and neuromorphic systems.
  - Presented at ICRA 2023 and CVPR 2023 Workshops.
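A toy sketch of the pipeline shape only: a per-pixel leaky-integrate filter stands in for the spiking layer and scikit-learn's DBSCAN for the clustering stage; the sensor resolution, decay rate, and thresholds are assumptions.

```python
# Toy sketch of a DOTIE-style pipeline: a per-pixel leaky-integrate filter
# stands in for the spiking layer, DBSCAN for the density-based clustering.
# Resolution, decay, and thresholds below are assumptions.
import numpy as np
from sklearn.cluster import DBSCAN

def spiking_filter(events, shape=(260, 346), decay=0.9, threshold=3.0):
    """events: time-sorted rows of (t, x, y, polarity). A pixel 'spikes'
    (its event is kept) only when enough recent events hit it, which
    suppresses background noise without reconstructing frames."""
    potential = np.zeros(shape)
    kept = []
    for t, x, y, p in events:
        potential *= decay                   # coarse leak between events
        potential[int(y), int(x)] += 1.0     # integrate the incoming event
        if potential[int(y), int(x)] >= threshold:
            kept.append((x, y))
            potential[int(y), int(x)] = 0.0  # reset after a spike
    return np.array(kept)

def detect_objects(events, eps=5.0, min_samples=10):
    """Cluster surviving events spatially; each cluster becomes a box."""
    pts = spiking_filter(events)
    if len(pts) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts)
    return [(c[:, 0].min(), c[:, 1].min(), c[:, 0].max(), c[:, 1].max())
            for c in (pts[labels == k] for k in set(labels) - {-1})]
```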
Industry Experience
- Summer 2023: Research Intern, Integrated Systems Team
  Latent AI, Skillman, NJ
  - Prototyped an unsupervised anomaly detection pipeline for automated target recognition using state-of-the-art methods via Anomalib on MVTec AD and internal sensor datasets.
  - Built an interactive labeling tool combining classical computer vision and SAM to generate pixel-wise masks on noisy imagery, reducing manual annotation time.
  - Exported top models to ONNX and used the Latent AI Efficient Inference Platform (LEIP) to compile and optimize them for edge devices, achieving 4x energy-efficiency gains over unoptimized baselines (export step sketched below).
  - Summarized experiments and findings in internal documentation and presentations to support ongoing edge inference pipeline development.
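The ONNX export step referenced above is roughly the standard PyTorch flow; the stand-in ResNet-18 and 224x224 input shape are assumptions, and LEIP compilation happens afterwards in Latent AI's own tooling.

```python
# Standard PyTorch-to-ONNX export, roughly the step referenced above.
# The ResNet-18 stand-in and 224x224 input shape are assumptions; LEIP
# compilation is a separate step in Latent AI's proprietary tooling.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()  # stand-in model
dummy = torch.randn(1, 3, 224, 224)                       # assumed input shape
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}},                 # variable batch size
    opset_version=17,
)
```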
Skills
| Programming & Scripting | Python, Bash |
| Platforms | Linux (Ubuntu, Red Hat), SLURM-based GPU clusters |
| Deep Learning & ML | PyTorch, Hugging Face Transformers, NumPy, pandas, scikit-learn, OpenCV |
| Distributed & Large-Scale Training | PyTorch DDP, DeepSpeed, TensorBoard |
| Tools & Packaging | Docker, Git/GitHub, Conda |
Relevant Coursework
- Artificial Intelligence
- Statistical Machine Learning
- Random Processes and Probability
- Linear Algebra
- Computational Models and Algorithms (DSA)
- Distributed Computer Systems
- Computer Networks
Reviewer Service
- IEEE Robotics and Automation Letters (2024)
- Transactions on Machine Learning Research (2023 - Present)
- Neural Information Processing Systems, NeurIPS (2025 - Present)
- AAAI Conference on Artificial Intelligence (2024 - Present)
- IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR (2024 - Present)
- International Conference on Machine Learning, ICML (2025 - Present)
- International Conference on Learning Representations, ICLR (2024 - Present)