Resume
General Information
| Full Name | Manish Nagaraj |
| Email | mnagara@purdue.edu |
| Phone | +1-765-701-7970 |
| LinkedIn | http://linkedin.com/in/m-nagaraj |
| GitHub | https://github.com/manishnagaraj |
| Website | https://manishnagaraj.github.io |
| Location | West Lafayette, IN, USA |
Education
- May 2026 (expected): PhD in Electrical & Computer Engineering
  Purdue University, West Lafayette, IN
  - Research Areas: Data Efficiency, Efficient Fine-tuning, Federated Learning, Large Language Models
  - Research Advisor: Prof. Kaushik Roy
- May 2019: MS in Electrical & Computer Engineering
  Purdue University, West Lafayette, IN
  - Thesis: Energy Efficient Byzantine Agreement Protocols for Cyber-Physical Resilience
- May 2017: Bachelor of Engineering in Electronics and Communications
  PES Institute of Technology and Science, Bangalore, India
Research Projects
- 2025: TRIM - Token-wise Attention-Derived Saliency for Data-Efficient Instruction Tuning
  Ph.D. Dissertation, Purdue University
  - Proposed a forward-only, gradient-free data selection method that builds attention-based token “fingerprints” to score instruction-tuning examples (a rough sketch follows this entry).
  - Selected instruction examples that most improved performance on reasoning and commonsense benchmarks for open-source LLMs under small data budgets.
  - TRIM-selected coresets outperformed state-of-the-art coreset baselines by up to ~9% on downstream tasks and, in some settings, matched or surpassed full-data fine-tuning.
  - Demonstrated computational scalability: in a shared coreset selection setup, TRIM achieved higher downstream accuracy while running ~2.6× faster than the best gradient-based baseline.
  - Under review.
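A minimal sketch of the flavor of this approach, not the published method: the pooling scheme (head-averaged last-layer attention), the entropy scoring rule, and the gpt2 placeholder model are all assumptions for illustration.

```python
# Illustrative sketch only: attention-derived token saliency for data scoring.
# The pooling choice, the entropy score, and the gpt2 placeholder are
# assumptions, not the method as published.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")                  # placeholder LLM
model = AutoModelForCausalLM.from_pretrained("gpt2", output_attentions=True)
model.eval()

@torch.no_grad()
def token_fingerprint(text: str) -> torch.Tensor:
    """Forward-only and gradient-free: the attention mass each token receives,
    averaged over heads of the last layer, normalized to a distribution."""
    ids = tok(text, return_tensors="pt", truncation=True)
    att = model(**ids).attentions[-1]          # (1, heads, seq, seq)
    saliency = att.mean(dim=1)[0].sum(dim=0)   # total attention onto each token
    return saliency / saliency.sum()

def select_coreset(texts, budget):
    """Rank examples by fingerprint entropy (one plausible scoring rule)
    and keep the top `budget` examples."""
    def entropy(p):
        return -(p * (p + 1e-9).log()).sum().item()
    scores = [entropy(token_fingerprint(t)) for t in texts]
    order = sorted(range(len(texts)), key=scores.__getitem__, reverse=True)
    return [texts[i] for i in order[:budget]]
```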
- 2025: Coresets from Trajectories - Selecting Data via Correlation of Loss Differences
  Ph.D. Dissertation, Purdue University
  - Introduced Correlation of Loss Differences (CLD), a gradient-free coreset metric that ranks training points by how closely their loss-change trajectories correlate with those of a small validation set (sketched after this entry).
  - Established a convergence guarantee: training on CLD-selected coresets tracks full-data optimization up to a provable error bound.
  - On CIFAR-100 and ImageNet-1k, achieved state-of-the-art coreset selection performance, typically within ~1% of more computationally expensive methods across subset sizes.
  - Demonstrated cross-architecture transfer: coresets selected with small proxy CNNs generalized to larger CNN and vision-transformer models with <1% accuracy drop.
  - Accepted for publication at TMLR 2025.
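A rough sketch of the CLD scoring rule as described above, assuming per-example losses have already been logged at a series of training checkpoints; the plain Pearson estimator and top-k selection below are simplifications.

```python
# Sketch of Correlation of Loss Differences (CLD) scoring. Assumes losses
# were logged per example at T checkpoints; the Pearson correlation and
# top-k selection are illustrative simplifications.
import numpy as np

def cld_scores(train_losses, val_losses):
    """train_losses: (T, N) per-checkpoint losses for N training points.
    val_losses: (T, M) per-checkpoint losses for M validation points.
    Returns one CLD score per training point."""
    d_tr = np.diff(np.asarray(train_losses), axis=0)             # (T-1, N)
    d_va = np.diff(np.asarray(val_losses), axis=0).mean(axis=1)  # (T-1,)
    d_tr = d_tr - d_tr.mean(axis=0)                 # center both trajectories
    d_va = d_va - d_va.mean()
    num = (d_tr * d_va[:, None]).sum(axis=0)
    den = np.linalg.norm(d_tr, axis=0) * np.linalg.norm(d_va) + 1e-12
    return num / den                                # Pearson correlation per point

def select_coreset(train_losses, val_losses, budget):
    """Keep the `budget` training points whose loss-difference trajectories
    correlate most strongly with the mean validation trajectory."""
    return np.argsort(-cld_scores(train_losses, val_losses))[:budget]
```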
- 2024: TOFU - Federated Learning with Data and Communication Efficiency
  Ph.D. Research, Purdue University
  - Proposed a federated learning scheme that encodes each client's model update as gradients on a small synthetic dataset, transmitting these proxies instead of full weight updates (sketched after this entry).
  - On MNIST and CIFAR-10, achieved up to ~4x and ~6.6x lower communication cost than a standard federated learning baseline, while maintaining comparable final accuracy.
  - Enhanced privacy against gradient inversion attacks: proxy data and reconstructed inputs resemble noise and fail to reveal meaningful client information.
  - Studied accuracy-communication-privacy trade-offs by varying proxy size and update frequency, yielding practical guidelines for bandwidth-constrained federated deployments.
  - Published in IEEE Access 2024.
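A hedged sketch of the client-side encoding step: the proxy count, MNIST-sized shapes, KL objective, cosine-matching loss, and optimizer settings are all illustrative assumptions rather than the published configuration.

```python
# Sketch of a TOFU-style encoding step: learn a few synthetic inputs/labels
# whose gradients approximate a client's weight update, then transmit the
# proxies. Proxy count, shapes, KL loss, and cosine matching are assumptions.
import torch
import torch.nn.functional as F

def encode_update(model, target_update, n_proxy=10, steps=200, lr=0.1):
    """target_update: per-parameter delta tensors to encode as proxy data."""
    x = torch.randn(n_proxy, 1, 28, 28, requires_grad=True)  # synthetic inputs
    y = torch.randn(n_proxy, 10, requires_grad=True)         # soft labels
    opt = torch.optim.Adam([x, y], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]
    for _ in range(steps):
        opt.zero_grad()
        loss = F.kl_div(F.log_softmax(model(x), dim=1),
                        F.softmax(y, dim=1), reduction="batchmean")
        grads = torch.autograd.grad(loss, params, create_graph=True)
        # push grad(proxy loss) toward the true update, layer by layer
        mismatch = sum(1 - F.cosine_similarity(g.flatten(), t.flatten(), dim=0)
                       for g, t in zip(grads, target_update))
        mismatch.backward()
        opt.step()
    return x.detach(), y.detach()  # sent to the server instead of weight deltas

# Hypothetical usage: encode a stand-in update for a tiny MNIST classifier.
net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
proxy_x, proxy_y = encode_update(net, [torch.randn_like(p) for p in net.parameters()])
```

On the server side, replaying the proxies through its copy of the model recovers an approximation of the client's update, which is where the communication savings come from.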
- 2023: DOTIE - Energy-Efficient Object Detection Using Event Cameras
  Ph.D. Research, Purdue University
  - Developed an event-camera object detection pipeline that uses a lightweight spiking layer plus density-based clustering to isolate moving objects without frame reconstruction (sketched after this entry).
  - On the MVSEC outdoor driving dataset, more than doubled the mean IoU of prior event-based methods and achieved near-perfect foreground detection.
  - Achieved roughly six orders of magnitude lower energy consumption and significantly reduced latency compared to a YOLO CNN baseline on the same data.
  - Maintained performance across diverse scenes with minimal retuning, enabling deployment in low-power autonomous navigation and neuromorphic systems.
  - Presented at ICRA 2023 and CVPR 2023 Workshops.
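A toy sketch of the pipeline shape only: a per-pixel leaky-integrate filter stands in for the spiking layer and scikit-learn's DBSCAN for the clustering stage; the sensor resolution, decay rate, and thresholds are assumptions.

```python
# Toy sketch of a DOTIE-style pipeline: a per-pixel leaky-integrate filter
# stands in for the spiking layer, DBSCAN for the density-based clustering.
# Resolution, decay, and thresholds below are assumptions.
import numpy as np
from sklearn.cluster import DBSCAN

def spiking_filter(events, shape=(260, 346), decay=0.9, threshold=3.0):
    """events: time-sorted rows of (t, x, y, polarity). A pixel 'spikes'
    (its event is kept) only when enough recent events hit it, which
    suppresses background noise without reconstructing frames."""
    potential = np.zeros(shape)
    kept = []
    for t, x, y, p in events:
        potential *= decay                   # coarse leak between events
        potential[int(y), int(x)] += 1.0     # integrate the incoming event
        if potential[int(y), int(x)] >= threshold:
            kept.append((x, y))
            potential[int(y), int(x)] = 0.0  # reset after a spike
    return np.array(kept)

def detect_objects(events, eps=5.0, min_samples=10):
    """Cluster surviving events spatially; each cluster becomes a box."""
    pts = spiking_filter(events)
    if len(pts) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts)
    return [(c[:, 0].min(), c[:, 1].min(), c[:, 0].max(), c[:, 1].max())
            for c in (pts[labels == k] for k in set(labels) - {-1})]
```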
Industry Experience
- Summer 2023: Research Intern, Integrated Systems Team
  Latent AI, Skillman, NJ
  - Prototyped an unsupervised anomaly detection pipeline for automated target recognition using state-of-the-art methods via Anomalib on MVTec AD and internal sensor datasets.
  - Built an interactive labeling tool combining classical computer vision and SAM to generate pixel-wise masks on noisy imagery, reducing manual annotation time.
  - Exported top models to ONNX and used the Latent AI Efficient Inference Platform (LEIP) to compile and optimize them for edge devices, achieving 4x energy-efficiency gains over unoptimized baselines (export step sketched below).
  - Summarized experiments and findings in internal documentation and presentations to support ongoing edge inference pipeline development.
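The ONNX export step referenced above is roughly the standard PyTorch flow; the stand-in ResNet-18 and 224x224 input shape are assumptions, and LEIP compilation happens afterwards in Latent AI's own tooling.

```python
# Standard PyTorch-to-ONNX export, roughly the step referenced above.
# The ResNet-18 stand-in and 224x224 input shape are assumptions; LEIP
# compilation is a separate step in Latent AI's proprietary tooling.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()  # stand-in model
dummy = torch.randn(1, 3, 224, 224)                       # assumed input shape
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}},                 # variable batch size
    opset_version=17,
)
```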
Skills
| Programming & Scripting | Python, Bash |
| Platforms | Linux (Ubuntu, Red Hat), SLURM-based GPU clusters |
| Deep Learning & ML | PyTorch, Hugging Face Transformers, NumPy, pandas, scikit-learn, OpenCV |
| Distributed & Large-Scale Training | PyTorch DDP, DeepSpeed, TensorBoard |
| Tools & Packaging | Docker, Git/GitHub, Conda |
Relevant Coursework
- Artificial Intelligence
- Statistical Machine Learning
- Random Processes and Probability
- Linear Algebra
- Computational Models and Algorithms (DSA)
- Distributed Computer Systems
- Computer Networks
Reviewer Service
- IEEE Robotics and Automation Letters (2024)
- Transactions on Machine Learning Research (2023 - Present)
- Neural Information Processing Systems, NeurIPS (2025 - Present)
- AAAI Conference on Artificial Intelligence (2024 - Present)
- IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR (2024 - Present)
- International Conference on Machine Learning, ICML (2025 - Present)
- International Conference on Learning Representations, ICLR (2024 - Present)