Professional Research
Sync Labs
As a Research Scientist at Sync Labs, I focus on building robust, efficient, state-of-the-art models for zero-shot video editing, lip-sync generation, and speech-driven facial animation. I work with flow-matching architectures, with a specific focus on optimizing inference efficiency, improving visual quality, and developing rigorous evaluation metrics for generative video. I also work on video super-resolution, TTS models, speech editing, and language modeling.
Samsung Research
I worked with the AI Camera Team of the Visual Intelligence Division at Samsung R&D Institute India, Bangalore (SRI-B), where I developed and optimized deep learning models for action recognition on edge devices. These models have been integrated into the Single Take feature of Samsung's flagship Galaxy S24 series.
Tata Motors
I worked in the Application Engineering and Brake departments at the Engineering Research Center (ERC) of Tata Motors. I developed a comprehensive toolkit to analyse vehicle stability under various driving conditions and application components. I also developed automated brake thermal analysis software for identifying potential brake issues during extreme driving scenarios.
Academic Research
PhD Thesis
Development of resource-efficient architectures for computer vision tasks such as image classification, semantic segmentation, image super-resolution, and image inpainting, along with efficient evaluation metrics for generative models.
M.Tech Thesis
Design and control of a lower extremity exoskeleton for rehabilitation.
Recent Publications
Image Quality Metric
FLD+: Data-efficient Evaluation Metric for Generative Models
ICCV 2025
Histopathology Metric
Evaluation Metric for Quality Control and Generative Models in Histopathology Images
ISBI 2025
Super Resolution
WaveMixSR-V2: Enhancing Super-Resolution with Higher Efficiency
AAAI 2025
Image Inpainting
WavePaint: A Resource-Efficient Token-Mixer for Self-Supervised Inpainting
ICCV 2025
Image Quality Metric
Which Backbone to Use: A Resource-efficient Domain Specific Comparison for Computer Vision
TMLR 2025
Facial Recognition
PawFACS: Leveraging Semi-Supervised Learning for Pet Facial Action Recognition
BMVC 2024
Medical Imaging
Magnification Invariant Medical Image Analysis: A Comparison of Convolutional Networks, Vision Transformers, and Token Mixers
BIOIMAGING 2024
Domain Adaptation
Adversarial Transport Terms for Unsupervised Domain Adaptation
ICPR 2024
Federated Learning
FLeNS: Federated Learning with Enhanced Nesterov-Newton Sketch
IEEE BigData 2024
Video Summarization
EDSNet: Efficient-DSNet for Video Summarization
arXiv 2024
GNN Architecture
Heterogeneous Graphs Model Spatial Relationships Between Biological Entities for Breast Cancer Diagnosis
MICCAI 2023
Super Resolution
WaveMixSR: A Resource-efficient Neural Network for Image Super-resolution
WACV 2024
Normalizing Flow Metric
Normalizing Flow Based Metric for Image Generation
arXiv 2024
WaveMix-based
Resource-efficient Image Inpainting
ICLR 2023
Image Quality Metric
Resource-Efficient Hybrid X-Formers for Vision
WACV 2022
Image Quality Metric
WaveMix: A Resource-efficient Neural Network for Image Analysis
arXiv 2022
Image Quality Metric
Convolutional Xformers for Vision
arXiv 2022
Humor Detection
"So You Think You're Funny?": Rating the Humour Quotient in Standup Comedy
EMNLP 2021
