New Papers at CVPR 2026!
(27 papers)
DeltaQuant: 4-bit Video Diffusion Models with Spatiotemporal Delta Smoothing
Xingyang Li, Samuel Tesfai, Zhekai Zhang, Haocheng Xi, Shuo Yang, Lvmin Zhang, Yufei Sun, Kelly Peng, Maneesh Agrawala, Ion Stoica, Kurt Keutzer, Jun-Yan Zhu, Song Han, Yujun Lin, Muyang Li
Unified Spherical Frontend: Learning Rotation-Equivariant Representations of Spherical Images from Any Camera
PDF
Mukai Yu, Mosam Dabhi, Liuyue Xie, Sebastian Scherer, László A. Jeni
EditCtrl: Disentangled Local and Global Control for Real-Time Generative Video Editing
PDF
Yehonathan Litman, Shikun Liu, Dario Seyb, Nicholas Milef, Yang Zhou, Carl Marshall, Shubham Tulsiani, Caleb Leak
Unsupervised Multi-Scale Segmentation of 3D Subcellular World with Stable Diffusion Foundation Model
PDF
Mostofa Rafid Uddin, HM Shadman Tabib, Thanh-Huy Nguyen, Kashish Gandhi, Min Xu
VT-Intrinsic: Physics-Based Decomposition of Reflectance and Shading using a Single Visible-Thermal Image Pair
Zeqing Leo Yuan, Mani Ramanagopal, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan
CGHair: Compact Gaussian Hair Reconstruction with Card Clustering
Haimin Luo, Srinjay Sarkar, Albert Mosella-Montoro, Francisco Vicente Carrasco, Fernando De la Torre
RefAV: Towards Planning-Centric Scenario Mining
PDF
Cainan Davidson, Deva Ramanan, Neehar Peri
DuoMo: Dual Motion Diffusion for World-Space Human Reconstruction
Yufu Wang, Evonne Ng, Soyong Shin, Rawal Khirodkar, Yuan Dong, Zhaoen Su, Jinhyung Park, Kris Kitani, Alexander Richard, Fabian Prada, Michael Zollhöfer
SAM 3D Body: Robust Full-Body Human Mesh Recovery
PDF
Xitong Yang, Devansh Kukreja, Don Pinkus, Taosha Fan, Jinhyung Park, Soyong Shin, Jinkun Cao, Jia-Wei Liu, Nicolás Ugrinovic, Anushka Sagar, Jitendra Malik, Matt Feiszli, Piotr Dollár, Kris Kitani
Building a Precise Video Language with Human–AI Oversight
Zhiqiu Lin, Chancharik Mitra, Siyuan Cen, Isaac Li, Yuhan Huang, Yu Tong Tiffany Ling, Hewei Wang, Irene Pi, Shihang Zhu, Yili Han, Yilun Du, Deva Ramanan
Dual Band Thermal Videography: Separating Time-Varying Reflection and Emission Near Ambient Conditions
PDF
Sriram Narayanan, Mani Ramanagopal, Srinivasa G. Narasimhan
MedLIME: A Distribution-Aligned and Evidence-Supported Framework for Medical Saliency Explanations
Raghav Magazine, Xingjian Li, Min Xu
E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training
PDF
Qitao Zhao, Hao Tan, Qianqian Wang, Sai Bi, Kai Zhang, Kalyan Sunkavalli, Shubham Tulsiani, Hanwen Jiang
Any4D: Unified Feed-Forward Metric 4D Reconstruction
PDF
Jay Karhade, Nikhil Keetha, Yuchen Zhang, Tanisha Gupta, Akash Sharma, Sebastian Scherer, Deva Ramanan
Ground Reaction Inertial Poser: Physics-based Human Motion Capture from Sparse IMUs and Insole Pressure Sensors
PDF
Ryosuke Hori, Jyun-Ting Song, Zhengyi Luo, Jinkun Cao, Soyong Shin, Hideo Saito, Kris Kitani
Computational Speckle Pattern Interferometry
Shengxi Wu, Sophia Yang, Dorian Chan, Matthew O'Toole
Grounded Latents for Entity-Centric 4D Scene Generation
Jinhyung Park, Navyata Sanghvi, Erica Weng, Shawn Hunt, Shinya Tanaka, Hironobu Fujiyoshi, Kris Kitani
Co-Me: Confidence-Guided Token Merging for Visual Geometric Transformers
PDF
Yutian Chen, Yuheng Qiu, Ruogu Li, Ali Agha, Shayegan Omidshafiei, Jay Patrikar, Sebastian Scherer
PhyCo: Learning Controllable Physical Priors for Generative Motion
PDF
Sriram Narayanan, Ziyu Jiang, Srinivasa Narasimhan, Manmohan Chandraker
FPSBench: A Benchmark for Video Understanding at High Frame Rates
Rohan Choudhury, Jean Sebastien Dandurand, Kai Qiu, Kshitij Madhav Bhat, Kartik Sharma, Liza Dahiya, Yizhou Zhao, Souraja Kundu, Chun-Hsien Lin, Kris Kitani, Laszlo A. Jeni
Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning
PDF
Zhongxiao Cong, Qitao Zhao, Minsik Jeon, Shubham Tulsiani
2D-LFM: Lifting Foundation Model without 3D Supervision
Mosam Dabhi, Irhas Gill, Laszlo A. Jeni, Simon Lucey
MicroFM: Physics-guided Flow Matching for Isotropic Microscopy Reconstruction
Xingzu Zhan, Runmin Jiang, Vatsal Gupta, Tanush Swaminathan, Yanwen Wang, Genpei Zhang, Haili Wang, Min Xu
DyaDiT: A Multi-Modal Diffusion Transformer for Socially-Aware Dyadic Gesture Generation
PDF
Yichen Peng, Jyun-Ting Song, Siyeol Jung, Ruofan Liu, Haiyang Liu, Xuangeng Chu, Ruicong Liu, Erwin Wu, Hideki Koike, Kris Kitani
OnlineHMR: Video-based Online World-Grounded Human Mesh Recovery
Yiwen Zhao, Ce Zheng, Yufu Wang, Hsueh-Han Daniel Yang, Liting Wen, Laszlo A. Jeni
Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization
PDF
Tsai-Shien Chen, Aliaksandr Siarohin, Guocheng Gordon Qian, Kuan-Chieh Jackson Wang, Egor Nemchinov, Moayed Haji-Ali, Riza Alp Guler, Willi Menapace, Ivan Skorokhodov, Anil Kag, Jun-Yan Zhu, Sergey Tulyakov
Multi-view Consistent 3D Gaussian Head Avatars 'without' Multi-view Generation
Aviral Chharia, Fernando De la Torre
New Papers at ICLR 2026!
(8 papers)
RobotArena ∞: Scalable Robot Benchmarking via Real-to-Sim Translation
PDF
Yash Jangir, Yidi Zhang, Kashu Yamazaki, Chenyu Zhang, Kuan-Hsun Tu, Tsung-Wei Ke, Lei Ke, Yonatan Bisk, Katerina Fragkiadaki
MOSIV: Multi-Object System Identification from Videos
PDF
Chunjiang Liu, Xiaoyuan Wang, Qingran Lin, Albert Xiao, Haoyu Chen, Shizheng Wen, Hao Zhang, Lu Qi, Ming-Hsuan Yang, Laszlo A. Jeni, Min Xu, Yizhou Zhao
Contact-guided Real2Sim from Monocular Video with Planar Scene Primitives
PDF
Zihan Wang*, Jiashun Wang*, Jeff Tan, Yiwen Zhao, Jessica K. Hodgins, Shubham Tulsiani, Deva Ramanan
PAT3D: Physics-Augmented Text-to-3D Scene Generation
PDF
Guying Lin, Kemeng Huang, Michael Liu, Ruihan Gao, Hanke Chen, Lyuhao Chen, Beijia Lu, Taku Komura, Yuan Liu, Jun-Yan Zhu, Minchen Li
Learning an Image Editing Model without Image Editing Pairs
PDF
Nupur Kumari, Sheng-Yu Wang, Nanxuan Zhao, Yotam Nitzan, Yuheng Li, Krishna Kumar Singh, Richard Zhang, Eli Shechtman, Jun-Yan Zhu, Xun Huang
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling
PDF
Tal Daniel, Carl Qi, Dan Haramati, Amir Zadeh, Chuan Li, Aviv Tamar, Deepak Pathak, David Held
Scaling Group Inference for Diverse and High-Quality Generation
PDF
Gaurav Parmar, Or Patashnik, Daniil Ostashev, Kuan-Chieh (Jackson) Wang, Kfir Aberman, Srinivasa Narasimhan, Jun-Yan Zhu
MotionStream: Real-Time Video Generation with Interactive Motion Controls
PDF
Joonghyuk Shin, Zhengqi Li, Richard Zhang, Jun-Yan Zhu, Jaesik Park, Eli Shechtman, Xun Huang