I am a Ph.D. candidate in the Department of Computer Science at the National University of Singapore, advised by Prof. Wei Tsang Ooi, Prof. Benoit Cottereau, and Dr. Lai Xing Ng. I also collaborate closely with Prof. Ziwei Liu from Nanyang Technological University, Singapore.
My research focuses on developing 3D scene understanding systems that are robust, scalable, and generalizable in real-world conditions.
I have been fortunate to collaborate with Apple Machine Learning Research, NVIDIA Research, ByteDance Seed, ByteDance AI Lab, OpenMMLab, MMLab@NTU, and Motional.
I am the recipient of the National Scholarship (Ministry of Education, 2019), the NUS Research Achievement Award (NUS Computing, 2023), and the Dean's Graduate Research Excellence Award (NUS Computing, 2024).
🦁 I am open to discussions and collaborations on 3D scene perception, generation, and understanding. If our research interests align, feel free to email me.
Collaborating institutions: NVIDIA Research, Shanghai AI Laboratory, ByteDance AI Lab, OpenMMLab, and Motional.
* equal contribution ‡ project lead § corresponding author
- DynamicCity: Large-Scale Occupancy Generation from Dynamic Scenes
- Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding
- Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving
- LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes (arXiv, 2025)
- Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives (arXiv, 2025)
- FlexEvent: Event Camera Object Detection at Arbitrary Frequencies (arXiv, 2025)
- GEAL: Generalizable 3D Object Affordance Learning with Cross-Modal Consistency (arXiv, 2025)
- SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding (arXiv, 2025)
- OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies (arXiv, 2025)
- LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving (arXiv, 2025)
- Is Your LiDAR Placement Optimized for 3D Scene Understanding?
- Is Your HD Map Constructor Reliable under Sensor Corruptions?
- 4D Contrastive Superflows are Dense 3D Representation Learners
- Learning to Adapt SAM for Segmenting Cross-Domain Point Clouds
- OpenESS: Event-Based Semantic Scene Understanding with Open Vocabularies
- Multi-Space Alignments Towards Universal LiDAR Segmentation
- Unified 3D and 4D Panoptic Segmentation via Dynamic Shifting Networks
- Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving
- RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions
- Segment Any Point Cloud Sequences by Distilling Vision Foundation Models
- Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective
- Towards Label-Free Scene Understanding by Vision Foundation Models
- Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
- Rethinking Range View Representation for LiDAR Segmentation
- UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
- LaserMix for Semi-Supervised LiDAR Semantic Segmentation
- CLIP2Scene: Towards Label-Efficient 3D Scene Understanding by CLIP
- ConDA: Unsupervised Domain Adaptation for LiDAR Segmentation via Regularized Domain Concatenation
- Benchmarking 3D Robustness to Common Corruptions and Sensor Failure
- The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition (Technical Report, 2024)
- The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation (Technical Report, 2023)