I am a Ph.D. candidate in the Department of Computer Science at the National University of Singapore, advised by Prof. Wei Tsang Ooi, Prof. Benoit Cottereau, and Dr. Lai Xing Ng. I also collaborate closely with Prof. Ziwei Liu from Nanyang Technological University, Singapore.
My research interests include spatial intelligence, multimodal vision-language models, and 3D/4D world modeling and evaluation.
I have been fortunate to collaborate with Apple Machine Learning Research, NVIDIA Research, ByteDance AI Lab, OpenMMLab, MMLab@NTU, and Motional.
I am the recipient of the National Scholarship (Ministry of Education, 2019), the Research Achievement Award (NUS Computing, 2023), the Dean's Graduate Research Excellence Award (NUS Computing, 2024), the DAAD AInet Fellowship (DAAD, 2025), and the Apple Scholars in AI/ML Ph.D. Fellowship (Apple, 2025).
Apple AI/ML · CNRS@CREATE · NVIDIA Research · Shanghai AI Laboratory · ByteDance AI Lab · OpenMMLab · Motional
* equal contributions ‡ project lead § corresponding author
- Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives (arXiv, 2025)
- FlexEvent: Event Camera Object Detection at Varying Frequencies (arXiv, 2025)
- Stairway to Success: Zero-Shot Floor-Aware Object-Goal Navigation via LLM-Driven Coarse-to-Fine Exploration (arXiv, 2025)
- OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies (arXiv, 2025)
- LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving (arXiv, 2025)
- EventFly: Event Camera Perception from Ground to the Sky
- LiMoE: Mixture of LiDAR Data Representation Learners from Automotive Scenes
- GEAL: Generalizable 3D Object Affordance Learning with Cross-Modal Consistency
- SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
- PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning
- DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes
- Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding
- Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving
- FRNet: Frustum-Range Networks for Scalable LiDAR-Based Semantic Segmentation
- NUC-Net: Non-Uniform Cylindrical Partition Networks for Efficient LiDAR Semantic Segmentation
- Is Your LiDAR Placement Optimized for 3D Scene Understanding?
- Is Your HD Map Constructor Reliable under Sensor Corruptions?
- 4D Contrastive Superflows are Dense 3D Representation Learners
- Learning to Adapt SAM for Segmenting Cross-Domain Point Clouds
- OpenESS: Event-Based Semantic Scene Understanding with Open Vocabularies
- Multi-Space Alignments Towards Universal LiDAR Segmentation
- Unified 3D and 4D Panoptic Segmentation via Dynamic Shifting Networks
- Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving
- RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions
- Segment Any Point Cloud Sequences by Distilling Vision Foundation Models
- Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective
- Towards Label-Free Scene Understanding by Vision Foundation Models
- Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
- Rethinking Range View Representation for LiDAR Segmentation
- UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
- LaserMix for Semi-Supervised LiDAR Semantic Segmentation
- CLIP2Scene: Towards Label-Efficient 3D Scene Understanding by CLIP
- ConDA: Unsupervised Domain Adaptation for LiDAR Segmentation via Regularized Domain Concatenation
- Benchmarking 3D Robustness to Common Corruptions and Sensor Failure
- The RoboSense Challenge: Robot Sensing under Challenging Conditions (Technical Report, 2025)
- The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition (Technical Report, 2024)
- The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation (Technical Report, 2023)