Sifei Liu

@nvidia.com

NVIDIA LPR
NVIDIA

EDUCATION

PhD, University of California, Merced

RESEARCH INTERESTS

Computer Vision, Machine Learning

Scopus Publications: 78
Scholar Citations: 12736
Scholar h-index: 48
Scholar i10-index: 81

RECENT SCHOLAR PUBLICATIONS

  • Context-aware synthesis and placement of object instances
    D Lee, S Liu, J Gu, MY Liu, J Kautz
    US Patent App. 19/433,543 , 2026
    2026
  • Scaling RL to long videos
    Y Chen, W Huang, B Shi, Q Hu, H Ye, L Zhu, Z Liu, P Molchanov, J Kautz, ...
    Advances in Neural Information Processing Systems 38, 172842-172870 , 2026
    2026
    Citations: 72
  • Diffusion-based open-vocabulary segmentation
    J Xu, S De Mello, S Liu, A Vahdat, W Byeon
    US Patent 12,586,199 , 2026
    2026
    Citations: 8
  • Compositional 3D-consistent freeview image generation with 3D blobs
    C Liu, W Nie, S Liu, AH Badki, H Su, M Mardani, BD Eckart, A Vahdat
    US Patent App. 19/227,222 , 2026
    2026
  • Techniques for fine-tuning a machine learning model to reconstruct a three-dimensional scene
    Y Fu, S Liu, J Kautz, X Li, S De Mello, A Kulkarni, M Naphade
    US Patent 12,548,234 , 2026
    2026
    Citations: 2
  • Techniques for training a machine learning model to reconstruct different three-dimensional scenes
    Y Fu, S Liu, J Kautz, X Li, S De Mello, A Kulkarni, M Naphade
    US Patent 12,548,258 , 2026
    2026
  • Learnable Fourier series for image restoration
    S Liu, S De Mello, J Kautz
    US Patent App. 18/975,124 , 2026
    2026
  • Training and inferencing using a neural network to predict orientations of objects in images
    SK Mustikovela, V Jampani, S De Mello, S Liu, U Iqbal, J Kautz
    US Patent App. 19/094,621 , 2025
    2025
  • Context-aware synthesis and placement of object instances
    D Lee, S Liu, J Gu, MY Liu, J Kautz
    US Patent 12,462,453 , 2025
    2025
    Citations: 1
  • Segmentation using an unsupervised neural network training technique
    V Jampani, WC Hung, S Liu, P Molchanov, J Kautz
    US Patent 12,450,748 , 2025
    2025
  • Token-Efficient VLM: High-Resolution Image Understanding Via Dynamic Region Proposal
    Y Jiang, J Gu, T Xue, KC Cheung, P Molchanov, H Yin, S Liu
    2025 IEEE/CVF International Conference on Computer Vision (ICCV), 24147-24158 , 2025
    2025
    Citations: 5
  • OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
    H Ye, CHH Yang, A Goel, W Huang, L Zhu, Y Su, S Lin, AC Cheng, Z Wan, ...
    arXiv preprint arXiv:2510.15870 , 2025
    2025
    Citations: 10
  • QeRL: Beyond Efficiency--Quantization-enhanced Reinforcement Learning for LLMs
    W Huang, Y Ge, S Yang, Y Xiao, H Mao, Y Lin, H Ye, S Liu, KC Cheung, ...
    arXiv preprint arXiv:2510.11696 , 2025
    2025
    Citations: 7
  • Compositional text-to-image generation with dense blob representations
    W Nie, S Liu, MM Korani, C Liu, BD Eckart, A Vahdat
    US Patent App. 18/889,975 , 2025
    2025
  • 3D aware region prompted vision language model
    AC Cheng, Y Fu, Y Chen, Z Liu, X Li, S Radhakrishnan, S Han, Y Lu, ...
    arXiv preprint arXiv:2509.13317 , 2025
    2025
    Citations: 19
  • Region-aware vision language processor
    Q Guo, S De Mello, H Yin, W Byeon, KC Cheung, SCW See, J Kautz, ...
    US Patent App. 19/065,367 , 2025
    2025
  • Machine learning framework applied in a semi-supervised setting to perform instance tracking in a sequence of image frames
    Y Fu, S Liu, U Iqbal, S De Mello, J Kautz
    US Patent 12,400,341 , 2025
    2025
    Citations: 1
  • SSE: Multimodal semantic data selection and enrichment for industrial-scale data assimilation
    M Shen, N Chang, S Liu, JM Alvarez
    Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and … , 2025
    2025
    Citations: 4
  • EgoVLA: Learning vision-language-action models from egocentric human videos
    R Yang, Q Yu, Y Wu, R Yan, B Li, AC Cheng, X Zou, Y Fang, X Cheng, ...
    arXiv preprint arXiv:2507.12440 , 2025
    2025
    Citations: 72
  • View synthesis using camera poses learned from a video
    Y Fu, S Liu, A Kulkarni, J Kautz
    US Patent App. 18/963,075 , 2025
    2025
    Citations: 1

MOST CITED SCHOLAR PUBLICATIONS

  • Learning continuous image representation with local implicit image function
    Y Chen, S Liu, X Wang
    Proceedings of the IEEE/CVF conference on computer vision and pattern … , 2021
    2021
    Citations: 1182
  • A face antispoofing database with diverse attacks
    Z Zhang, J Yan, S Liu, Z Lei, D Yi, SZ Li
    2012 5th IAPR international conference on Biometrics (ICB), 26-31 , 2012
    2012
    Citations: 1120
  • GroupViT: Semantic segmentation emerges from text supervision
    J Xu, S De Mello, S Liu, W Byeon, T Breuel, J Kautz, X Wang
    Proceedings of the IEEE/CVF conference on computer vision and pattern … , 2022
    2022
    Citations: 868
  • Generative face completion
    Y Li, S Liu, J Yang, MH Yang
    Proceedings of the IEEE conference on computer vision and pattern … , 2017
    2017
    Citations: 849
  • Open-vocabulary panoptic segmentation with text-to-image diffusion models
    J Xu, S Liu, A Vahdat, W Byeon, X Wang, S De Mello
    Proceedings of the IEEE/CVF conference on computer vision and pattern … , 2023
    2023
    Citations: 752
  • Low-light image enhancement via a deep hybrid network
    W Ren, S Liu, L Ma, Q Xu, X Xu, X Cao, J Du, MH Yang
    IEEE Transactions on Image Processing 28 (9), 4364-4375 , 2019
    2019
    Citations: 592
  • SpatialRGPT: Grounded spatial reasoning in vision-language models
    AC Cheng, H Yin, Y Fu, Q Guo, R Yang, J Kautz, X Wang, S Liu
    Advances in Neural Information Processing Systems 37, 135062-135093 , 2024
    2024
    Citations: 431
  • Learning affinity via spatial propagation networks
    S Liu, S De Mello, J Gu, G Zhong, MH Yang, J Kautz
    Advances in Neural Information Processing Systems 30 , 2017
    2017
    Citations: 372
  • Learning linear transformations for fast image and video style transfer
    X Li, S Liu, J Kautz, MH Yang
    Proceedings of the IEEE/CVF conference on computer vision and pattern … , 2019
    2019
    Citations: 338
  • COLMAP-Free 3D Gaussian Splatting
    Y Fu, S Liu, A Kulkarni, J Kautz, AA Efros, X Wang
    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern … , 2024
    2024
    Citations: 307
  • Self-supervised single-view 3D reconstruction via semantic consistency
    X Li, S Liu, K Kim, S De Mello, V Jampani, MH Yang, J Kautz
    European Conference on Computer Vision, 677-693 , 2020
    2020
    Citations: 307
  • Deep cascaded bi-network for face hallucination
    S Zhu, S Liu, CC Loy, X Tang
    European conference on computer vision, 614-630 , 2016
    2016
    Citations: 297
  • Learning dual convolutional neural networks for low-level vision
    J Pan, S Liu, D Sun, J Zhang, Y Liu, J Ren, Z Li, J Tang, H Lu, YW Tai, ...
    Proceedings of the IEEE conference on computer vision and pattern … , 2018
    2018
    Citations: 266
  • Semi-supervised 3D hand-object poses estimation with interactions in time
    S Liu, H Jiang, J Xu, S Liu, X Wang
    Proceedings of the IEEE/CVF conference on computer vision and pattern … , 2021
    2021
    Citations: 256
  • Learning recursive filters for low-level vision via a hybrid neural network
    S Liu, J Pan, MH Yang
    European conference on computer vision, 560-576 , 2016
    2016
    Citations: 211
  • Joint-task self-supervised learning for temporal correspondence
    X Li, S Liu, S De Mello, X Wang, J Kautz, MH Yang
    Advances in Neural Information Processing Systems 32 , 2019
    2019
    Citations: 209
  • SCOPS: Self-supervised co-part segmentation
    WC Hung, V Jampani, S Liu, P Molchanov, MH Yang, J Kautz
    Proceedings of the IEEE/CVF conference on computer vision and pattern … , 2019
    2019
    Citations: 204
  • No pose, no problem: Surprisingly simple 3D Gaussian splats from sparse unposed images
    B Ye, S Liu, H Xu, X Li, M Pollefeys, MH Yang, S Peng
    International Conference on Learning Representations 2025, 54009-54033 , 2025
    2025
    Citations: 194
  • Synthesizing long-term 3D human motion and interaction in 3D scenes
    J Wang, H Xu, J Xu, S Liu, X Wang
    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern … , 2021
    2021
    Citations: 192
  • NVILA: Efficient frontier visual language models
    Z Liu, L Zhu, B Shi, Z Zhang, Y Lou, S Yang, H Xi, S Cao, Y Gu, D Li, X Li, ...
    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern … , 2025
    2025
    Citations: 190